Oops! AI Made a Legal Mistake: Now What? AI Hallucinations, Professional Responsibility, and the Future of Legal Practice

October 21, 2025

1. Introduction
The integration of artificial intelligence (“AI”) into legal practice has been rapid, uneven, and fraught with risks. Generative AI, particularly large language models (“LLMs”) such as ChatGPT, can accelerate legal drafting, summarise vast bodies of law, and support case preparation. Yet, the promise of efficiency is undermined by a fundamental flaw: these systems frequently generate fabricated legal authorities, so-called “hallucinations.”
An AI hallucination occurs when an LLM produces a plausible but wholly fictitious case, statute, or doctrinal proposition. In law, this is particularly perilous. Unlike in creative writing, where invention may be harmless, in litigation a fabricated neutral citation or judicial dictum can amount to misleading the court. Indeed, even inadvertent reliance on hallucinated authorities may breach duties of candour and competence, exposing practitioners to disciplinary sanction, civil liability, or reputational damage.
The Bar Council of England and Wales warned in January 2024 that blind reliance on AI risks incompetence or gross negligence.¹ The Law Society’s Generative AI: The Essentials (May 2025) went further, presenting a checklist that effectively codifies AI literacy as a baseline professional competence.² The Judicial Office (April 2025) formalised the asymmetry between litigants-in-person (“LIPs”) and regulated lawyers, making explicit that courts will forgive the former but expect near-perfection from the latter.³
The jurisprudence developing across the United States, Canada, Australia, and the United Kingdom illustrates a consistent judicial refrain: responsibility for accuracy never transfers to the machine. This article undertakes a deep comparative analysis of those cases, situates them within the evolving professional guidance, explains why hallucination is a structural, not accidental, feature of LLMs, and explores the legal and ethical implications.
2. Comparative Case Law Analysis
(a) Mata v Avianca, Inc. (US, 2023)
The case of Mata v Avianca, Inc. in the Southern District of New York was the watershed moment. Attorneys for the claimant submitted a brief citing six entirely fictitious cases generated by ChatGPT.⁴ The authorities appeared authentic, complete with plausible citations and formal judicial reasoning. Only when opposing counsel and the court attempted to locate the decisions did the fabrication come to light.
The court imposed sanctions under Rule 11 of the Federal Rules of Civil Procedure, noting that the attorneys had made “false and misleading statements” and failed to exercise due diligence.⁵ Crucially, the court rejected the suggestion that AI’s sophistication absolved the lawyers: “Technological assistance cannot supplant the duty of counsel to ensure accuracy.”⁶
The case illustrates a principle now echoed globally: lawyers cannot outsource their professional judgment to algorithms.
(b) Harber v HMRC [2023] UKFTT 1007 (TC)
The UK’s first brush with hallucination came in Harber v HMRC. An LIP appealing a tax decision submitted nine non-existent tribunal judgments generated by ChatGPT.⁷ Each case was presented in convincing professional style, with neutral citations and plausible reasoning.
The First-tier Tribunal, while not sanctioning the appellant, stressed that reliance on fabricated cases, whether deliberate or inadvertent, undermines judicial integrity. Importantly, the tribunal distinguished between the responsibilities of LIPs and professionals: whereas an unrepresented party may be indulged for ignorance, counsel would not be.
The judgment foreshadowed the Judicial Office’s later guidance: courts will adopt a sympathetic approach with the public, but the same leniency is unavailable to regulated professionals.
(c) Zhang v Chen (2024 BCSC 285, Canada)
In Zhang v Chen, a lawyer in Vancouver submitted two fabricated authorities generated by ChatGPT in a child custody dispute.⁸ When opposing counsel raised concerns, the court could not locate the cases. Counsel admitted the error and apologised, maintaining that there had been no intent to mislead.
The judge, while declining to impose special costs, nonetheless warned that reliance on fictitious authorities could amount to an abuse of process.⁹ Though the outcome spared the lawyer a financial penalty, reputational harm was unavoidable: being associated with “fake cases” severely undermined professional credibility.
This illustrates a key dimension of risk: even absent formal sanction, reputational damage can be career-defining.
(d) Handa & Mallick [2024] FedCFamC2F 957 (Australia)
Australia’s Handa & Mallick involved solicitor Dayal, who submitted hallucinated authorities generated by AI in a family law matter.¹⁰ The Victorian Legal Services Board disciplined him, prohibiting him from handling trust money or practising unsupervised for two years.
Dayal argued he had not appreciated the risk of hallucination. The tribunal was unsympathetic: ignorance of a tool’s limitations is no defence.¹¹ The case marks an inflection point: technological literacy is now part of the duty of competence.
(e) Ayinde v London Borough of Haringey & Al-Haroun v Qatar National Bank [2025] EWHC 1383 (Admin)
In this joined High Court judgment, Saini J delivered a strong warning. In Ayinde, an LIP relied on hallucinated authorities. In Al-Haroun, by contrast, professional representatives submitted material displaying signs of AI use without proper verification.¹²
Saini J emphasised that professional responsibility is non-delegable: “It is no answer to say that the citation came from an AI tool. Counsel bears personal responsibility for every authority placed before this court.”¹³ The judgment represents a landmark articulation of principle: failure to verify AI outputs may itself amount to misconduct.
(f) UKIPO Trademark Appeal (BL O/0559/25, June 2025)
The UK Intellectual Property Office encountered hallucination in a trademark appeal. Both an LIP and a regulated trademark attorney relied on inaccurate AI outputs, including fabricated authorities.¹⁴ The tribunal distinguished sharply between the two. For the LIP, the error was regrettable. For the attorney, it was serious misconduct: “A regulated professional is under a duty to exercise independent judgment and cannot abdicate that responsibility to an algorithm.”¹⁵
This distinction echoes Harber and reflects the growing consensus: public indulgence for LIPs, strict accountability for lawyers.
(g) HMRC v Gunnarsson [2025] UKUT 00247 (TCC)
The Upper Tribunal once again confronted hallucinations when an LIP relied on fabricated tax cases.¹⁶ The tribunal expressed sympathy and declined to impose sanctions, but highlighted systemic risks: if hallucinations proliferate, courts may be forced to verify every citation, undermining efficiency.
This judgment reveals a broader concern. Beyond individual cases, routine hallucinations threaten to impose systemic burdens on the justice system, shifting costs and responsibilities away from lawyers and onto judges.
3. Professional Guidance: From Caution to Codification
Professional bodies in England and Wales have evolved from tentative warnings to codified standards.
Bar Council (Jan 2024)
The Bar Council warned that blind reliance on AI could amount to incompetence or gross negligence.¹⁷ By framing misuse in terms of negligence, the Council sent a clear signal to insurers and disciplinary bodies: reliance on hallucinations may breach professional indemnity obligations. This transforms AI misuse from a matter of “best practice” into one of potential liability.
Law Society (May 2025)
The Law Society’s Generative AI: The Essentials codifies a threefold checklist: verification, disclosure, and record-keeping.¹⁸ The guidance states that practitioners must (1) independently verify AI outputs against reliable sources, (2) disclose AI use where it materially affects client advice or submissions, and (3) retain records of AI interactions.
This marks a regulatory shift: AI literacy is no longer aspirational but part of the minimum professional standard of competence.
Judicial Office (Apr 2025)
The Judicial Office guidance crystallises the asymmetry between LIPs and professionals.¹⁹ Courts may forgive LIPs for good faith reliance on AI but will not tolerate similar conduct from lawyers. This codifies what Harber and Ayinde implied: the profession is held to a higher standard.
SRA and BSB
The Solicitors Regulation Authority and Bar Standards Board have reiterated that AI is merely a support tool.²⁰ Responsibility for accuracy remains personal and non-delegable. The duties of candour to the court and competence in representation cannot be outsourced.
These regulators align with international counterparts, reinforcing the principle that technological assistance does not absolve professional responsibility.
4. Technical Causes of Hallucination
Hallucinations in LLMs are best understood not as software “bugs” but as emergent properties of probabilistic text generation. They arise from the fundamental architecture and training constraints of these systems.
1. Predictive Generation
LLMs are autoregressive: they generate text by predicting the most likely next token given the prior context. The model therefore optimises for plausibility within a linguistic distribution rather than for truthfulness. Thus, when prompted for legal authorities, the system produces text that resembles legal reasoning but has no epistemic awareness of whether a cited authority exists (Bommarito and Katz 2023).²¹
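To make this concrete, the toy sketch below illustrates greedy autoregressive decoding over an invented next-token probability table (no real model, party, or citation is involved): at every step the single most probable continuation is emitted, and nothing in the procedure ever asks whether the text it assembles refers to anything that exists.

```python
# Toy illustration of greedy autoregressive decoding. The vocabulary and
# probabilities below are invented for demonstration; they are not drawn from
# any real language model or any real case law.

TOY_MODEL = {
    # context (tuple of previous tokens) -> {candidate next token: probability}
    (): {"Smith": 0.6, "The": 0.4},
    ("Smith",): {"v": 0.9, "LLP": 0.1},
    ("Smith", "v"): {"Jones": 0.7, "Brown": 0.3},
    ("Smith", "v", "Jones"): {"[2021]": 0.8, "concerned": 0.2},
    ("Smith", "v", "Jones", "[2021]"): {"EWCA": 0.6, "UKSC": 0.4},
}

def greedy_decode(max_tokens: int = 5) -> list[str]:
    """Emit the most probable next token at each step.

    The loop optimises only for likelihood under TOY_MODEL; there is no
    check that the resulting "citation" denotes a real case.
    """
    tokens: list[str] = []
    for _ in range(max_tokens):
        distribution = TOY_MODEL.get(tuple(tokens))
        if not distribution:
            break
        tokens.append(max(distribution, key=distribution.get))
    return tokens

if __name__ == "__main__":
    # Prints a plausible-looking but entirely fictitious fragment:
    # "Smith v Jones [2021] EWCA"
    print(" ".join(greedy_decode()))
```

Real models sample from vastly larger learned distributions, but the objective is the same: the output is whatever reads as most likely, which is precisely why a non-existent authority can emerge fully formed.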
2. Training Data Gaps
Because major legal databases such as Westlaw and LexisNexis are proprietary and excluded from open training corpora, the model often lacks access to authoritative, comprehensive case law. In such scenarios, the model extrapolates based on statistical patterns it has learned from partial or publicly available sources, effectively “filling in” with plausible but fabricated authorities (Yeung 2024).²²
3. Jurisdictional Conflation
The common law tradition’s shared terminology across jurisdictions (e.g. “Supreme Court”, “Court of Appeal”) creates embedding overlaps within the model’s semantic space. Without explicit jurisdictional metadata, the model blends Canadian, UK, and US precedents into composite fictions, generating citations or doctrines that appear legally coherent but in fact straddle incompatible legal systems (Susskind 2022).²³
4. Stylistic Illusion
LLMs excel at capturing stylistic features such as neutral citations, case naming conventions, and judicial prose. This allows them to produce text that looks authoritative: fabricated cases may be formatted with the precision of actual legal opinions. The surface-level fidelity of style deceives readers into treating such text as authentic, even where the underlying content is spurious (Bommarito and Katz 2023).²¹
5. Lack of Verification
Unlike systems integrated with retrieval-augmented pipelines, current general-purpose LLMs generate responses without real-time validation against authoritative legal sources such as BAILII. This absence of a verification loop means that once the model outputs a high-probability but fictitious case, there is no corrective mechanism to check or suppress it (Yeung 2024).²²
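As a rough illustration of the loop that is missing, the sketch below assumes a hard-coded allow-list standing in for a trusted index such as BAILII or Westlaw (no real service is queried, and the pattern and example strings are illustrative only): citations are extracted from a draft and anything that cannot be matched is flagged for human verification.

```python
# Hedged sketch of a post-generation verification step. KNOWN_CITATIONS is a
# stand-in for a genuine authoritative index (e.g. a search of BAILII or
# Westlaw); nothing here queries a real database.
import re

KNOWN_CITATIONS = {
    "[2023] UKFTT 1007 (TC)",
    "[2025] EWHC 1383 (Admin)",
}

# Rough pattern for UK-style neutral citations, e.g. "[2023] UKFTT 1007 (TC)".
CITATION_PATTERN = re.compile(r"\[\d{4}\]\s+[A-Z]+\s+\d+(?:\s+\([A-Za-z]+\))?")

def flag_unverified(draft: str) -> list[str]:
    """Return citations in the draft that cannot be matched against the index."""
    found = CITATION_PATTERN.findall(draft)
    return [citation for citation in found if citation not in KNOWN_CITATIONS]

submission = (
    "See Harber v HMRC [2023] UKFTT 1007 (TC) and the invented authority "
    "Smith v Jones [2024] UKHC 9999 (Ch)."
)
# Prints ['[2024] UKHC 9999 (Ch)'] - flagged for human checking, not auto-rejected.
print(flag_unverified(submission))
```

Even with such a filter, responsibility stays with the practitioner: a citation that passes a string match may still be misquoted or misapplied, so the check supplements rather than replaces human verification.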
As Bommarito and Katz argue, hallucination is “an inevitable consequence of probabilistic language modelling, not a correctable anomaly.”²⁴ The principal danger lies not in the generation of such text itself, but in practitioners mistaking linguistic plausibility for epistemic reliability.
5. Legal and Ethical Implications
Regulatory Sanctions
Courts and regulators have imposed sanctions ranging from practice restrictions (Handa) to warnings (Zhang). Misleading the court, even inadvertently, may breach the core duty of candour.²⁵
Professional Negligence
Clients harmed by reliance on hallucinated law may sue. Insurers may refuse cover if practitioners failed to verify outputs.²⁶
Reputational Fallout
Association with AI mistakes, even without sanctions, can irreparably damage credibility.
Abuse of Process
Judges have signalled that submitting hallucinated authorities may amount to abuse of process, exposing practitioners to personal costs orders.²⁷
Asymmetry: LIPs vs Lawyers
While courts show leniency to LIPs (Harber, Gunnarsson), they demand perfection from professionals (Ayinde, UKIPO).
Confidentiality Risk
AI tools can inadvertently breach confidentiality, with data supplied by users potentially becoming accessible to others through subsequent queries or system vulnerabilities.
Ethical Duties
The BSB Handbook requires barristers to act with honesty and integrity and to exercise competence.²⁸ Reliance on unverified AI output may breach both.
Systemic Risks
Routine hallucinations threaten judicial efficiency by forcing courts to fact-check every citation. This undermines the very efficiency AI promised.
6. Conclusion: From Structural Risk to Professional Responsibility
The global trajectory of case law, professional guidance, and regulatory standards demonstrates a clear and inescapable lesson: AI does not and cannot shift responsibility away from the lawyer. Hallucinations are a structural feature of probabilistic language models, not an incidental malfunction, and their recurrence in litigation has confirmed that courts will forgive ignorance from litigants-in-person but not from regulated professionals.
If the integration of AI into legal practice is to be sustainable, the profession must respond not with avoidance but with structured adaptation. Three interlocking recommendations emerge.
First, verification must be non-negotiable. Every authority, citation, or doctrinal proposition generated with the assistance of AI must be independently checked against reliable sources such as BAILII, Westlaw, or LexisNexis. Failure to verify is already being treated as negligence or misconduct, and this principle will only harden as jurisprudence develops.
Second, disclosure and record-keeping should become professional norms. Where AI materially contributes to client advice or court submissions, lawyers should disclose its use and retain verifiable records of their interactions. This not only protects the practitioner but also facilitates transparency with clients, regulators, and courts.
Third, confidentiality must be safeguarded. Practitioners should avoid inputting sensitive or privileged client data into publicly accessible AI systems. Without robust safeguards, confidentiality breaches risk undermining both professional privilege and public trust in the justice system.
Beyond these immediate safeguards, the long-term solution may lie in technical and institutional innovation: retrieval-augmented generation to constrain outputs, AI systems trained on verified legal corpora, and regulatory sandboxes to test safe deployment in practice. But the responsibility for accuracy, candour, and integrity will remain personal and non-delegable.
The legal profession thus faces a choice. AI can either become a trusted research assistant or a persistent liability. The difference will be made not by the technology itself but by the standards of competence, diligence, and ethical responsibility that lawyers bring to its use. Courts have made their position plain: the machine may hallucinate, but the advocate must not.
References
- Bar Council, AI and the Bar: Guidance (Jan 2024).
- Law Society, Generative AI: The Essentials (May 2025).
- Judicial Office, AI and the Courts: Guidance (Apr 2025).
- Mata v Avianca, Inc No 1:22-cv-01461 (SDNY, 2023).
- ibid, sanctions ruling, para 23.
- ibid, para 28.
- Harber v HMRC [2023] UKFTT 1007 (TC).
- Zhang v Chen 2024 BCSC 285 (British Columbia SC).
- ibid, para 42.
- Handa & Mallick [2024] FedCFamC2F 957.
- ibid, disciplinary ruling, para 19.
- Ayinde v London Borough of Haringey & Al-Haroun v Qatar National Bank [2025] EWHC 1383 (Admin).
- ibid, [2025] EWHC 1383, para 51.
- UKIPO Trademark Appeal (BL O/0559/25, June 2025).
- ibid, tribunal judgment, para 34.
- HMRC v Gunnarsson [2025] UKUT 00247 (TCC).
- Bar Council (n 1).
- Law Society (n 2).
- Judicial Office (n 3).
- SRA, Artificial Intelligence and Professional Responsibility (2024); BSB, Regulatory Statement on AI (2024).
- S Bommarito and M Katz, ‘Hallucinations in Large Language Models’ (2023) 20 Yale J L & Tech 1.
- K Yeung, ‘AI in Legal Research: Promise and Peril’ (2024) 47 L & Soc Rev 112.
- R Susskind, Tomorrow’s Lawyers (3rd edn, OUP 2022).
- Bommarito and Katz (n 21) 14.
- Handa & Mallick (n 10).
- D Howarth, ‘Professional Negligence in the Age of AI’ (2025) 141 LQR 217.
- Zhang v Chen (n 8).
- Bar Standards Board, BSB Handbook (2024 edn).