Methodological premise

This contribution does not aim to offer a civil procedural analysis of the judicial decisions that will be discussed — an area in which extensive legal scholarship already exists. The objective is different: to take these decisions as a vantage point for an analysis that operates on two distinct yet complementary levels.

The first level is regulatory: the aim is to examine the European and national regulatory framework invoked by the courts — in particular Regulation (EU) 2024/1689 (AI Act) and Italian Law No. 132/2025 — assessing the technical accuracy of the references made in the reasoning and, above all, questioning the scope of the principle of human oversight beyond the perimeter of high-risk AI systems.

The second level is technical-methodological: starting from the critical elements identified by the courts — the absence of the prompt, the failure to verify references, the misunderstanding of how language models operate — the contribution reflects on the conditions for a responsible use of generative artificial intelligence in the legal context, with particular attention to the topic of legal prompting.

1. The decisions of the first quarter of 2026: AI in Italian civil proceedings

In the first months of 2026, several Italian courts — at both first instance and appellate level — ruled on various uses of artificial intelligence in litigation. Two first-instance decisions bear the same date of 20 February 2026; an appellate judgment from the Turin Court of Appeal followed on 19 March 2026. This body of decisions, concentrated within a narrow timeframe, offers a privileged vantage point for mapping the state of Italian judicial thinking on the matter.

This contribution focuses on the decisions of the Tribunals of Ferrara and Syracuse and on the judgment of the Turin Court of Appeal, which present the most relevant aspects from the regulatory and technical-methodological perspectives that concern us here. A brief mention will also be made of the judgment of the Milan Tribunal, Employment Division, of 11 February 2026 (No. 721/2026), in which the use of AI concerned not the lawyer’s advocacy but the activity of a party-appointed technical expert (CTP), introducing a further dimension — the interaction between AI tools and expert evidence in litigation — that would merit separate analysis.

1.1. Tribunal of Ferrara, order of 20 February 2026 (r.g. 2107/2025)

The Ferrara case presents, in my view, the most significant set of facts from the perspective that concerns us here. In the context of a petition for a preventive technical consultation under Article 696-bis of the Italian Code of Civil Procedure (c.p.c.), relating to a fatal road accident, the applicant had filed among the documents a “conversation with ChatGPT” (doc. 10), apparently in support of their claims against the motorway operator.

Judge Marianna Cocca classified this filing as tamquam non esset, excluding not only its status as a document in the procedural sense, but also its admissibility as atypical evidence. The reasons underlying this conclusion are essentially threefold: the absence of the query originally submitted to the chatbot (the prompt), the irrelevance of the case law references generated by the system, and the complete lack of human verification of the results.

What distinguishes this decision from prior case law is the level at which it operates: this is not about the use of AI as a drafting support tool for legal submissions — a matter already addressed, among others, by the Tribunal of Florence (order of 14 March 2025), the Lazio Regional Administrative Tribunal (judgment No. 4546 of 3 March 2025), the Tribunal of Latina (judgments Nos. 1034 and 1035 of 24 September 2025), and the Tribunal of Turin (judgment No. 21120 of 16 September 2025) — but rather the attempt to elevate a chatbot’s output to the status of evidentiary material to be submitted in Court. As the judge herself observed, “a further reflection is needed: that on the use of these systems to ‘constitute’ evidentiary material for submission in proceedings”.

At the regulatory level, the order establishes a direct link between the specific case and the European and national regulatory frameworks. Regulation (EU) 2024/1689 (AI Act) is invoked in light of the principles of human oversight and responsible AI, which the judge considers directly applicable in the Italian legal order by virtue of the combined provisions of Articles 11 and 117 of the Italian Constitution. Italian Law No. 132/2025 is also referenced, specifically Article 13(2), which imposes on intellectual professionals the obligation to inform the client of the use of AI systems — an obligation for which there was no evidence in the power of attorney granted by the applicant.

1.2. Tribunal of Syracuse, judgment No. 338 of 20 February 2026

On the same day, the Tribunal of Syracuse (Judge Alfredo Spitaleri) dealt with a different but complementary set of facts. In a civil case concerning a sublease agreement, the plaintiff’s counsel included in their written submissions four purported Supreme Court of Cassation precedents, complete with quoted passages, which, upon verification through the Cassation’s Electronic Documentation Centre (CED) and legal databases, were found to be non-existent.

The Tribunal attributed the error to the uncritical use of a generative AI tool, classifying the conduct as grossly negligent and imposing a penalty under Article 96(3) and (4) c.p.c. — comprising an additional payment of EUR 14,103 to the defendant and EUR 2,000 to the Cassa delle ammende.

In my reading, the most relevant aspect of the Syracuse decision lies in its construction of a standard of professional diligence: the judge held that awareness of the hallucination phenomenon in large language models (LLMs) constitutes, in 2026, a matter of common knowledge that can be expected of any legal professional. Ignorance of this phenomenon is not a mitigating factor, but an indicator of gross negligence.

1.3. Turin Court of Appeal, judgment of 19 March 2026

The judgment of the Turin Court of Appeal (Fifth Civil Division, Specialised Section for Enterprise Matters, President Germano Cortese, Reporting Judge Marino, deliberated on 17 March 2026) intervenes on a different level and, in my assessment, introduces an unprecedented dimension in the case law on this matter.

The substantive dispute concerned the validity of European patent EP 1 951 483 (an automatic cutting machine for sheet materials): the appellant sought a declaration of invalidity on the grounds of lack of inventive step. The aspect relevant to this contribution is as follows: to demonstrate that the patented technical solution — the curvature of the walls of the suction box to resist deformation — was the expression of a well-known physical principle (common general knowledge), the appellant filed, both in the final written submissions and in notes dated 27 January 2026, responses obtained from queries submitted to ChatGPT and the Google search engine, also requesting that the chatbot “refer to the knowledge available in 2004” (the year preceding the filing of the contested patent).

The Court of Appeal addressed the issue on two distinct levels.

On the first level, that of procedural admissibility, the filings were declared inadmissible as untimely — with a significant clarification: even queries reproduced within the body of a written submission constitute, in the Court’s view, a “new document” subject to procedural time-bars.

On the second level, that of evidentiary value, the Court formulated an autonomous principle which, in my view, marks an advance over the February case law: the party that submits AI output in proceedings bears the burden of proving the reliability of the system used. The Court specifically identified three aspects in respect of which such proof was lacking: the breadth of the knowledge available to the software, in both quantitative and qualitative terms; the ability to avoid so-called “hallucinations”, i.e., erroneous responses; and the overall validity of the responses provided.

This passage deserves close attention. In Ferrara, ChatGPT’s output was declared tamquam non esset due to the missing prompt and lack of verification. In Syracuse, the failure to verify constituted gross negligence. In Turin, the Court of Appeal takes a further step: it does not merely note the unreliability of the output, but places on the party wishing to rely on AI output in proceedings the burden of pleading and demonstrating the reliability of the system. In the absence of such a demonstration, “no evidentiary value can be recognised in the responses produced”.

A further point of interest concerns the Court’s technical characterisation of AI systems. The reasoning states that ChatGPT and Google “by definition, do not ‘invent’ anything but simply ‘scan’ the knowledge and experience available on the internet and, based on the principle of statistical probability, provide their response”. This formulation, which equates the operation of a search engine with that of a large language model, is technically inaccurate: a search engine indexes and returns existing content, whereas an LLM generates linguistic sequences through statistical inference over a trained parametric model, without “scanning” the internet at the moment it produces a response. This observation does not diminish the correctness of the conclusion reached by the Court — which remains, in my view, entirely sound — but it signals, once again, the need for more in-depth technical training of legal practitioners on the characteristics of the AI systems they are called upon to assess.
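To make the distinction concrete, the following minimal Python sketch contrasts the two mechanisms with toy data. It is purely illustrative and makes no claim to describe how Google or ChatGPT are actually implemented: the point is only that retrieval returns documents that already exist, whereas generation produces new token sequences by sampling from statistics learned at training time.

```python
import random

# Toy "search engine": an index of documents that already exist.
INDEX = {
    "suction box": ["Doc A: curved walls resist deformation under vacuum."],
    "cutting machine": ["Doc B: automatic cutting machines for sheet materials."],
}

def search(query: str) -> list[str]:
    """Retrieval: return only pre-existing documents matching the query."""
    return [doc for key, docs in INDEX.items() if key in query for doc in docs]

# Toy "language model": transition statistics fixed at training time
# (a crude stand-in for a parametric model's learned weights).
NEXT_TOKEN = {
    "curved": [("walls", 0.7), ("surfaces", 0.3)],
    "walls": [("resist", 0.6), ("deform", 0.4)],
    "resist": [("deformation.", 1.0)],
}

def generate(last_token: str, max_tokens: int = 5) -> str:
    """Generation: sample a continuation token by token from learned statistics;
    nothing is looked up on the internet at the time of the response."""
    out, token = [], last_token
    for _ in range(max_tokens):
        choices = NEXT_TOKEN.get(token)
        if not choices:
            break
        tokens, weights = zip(*choices)
        token = random.choices(tokens, weights=weights)[0]
        out.append(token)
    return " ".join(out)

print(search("curvature of the suction box"))  # returns pre-existing content
print(generate("curved"))                      # produces a new, plausible sequence
```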

1.4. A mapping of levels

The decisions examined, read together and alongside the Milan judgment, delineate a taxonomy of the uses of AI in litigation that can be summarised as follows:

  • AI as evidence (Ferrara): the chatbot’s output is filed as a document in support of the party’s claims. Outcome: legal non-existence (tamquam non esset).
  • AI as an argumentative tool (Syracuse): the AI output is used to generate case law references for inclusion in written submissions. Outcome: aggravated liability under Article 96 c.p.c.
  • AI as proof of technical notoriety (Turin, Court of Appeal): the AI output is filed to demonstrate that a technical principle falls within common general knowledge in patent law. Outcome: inadmissibility for untimeliness and, in any event, absence of evidentiary value in the absence of proof of the system’s reliability.
  • AI as expert technical support (Milan): AI is employed by a party-appointed technical expert in an employment dispute. Outcome: issues relating to methodology and the reliability of the expert report.

What unites these cases, beyond their procedural differences, is a single overarching principle: the absence of human oversight over AI output radically undermines its usability in the legal context. The Turin Court of Appeal judgment adds a significant corollary to this principle: it is not sufficient to declare that AI has been used; rather, it must be demonstrated that the system employed is reliable, thereby shifting the focus from mere verification of the output to the qualification of the tool itself.

2. The regulatory framework: AI Act, Italian Law No. 132/2025, and the universal scope of human oversight

2.1. The references made by the courts

As noted, the Ferrara Tribunal’s order expressly references Regulation (EU) 2024/1689 and the principles of human oversight and responsible AI, which the judge considers directly applicable in the Italian legal order by virtue of the combined provisions of Articles 11 and 117 of the Constitution. It also refers to Italian Law No. 132/2025 and the disclosure obligation under Article 13(2).

It should be noted that the expression responsible AI, used by Judge Cocca in her reasoning, does not correspond to an autonomous legal category codified in the AI Act. The Regulation does not employ this expression as a technical-normative term; rather, it is a concept frequently found in the EU institutional lexicon — present in preparatory documents, Commission communications, and the 2019 Ethics Guidelines for Trustworthy AI — which the judge invoked as a general principle.

This regulatory reference, in my view, merits careful scrutiny as to its technical precision.

2.2. The systematic placement of human oversight within the AI Act

Article 14 of Regulation (EU) 2024/1689 governs human oversight with specific reference to high-risk AI systems (Chapter III, Section 2). The provision requires that such systems be designed and developed so that natural persons can effectively oversee them during use.

ChatGPT, as a general-purpose AI system built on a general-purpose AI (GPAI) model, does not as such fall within the scope of Article 14; the underlying model is instead subject to the distinct regime that Chapter V of the Regulation lays down for general-purpose AI models. The obligations under Chapter V primarily concern providers of GPAI models (transparency obligations, technical documentation, cooperation with authorities) and do not directly overlap with the human oversight requirements formulated for high-risk systems.

There is, therefore, a tension between the reference made by Judge Cocca — who invokes the AI Act’s human oversight as a principle of general scope — and the Regulation’s normative structure, which anchors that requirement to a specific category of systems.

2.3. Beyond risk classification: human oversight as a cross-cutting principle

However — and this is where the position I have developed in my academic work is situated — I argue that human oversight must be understood as a principle applicable regardless of the risk classification assigned to the AI system.

In a paper published in 2024 (N. Fabiano, “AI Act and Large Language Models (LLMs): When critical issues and privacy impact require human and ethical oversight”, arXiv:2404.00600), I argued that the evolution of Large Language Models necessitates an approach to human oversight that is not limited to systems classified as high-risk. LLMs, while not necessarily falling within the high-risk category under Annex III of the Regulation, exhibit characteristics — in terms of opacity of operation, potential impact on fundamental rights, and the hallucination phenomenon — that make human oversight not merely advisable, but necessary.

This position was further developed in my 2025 contribution (“Affective Computing and Emotional Data: Challenges and Implications in Privacy Regulations, The AI Act, and Ethics in Large Language Models”, arXiv:2509.20153), in which I examined the implications of neural architectures (CNNs and RNNs) underlying emotion recognition systems, showing how even in apparently low-risk contexts, significant concerns may arise in terms of privacy, individual autonomy, and the need for oversight.

The central point of my thesis is the following: human oversight is not (and should not be) merely a technical-normative requirement applicable to a predefined list of high-risk systems. It is a principle grounded in the protection of fundamental rights of the individual and in the need to preserve human decision-making autonomy in the face of systems whose operation inherently presents margins of unpredictability and error.

From this perspective, the reference made by the Ferrara Tribunal — though not fully precise in its systematic placement within the AI Act — captures a substantively correct principle: when a professional uses a generative AI system in the course of their professional activity, human oversight of the output is always necessary, regardless of the formal classification of the system used.

The Turin Court of Appeal’s judgment reinforces this reading by developing an operational corollary: human oversight is not exhausted by ex post verification of the output, but also entails an ex ante assessment of the reliability of the tool. By placing on the party the burden of demonstrating the breadth of the system’s knowledge base, its ability to avoid hallucinations, and the validity of the responses, the Court translates the abstract principle of human oversight into a concrete evidentiary requirement. That is a step that, in my assessment, is consistent with the AI Act’s framework: human oversight is not a formal act but a process that requires competence, awareness, and the capacity to critically evaluate the tool being used.

2.4. Italian Law No. 132/2025 and the disclosure obligation

The reference to Article 13(2) of Italian Law No. 132/2025 made in the Ferrara order introduces a further relevant dimension. The provision imposes on intellectual professionals the obligation to inform the client if they intend to use artificial intelligence systems in the course of their professional engagement, establishing that the use of AI is permitted exclusively in a support function, with human intellectual work predominating.

A preliminary clarification is necessary here: Italian Law No. 132/2025 is not, in a technical sense, a transposition measure for the AI Act. Regulation (EU) 2024/1689 is directly applicable in the Member States without the need for transposition. The Italian law positions itself rather as complementary legislation that intervenes in the areas of competence left to Member States by the European Regulation, introducing, among other things, specific obligations for intellectual professions that the AI Act does not directly regulate. Italian Law No. 132/2025 itself provides, in Article 1(2), that its provisions are to be interpreted and applied “in conformity with” Regulation (EU) 2024/1689.

Judge Cocca observed that, “there being no evidence, at least in the power of attorney granted by the applicant, of the notice required by Article 13(2) of Italian Law No. 132/2025”, a violation of the transparency obligation towards the client is established.

This dimension, in my view, reinforces the reading of human oversight as a multi-layered principle: it is not solely about the professional’s verification of the system’s output (technical oversight), but also about transparency towards the recipient of the professional service (relational oversight). The professional who uses AI must be transparent both towards the Court and towards their own client — a dual accountability obligation supported by the AI Act’s overall framework and complementary national legislation.

3. The technical-methodological level: the conditions for a responsible use of AI and legal prompting

3.1. The absence of the prompt in the Ferrara order

One of the central elements of the Ferrara order’s reasoning is its censure of the absence of the query (prompt) submitted to the chatbot. The judge observed that the document produced was “manifestly partial and, therefore, potentially misleading, lacking the query underlying the request submitted to the AI”.

This seemingly procedural observation opens a broader reflection. If the absence of the prompt is an element that contributes to the evidentiary irrelevance of the output, the question inevitably arises: would the presence of the prompt, together with verification of the results, have led to a different qualification?

3.2. The conditions for a methodologically sound use

It is my opinion that the answer to this question cannot be unconditionally affirmative. Nonetheless, the case law under examination enables us to identify, a contrario, the minimum elements that a methodologically responsible use of AI in the legal context should exhibit:

a) Traceability of the prompt. The query submitted to the system must be documented and made available. Without the prompt, it is impossible to assess the context of the response, any specific instructions given to the system, and the user’s level of awareness regarding the tool’s limitations. Prompt traceability is not a formality: it is a necessary condition for any verification, whether by the professional or by the Court (a minimal sketch of how such traceability could be kept in practice follows this list).

b) Verification of primary sources. Any legislative, case law, or scholarly reference generated by an AI system must be independently verified against official legal databases. As the Tribunal of Syracuse effectively observed, generative AI models do not constitute case law databases, but systems that produce statistically plausible linguistic sequences without guaranteeing the truthfulness of the information. This verification is not an option: it is a professional obligation whose breach, as demonstrated by the Syracuse judgment, constitutes gross negligence.

c) Disclosure of AI use. Consistent with Article 13(2) of Italian Law No. 132/2025, the professional must disclose the use of AI systems, both in the power of attorney granted by the client and, in my view, in the documents filed in Court. Transparency regarding the use of the tool serves the purpose of judicial oversight and the protection of the adversarial process.

d) Awareness of technical limitations. The professional who uses a generative AI system must be aware of its limitations — first and foremost, the hallucination phenomenon — and this awareness, as stated by the Tribunal of Syracuse, now constitutes common knowledge in 2026.

e) Qualification of the system’s reliability. The Turin Court of Appeal’s judgment adds a further requirement: anyone wishing to rely on AI output in proceedings must be able to document the reliability of the system used — in terms of its knowledge base, its ability to avoid errors, and the overall validity of its responses. That is a requirement that shifts the perspective from the mere verification of the output to the assessment of the tool itself, and that places on the professional a burden of technical competence that goes beyond merely consulting a chatbot.
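By way of illustration only, the short Python sketch below shows one way in which points a) and b) could be supported in practice: every exchange with the system is logged with a timestamp, a model identifier and a content hash, and any passage that looks like a case law citation is flagged for independent human verification. All names, the log format and the flagging heuristic are my own assumptions, not an established standard; the verification itself remains a human task to be performed against official databases.

```python
import hashlib
import json
from datetime import datetime, timezone

AUDIT_LOG = "ai_audit_log.jsonl"  # hypothetical local audit trail

def log_interaction(model_id: str, prompt: str, response: str) -> str:
    """Append prompt, response, model identifier and timestamp to the audit log;
    return a SHA-256 digest that can later show the record was not altered."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_id": model_id,   # which system/version was used (cf. point e)
        "prompt": prompt,       # point a): the query must be preserved
        "response": response,
    }
    record["sha256"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode("utf-8")
    ).hexdigest()
    with open(AUDIT_LOG, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record["sha256"]

def citations_to_verify(response: str) -> list[str]:
    """Point b): naively flag lines that look like Italian case law citations,
    so that each one is checked by a human against official sources."""
    markers = ("Cass.", "Cassazione", "Trib.", "Corte")
    return [line.strip() for line in response.splitlines()
            if any(m in line for m in markers)]

# Usage sketch: log the exchange, then verify every flagged citation manually.
response_text = "Cass. civ., sez. ___, n. ____/____ (placeholder citation)"
log_interaction("example-llm-version", "example prompt text", response_text)
for citation in citations_to_verify(response_text):
    print("verify manually against official databases:", citation)
```

A log of this kind does not make the output reliable; it only makes the use of the tool verifiable, which is precisely the precondition the Ferrara order found lacking.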

From these considerations, a topic of growing relevance emerges: legal prompting — understood as the ability to formulate queries to an AI system in a methodologically correct manner, with awareness of the tool’s limitations and directed towards obtaining useful and verifiable results — is not a mere technical expedient, but a genuine professional competence.
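To give a concrete, purely illustrative idea of what such a competence might translate into, the following sketch shows a structured prompt template (the wording and placeholders are my own assumptions, not a validated standard): context and jurisdiction are stated explicitly, the system is asked to separate sources from reasoning, and uncertainty must be declared rather than masked.

```python
# Illustrative only: a structured template for a legal prompt. Instructions such as
# "do not invent citations" reduce, but do not eliminate, the risk of hallucinations;
# every reference generated must still be verified against official sources.

LEGAL_PROMPT_TEMPLATE = """\
Role: drafting aid; the output will be independently verified by the professional.
Jurisdiction and subject matter: {jurisdiction} - {subject}.
Task: {task}
Constraints:
- Cite only decisions you can identify by court, number and date; if unsure, say so.
- Separate clearly: (1) cited sources, (2) reasoning, (3) open questions.
- Mark every uncertain statement as uncertain; do not present guesses as facts.
"""

print(LEGAL_PROMPT_TEMPLATE.format(
    jurisdiction="Italy, civil proceedings",
    subject="lease and sublease agreements",
    task="outline the main arguments on termination, without drafting the brief.",
))
```

A template of this kind makes the query traceable and the output easier to audit, but it does not shift responsibility onto the tool: the verification obligations described above remain entirely with the professional.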

The point is relevant from a regulatory perspective. The AI Act, in its overall framework, rests on the practical assumption that users of AI systems — and professionals all the more so — possess the competences necessary for a responsible use of the technology. Similarly, Italian Law No. 132/2025 introduces obligations that presuppose, on the part of the professional, an understanding of the operation and limitations of the systems used.

The current case law landscape, however, demonstrates that this competence is frequently absent. The case of the Ferrara lawyer, who filed a conversation with ChatGPT without the original prompt and without any verification of the cited references, and that of the appellant in Turin, who asked ChatGPT to “refer to the knowledge available in 2004” — as if a language model could selectively circumscribe its knowledge base to a specific year — are emblematic of an approach in which the tool is used without the slightest awareness of its characteristics and limitations.

The challenge that emerges is therefore twofold: on the one hand, the training of professionals in the responsible use of generative AI tools; on the other, the definition of standards and protocols that render the principle of human oversight operational in the interaction between professionals and AI systems.

4. The international perspective

4.1. Mata v. Avianca: the US precedent

The case of Mata v. Avianca, Inc. (678 F. Supp. 3d 443, S.D.N.Y. 2023), decided by Judge P. Kevin Castel, is generally regarded as the seminal case in the body of case law on the improper use of generative AI in the forensic context.

In that case, two lawyers had included in their written submissions references to six non-existent judicial decisions generated by ChatGPT, and had then insisted on their existence even after both the opposing party and the judge had expressed doubts. The judge found that the lawyers’ conduct constituted subjective bad faith sufficient to impose sanctions under Rule 11 of the Federal Rules of Civil Procedure, ordering a pecuniary sanction of USD 5,000 and requiring the lawyers to send letters of apology to the judges falsely identified as the authors of the fabricated opinions.

The Mata v. Avianca case, however, is situated in a regulatory context profoundly different from the European one. The United States does not have a comprehensive regulatory framework on artificial intelligence comparable to the European AI Act. The sanctions imposed in Mata v. Avianca are based on general procedural instruments (Rule 11) and on the judge’s inherent authority, not on normative principles specifically dedicated to human oversight of AI.

4.2. The CCBE Guide of 2 October 2025

At the European level, the Council of Bars and Law Societies of Europe (CCBE) published on 2 October 2025 the Guide on the Use of Generative AI by Lawyers, a document aimed at raising awareness within the legal profession about the risks and opportunities of generative AI.

The CCBE Guide highlights, among other risks, the possibility that generative AI systems may produce factually incorrect or illogical outputs (hallucinations), the retention of client data by AI systems without the user’s awareness, and the need to preserve professional independence in relation to algorithmic biases. In terms of principles, the Guide emphasises the need for independent verification of outputs, transparency towards the client, and professional competence in the use of technology.

The CCBE Guide, while not binding, represents a significant contribution to the definition of professional standards for the use of AI by European lawyers. It is noteworthy that the principles set out in the Guide — professional competence, independent verification, transparency — substantially coincide with the elements that the Italian courts have identified, through case law, as conditions for a responsible use of AI in proceedings.

4.3. The structural advantage of the European approach

The comparison between the US and European experience reveals, in my reading, a structural advantage of the European Union’s approach. While in the United States the response to the problem of improper use of AI in proceedings relies on general procedural instruments and case-by-case judicial responses, Europe has a codified regulatory framework — the AI Act — that provides specific legal principles and categories.

However, this structural advantage risks remaining partially unexpressed if the regulatory framework is not applied with adequate technical understanding. As I observed in the analysis of the Ferrara order, the reference to the AI Act’s human oversight is substantively correct in principle, but not fully precise in its systematic placement. Similarly, the characterisation offered by the Turin Court of Appeal — which describes ChatGPT and Google as systems that “scan” the knowledge available on the internet — reveals a still approximate understanding of the difference between a search engine and a large language model. That signals the need for more in-depth training of legal practitioners — judges included — on the categories and structure of the European Regulation, as well as on the technical characteristics of the AI systems they are called upon to assess.

Conclusions

The decisions of the first quarter of 2026 — Ferrara, Syracuse, Turin, Milan — mark a significant moment in Italian case law on artificial intelligence in litigation. Not so much for the conclusions reached — which, as legal scholars have correctly observed, were “in many respects predictable” — but rather for the questions they open and, in the case of the Turin Court of Appeal judgment, for the jurisdictional level to which the reflection has been elevated.

The first question concerns the scope of human oversight. As I have argued in my academic contributions, human oversight cannot be confined to the perimeter of high-risk AI systems; it must be understood as a cross-cutting principle applicable to any use of artificial intelligence that affects the fundamental rights of the individual. The Italian case law of the first quarter of 2026, notwithstanding the systematic imprecisions identified, moves in the same direction. The Turin Court of Appeal, in particular, enriches this principle with operational substance: human oversight also implies the ability to assess and document the reliability of the system used.

The second is the question of legal prompting as a professional competence. The absence of the prompt, the failure to verify references, the misunderstanding of how LLMs work, up to and including the request, made to the chatbot in the Turin case, to “refer to the knowledge” of a specific year, as if a language model could perform a temporal selection of its own parametric base: all of these elements, which the courts have identified as critical failings, are attributable not to the tool itself but to a lack of competence in its use. Chatbots, as Judge Cocca correctly stated, “remain tools at the service of those who choose to use them”. Responsibility is, and must remain, with the human being.

The third is the question of training. The AI Act, Italian Law No. 132/2025, the CCBE Guide: the regulatory and soft law framework exists and is rapidly evolving. But its effective application requires a technical understanding that — as the decisions examined demonstrate — cannot be taken for granted by either lawyers or judges.

Ultimately, the question is no longer whether artificial intelligence enters the courtroom — it has already entered, and it will not leave. The question is: under what conditions is its use legally governable? The answers provided by the case law of the first quarter of 2026 represent a first, significant attempt to establish those conditions. The challenge, now, is to translate them into a systematic framework that reconciles the regulatory rigour of the European legislator with the technical competence of those who, every day, use these tools in their professional practice.


Related Hashtags

#HumanOversight #LegalPrompting #AIAct #AI #ArtificialIntelligence #GenerativeAI #LLM #ResponsibleAI #LegalTech #EURegulation
#ChatGPT #AIinLaw #DigitalRights #FundamentalRights #CCBE #LegalInnovation #AIGovernance #ProfessionalDiligence #AICompliance #GPAI