Voices

Arabic Artificial Intelligence in International Arbitration

Joseph Hearn

Joseph Hearn is a postgraduate law student, finishing his LLM in Legal Practice at BPP University in London before joining Stephenson Harwood as a trainee solicitor. He holds degrees from Cambridge and SOAS. He has studied Arabic in Jordan, Egypt and Oman. His LinkedIn profile is at https://www.linkedin.com/in/joseph-hearn-01/

 

Introduction

 

Artificial intelligence (‘AI’) provides vital tools for international arbitration, but the technology, particularly large language models (‘LLMs’), has problems when used with Arabic. Many AI tools for language processing, translation and legal research, often built on LLMs, are not refined for the language or draw on inadequate datasets. While some technology companies initially tried to address these gaps, that early momentum has slowed.

 

Arabic in international arbitration

 

Arbitration in the Arab world is largely conducted in English or Arabic. ‘Offshore’ seats such as the Dubai International Financial Centre tend to use English or another language mutually agreed by the parties.[1] But Arabic remains the default language in many domestic ‘onshore’ arbitration seats in the Arab world.[2] It is also the primary language of domestic courts in the region.

 

Arabic is also often the language used in documentary evidence in these arbitrations, which require arbitrators and lawyers to have a nuanced understanding of the language. This often entails using translation, which can be a minefield. Translation is an imprecise science, meaning translators often have to use contextually appropriate phrases rather than literal translations. Kılıç v. Turkmenistan hinged on differences between authentic and non-authentic versions of an investment treaty in different languages.[3] The tribunal said, ‘[A]ccurate translation of, for example, a sentence in one language into another, requires something more than a literal and word-for-word translation of each and every word ….’[4]

 

Arabic may also be an ‘anchor’ language, meaning the language of thought from which parties create translated documents such as treaties.[5] Versions of an agreement or treaty in multiple languages or with different anchor languages complicate disputes.[6] These problems often occur when parties seek to submit contextual evidence related to the formation of a contract. This is more common in civil law jurisdictions and international conventions. For example, the United Nations Convention on Contracts for the International Sale of Goods requires the court or arbitral body to consider ‘all relevant circumstances of the case including the negotiations’ when ‘determining the intent of a party’.[7] 

 

Use of AI in international arbitration

 

AI tools, in particular those using LLMs, have opened entirely new possibilities to bridge practical language gaps in cases containing large numbers of documents in Arabic. AI legal tools generally fit into four main categories: generalist LLMs; translation tools; legal-specific small language models and refined LLMs; and case and evidence management software. All these tools can play a role in improving processes and language accessibility in cases involving multiple languages by providing fast and customisable translation, research and processing. These possibilities have given rise to some understandable optimism. International arbitration practitioners in the Middle East and North Africa are showing ‘cautious optimism’ in adopting AI.[8] Internationally, the ICC Commission on Arbitration and ADR recently launched a Task Force on AI in International Dispute Resolution. Governments in the Middle East have also encouraged the development and use of these tools. The UAE published its National Strategy for Artificial Intelligence 2031, which envisages AI transforming the economy, and the Emirati government also plans to use AI to write and review legislation.[9] Similarly, Saudi Arabia’s state-owned AI company, Humain, has launched a $10 billion venture fund.[10]

 

However, there are also risks associated with excessive reliance on LLMs. They struggle with legal accuracy and sometimes hallucinate. In a recent hearing in the High Court of England and Wales regarding the misuse of AI tools, held under the court’s inherent power to regulate its own procedures, one senior judge warned, ‘Freely available generative artificial intelligence tools, trained on a large language model such as ChatGPT are not capable of conducting reliable legal research.’[11]

 

Small language models and specialised research or evidence management tools, including for technology assisted review (‘TAR’), are less likely to hallucinate or refer to misleading training data. They are especially likely to be helpful in expedited arbitrations, an increasingly popular procedure.[12] But they still have shortcomings, especially in Arabic.

 

AI’s own language gap

 

Arabic training data for AI is limited. LLMs of general use, such as ChatGPT, Claude, or Gemini, are trained and refined largely on English-language material.[13] Their training in Arabic lacks depth.[14] Only 3% of online content is available in Arabic and much of this material is in Modern Standard Arabic, relegating Arabic’s everyday dialects to the status of ‘low-resource’ languages for LLM training purposes.[15] This weakens the accuracy of AI tools for transcription, translation and TAR. 

 

This limitation in training makes those tools potentially inadequate for use in cases where, for example, documents and evidence are in a dialect of Arabic – which is not unusual. And misunderstandings of non-standard varieties of a language can have a fundamental impact on justice.[16] One small study in the US found several examples. These included the Supreme Court of Louisiana holding that a statement in ‘Black English’ – “why don’t you just give me a lawyer dog cause this is not what’s up” – was not a valid request for legal advice that would justify terminating a police interview.[17]

 

There is little research on AI tools exhibiting bias when working in Arabic dialects, but dialects of English face discrimination despite much larger training corpuses.[18] For example, ChatGPT’s GPT-4 was found to exhibit discriminatory bias in its responses to queries posed in non-standard varieties such as ‘Indian English’.[19] 

 

Even when using Modern Standard Arabic, AI tools such as chatbots refined for legal research face problems because legal datasets in Arabic are not comprehensive. Islamic law in Arabic informs the legal systems of much of the Arab world, but as Intisar Rabb, Professor of Islamic Law at Harvard, has written:

‘[There is] a severely low quantity of digitized Islamic sources for AI to produce research results on questions of Islamic law that are reliable, accurate, and representative. Only by increasing the Arabic and Islamic sources, and building tools that deliver metadata that researchers need to procure reliable results, can we begin to build the infrastructure for “Islamic AI” and to pursue questions about Islamic law or ethics with the help of AI.’[20]

 

Finally, researchers have suggested there is a shortage of accurate benchmarks to rate Arabic LLMs.[21] This is a problem for arbitrators and lawyers, who must use only reliable AI tools. The DIFC Courts’ AI Guidelines, for example, require that:

 

‘before submitting AI-generated evidence in the Courts, parties and practitioners should evaluate its reliability, taking into account the AI's training data, algorithms, and potential for bias or inaccuracies.’[22] 

 

Overall, LLMs and associated AI tools for international arbitration have limited capabilities in Arabic compared to English. This is especially true for Arabic dialects used in informal settings, which may appear in documentary evidence within arbitrations. Furthermore, legal datasets and benchmarks are not comprehensive for LLMs and associated tools targeted at Modern Standard Arabic.

 

What next?

 

Some AI companies have responded to these challenges by building Arabic-specific LLMs. In 2023 G42 launched Jais, a 13-billion-parameter model named after the highest point in the UAE.[23] The Technology Innovation Institute, funded by the Emirate of Abu Dhabi, is developing the Falcon series of Arabic LLMs. Qatar’s Fanar LLM focuses on Arabic dialects. However, these early efforts are falling behind the products of US and Chinese competitors, such as ChatGPT and DeepSeek. Companies such as G42 are pulling funds from frontier LLMs to focus on less ambitious tools.[24]

 

Smaller interventions may be the way forward. Companies and law firms can refine LLMs using Arabic-language material, build legal-specific small language models or develop legal datasets. Tarjama, a translation technology company based in Dubai, has built Pronoia, one of the few families of Arabic small language models refined for business and legal tasks.[25] Decree Tech maintains a new database of Omani legislation, searchable using AI. These tools mitigate some of the risks of using AI for Arabic tasks in international arbitration, but their capabilities lag behind demand. The Egyptian government announced last year it was integrating AI-powered transcription in courtrooms.[26] It is unclear whether such a tool yet exists, especially if it requires transcription of Arabic dialects. Without further targeted work, weaker Arabic-language AI capabilities risk further entrenching the dominance of English as a language of contracts and arbitrations.
 

[1] Article 21, DIAC Arbitration Rules.

[2] Including the UAE, Saudi Arabia, Sudan and Egypt. 

[3] Kılıç v. Turkmenistan, ICSID Case No. ARB/10/1.

[4] Stephan Wilske, ‘Linguistic and Language Issues in International Arbitration ─ Problems, Pitfalls and Paranoia’ (2016) 9 Contemporary Asia Arbitration Journal 159, 168.

[5] Tibor Várady, ‘Setting the Language (or Languages) of Arbitration’ in Stefan Kröll et al. (eds), Cambridge Compendium of International Commercial and Investment Arbitration (Cambridge University Press 2023).

[6] Michelle J Rozovics, ‘Drafting Multiple-Language Contracts’ (2011) 28 GPSolo 14.

[7] Article 8(3) UN CISG. 

[8] ‘MENA Arbitration Survey 2024’ (Hogan Lovells) 7 <https://www.hoganlovells.com/-/media/project/english-site/our-thinking/publication-pdfs/mena-survey.pdf> accessed 10 July 2025.

[9] Chloe Cornish, ‘UAE Set to Use AI to Write Laws in World First’ Financial Times (20 April 2025) <https://www.ft.com/content/9019cd51-2b55-4175-81a6-eafcf28609c3> accessed 10 July 2025; ‘UAE National Strategy for AI 2031’ <https://ai.gov.ae/wp-content/uploads/2021/07/UAE-National-Strategy-for-Artificial-Intelligence-2031.pdf> accessed 10 July 2025.

[10] Andrew England and Ahmed Al Omran, ‘Saudi Arabia Seeks to Use Financial Might to Muscle into Global AI Industry’ Financial Times (28 May 2025) <https://www.ft.com/content/176c7859-fdda-40d2-92a5-15d570f7accf> accessed 10 July 2025.

[11] Ayinde, R (On the Application Of) v London Borough of Haringey [2025] EWHC 1383 (Admin) (Dame Victoria Sharp P).

[12] Nick Hilborne, ‘Arbitration Specialists Split’ (28 April 2025) <https://www.legalfutures.co.uk/latest-news/arbitration-specialists-split-on-faster-procedures-but-see-role-for-ai> accessed 24 July 2025.

[13] Sara Ruberg, ‘When A.I. Fails the Language Test’ The New York Times (26 July 2024) <https://www.nytimes.com/2024/07/26/technology/ai-language-gap.html> accessed 15 July 2025.

[14] Juan N Pava et al., ‘Mapping the Challenges of LLM Development in Low-Resource Language Contexts’ (Stanford Institute for Human-Centered Artificial Intelligence 2025).

[15] Mafaza Chabane et al., ‘Advancing Low-Resource Dialect Identification’ (2025) 284 Expert Systems with Applications 127816.

[16] ‘Dialectal Due Process’ (2023) 136 Harvard Law Review <https://harvardlawreview.org/print/vol-136/dialectal-due-process/> accessed 16 July 2025.

[17] Ibid.

[18] For analysis of bias in Arabic AI, see Mussa Saidi Abubakari, ‘Overviewing Biases in Generative AI-Powered Models in the Arabic Language: AI Fairness for Sustainable Future’, Achieving Sustainability in Multi-Industry Settings With AI (IGI Global Scientific Publishing 2025) <https://www.igi-global.com/chapter/overviewing-biases-in-generative-ai-powered-models-in-the-arabic-language/373872> accessed 25 August 2025.

[19] Ritwik Gupta et al., ‘Linguistic Bias in ChatGPT’ (Berkeley AI Research Blog) <http://bair.berkeley.edu/blog/2024/09/20/linguistic-bias/> accessed 15 July 2025.

[20] Intisar Rabb, ‘Testing AI Research Agents for Islamic Law’ (Islamic Law Blog, 21 March 2025) <https://islamiclaw.blog/2025/03/21/roundtable-the-book-and-ai-part-2-testing-ai-research-agents-for-islamic-law/> accessed 15 July 2025.

[21] Yasser Ashraf et al., ‘Arabic Dataset for LLM Safeguard Evaluation’, Human Language Technologies (Volume 1: Long Papers) (Association for Computational Linguistics 2025) 5529 <https://aclanthology.org/2025.naacl-long.285/> accessed 15 July 2025.

[22] ‘DIFC Courts Practical Guidance Note No. 2 of 2023’ <https://www.difccourts.ae/rules-decisions/practice-directions/practical-guidance-note-no-2-2023-guidelines-use-large-language-models-and-generative-ai-proceedings-difc-courts> accessed 15 July 2025.

[23] Simeon Kerr and Madhumita Murgia, ‘UAE Launches Arabic LLM’ Financial Times (30 August 2023) <https://www.ft.com/content/ab36d481-9e7c-4d18-855d-7d313db0db0d> accessed 22 July 2025.

[24] ‘Mideast Titans Step Back’ Bloomberg UK (6 May 2025) <https://www.bloomberg.com/news/articles/2025-05-06/mideast-titans-step-back-from-ai-model-race-as-us-china-dominate> accessed 22 July 2025.

[25] Carrington Malin, ‘Tarjama& Launches Pronoia LLM’ (Middle East AI News, 14 April 2025) <https://www.middleeastainews.com/p/uae-tarjama-launches-ai-models> accessed 15 July 2025.

[26] Shahd Hashem, ‘Egypt to Integrate AI Transcription Tools in Courtrooms’ Ahram Online (29 August 2024) <https://english.ahram.org.eg/News/530950.aspx> accessed 11 July 2025.