ALTEN Group Case Studies Sharing

A global online technology leader turned to ALTEN’s CIeNET for the creation of a reliable framework for testing and enhancing LLM capabilities and ensuring precise and efficient natural language-to-SQL query generation for complex datasets. The result: significant improvements in the ability to generate accurate queries based on natural language inputs. 

Part of the ALTEN Group, CIeNET is a premier software services provider. One of the world’s leading online technology companies approached CIeNET to help enhance the ability of large language models (LLMs) to translate natural language queries into Structured Query Language (SQL) more precisely. To overcome the limitations of existing LLMs, CIeNET benchmarked their performance, identified errors, and refined the output using custom datasets.

Challenge: To improve the ability of LLMs to generate SQL queries that correctly answer natural language queries for a specified dataset

Solutions: LLMs that accurately translate natural language queries into SQL (natural language to SQL or NL2SQL)

Benefits:

  • Custom datasets for training and refining LLMs
  • Accurate SQL to answer natural language queries
  • Enhanced business reputation
  • Increased efficiency

The problems with data 

The generation of inaccurate SQL queries from natural language inputs can lead to serious issues, including delivering the wrong information to clients or stakeholders. Erroneous data can also have a negative effect on critical decision-making processes and can even lead to financial losses. Inconsistent or incorrect data can compromise the reliability of a database or lead to the unintentional disclosure of sensitive or confidential information. Furthermore, malformed SQL can crash database systems or result in non-compliance with regulatory requirements and, depending on the nature of the data involved, in legal complications.

Natural language to SQL  

CIeNET set out to analyze the performance and accuracy of LLMs and third-party services in generating SQL queries from natural language questions for a given dataset. The process began with designing and implementing an automated benchmarking system to assess the accuracy, performance, and quality of the SQL that the LLMs and third-party services generated in response to the questions. Test cases were created to evaluate the efficacy and quality of the generated SQL, with the results compared over time against other LLMs and authoring datasets. Model training and refinement were then carried out, involving the creation and review of natural language-to-SQL pairs to identify and correct erroneous datasets. Finally, custom database schemas were created and populated with data for use in the creation of new natural language-to-SQL pairs.
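
One common way to score generated SQL in a benchmark of this kind is execution accuracy: run the generated query and a reference query against the same database and compare the result sets. The sketch below illustrates that idea only and is not a description of CIeNET's system. The table, the questions, the `fake_model` lookup standing in for an LLM call, and the function names are all hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Run sql and return its rows as an order-insensitive multiset.

    Returns None when the SQL is malformed or fails to execute.
    """
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return Counter(rows)


def execution_accuracy(conn, cases, generate_sql):
    """Fraction of cases whose generated SQL returns the reference rows."""
    correct = 0
    for question, reference_sql in cases:
        expected = run_query(conn, reference_sql)
        actual = run_query(conn, generate_sql(question))
        if expected is not None and actual == expected:
            correct += 1
    return correct / len(cases)


# Hypothetical schema and data, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'acme', 120.0), (2, 'acme', 80.0), (3, 'globex', 50.0);
    """
)

# Natural language questions paired with hand-written reference SQL.
cases = [
    ("How many orders are there?",
     "SELECT COUNT(*) FROM orders"),
    ("What is the total amount per customer?",
     "SELECT customer, SUM(amount) FROM orders GROUP BY customer"),
    ("Which customers spent more than 150 in total?",
     "SELECT customer FROM orders GROUP BY customer HAVING SUM(amount) > 150"),
]

# Stand-in for an LLM: canned answers, the last one semantically wrong
# (it filters single orders instead of per-customer totals).
fake_model = {
    cases[0][0]: "SELECT COUNT(id) FROM orders",
    cases[1][0]: "SELECT customer, SUM(amount) FROM orders "
                 "GROUP BY customer ORDER BY customer DESC",
    cases[2][0]: "SELECT customer FROM orders WHERE amount > 150",
}

accuracy = execution_accuracy(conn, cases, fake_model.get)
print(f"execution accuracy: {accuracy:.2f}")
```

Comparing result sets rather than SQL text lets differently written but equivalent queries count as correct, while a query that runs cleanly but answers a different question is still caught. Row order is ignored here, which would need to change for questions that ask for a specific ordering.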

The tools 

The LLMs reviewed included Google Gemini, OpenAI ChatGPT, and Anthropic Claude 3. The database and data warehouse tools were Google BigQuery, Amazon Redshift, Databricks, Snowflake, MySQL, and PostgreSQL. CIeNET developed a custom system for benchmark execution, which it named Generative AI beNchmark System (GAINS). The team directed the prompt engineering, with a special focus on improved performance, and analyzed the benchmark results to identify issues, referencing various public datasets. Finally, they authored, reviewed, and corrected the NL-to-SQL datasets used to train and/or refine the LLMs, ensuring that the training was accurate and effective.

For more information, please visit AI and Machine Learning or contact us via marketing@cienet.com.