
Exploring Large Language Models in Clinical Research


Applying innovative AI methodologies, such as large language models, to complex problems is quickly gaining popularity worldwide across a wide variety of uses, and the clinical research domain holds vast potential. Large language models (LLMs) can support many tasks, including responding as a chatbot within an application or website, generating concise product descriptions, summarizing long publications, and analyzing large datasets.

But what are large language models?  

Large language models are advanced machine-learning models that can interpret and generate human-like text. To provide contextually relevant responses, they are trained on massive amounts of text. GPT-3.5, for example, was trained on a large and diverse set of sources ranging from research and academic papers to web articles. This extensive training is what makes the models so versatile. For example, if you uploaded a 30-page research paper to an LLM platform, you could ask for a 100-word summary, an explanation of the paper suitable for a 5-year-old, or even a count of how many times the letter “a” appears, to name just a few possibilities.

Once the definition of an LLM is understood, it quickly becomes apparent that the range of potential uses is broad, with clinical research being one industry ripe for application. Despite LLMs’ ability to assist with various aspects of clinical research, clinical researchers have concerns and hesitations about adopting them. We will explore how LLMs can support clinical research, the reasons behind the hesitation, and ways to consider using these models.


How can LLMs be utilized in clinical research?

Tools utilizing large language models have the potential to support a wide range of uses in clinical research, with the most beneficial ones streamlining complex tasks and saving organizations both time and money.

  • Analysis of Dark Data
    LLMs can assist with the processing and analysis of vast amounts of clinical data. Meaningful data that is difficult to access and analyze typically comes in the form of electronic health record documents and patient-reported outcome reports. LLMs can summarize information within these dark data sources, supporting cohort analysis, candidate identification, outcome trend analysis, and much more.
  • Clinical Decision Support
    LLMs can provide evidence-based recommendations and predictive insights, which can aid clinicians and researchers in predicting patient outcomes based on the available clinical data.
  • Patient Engagement and Support
    LLMs applied to chatbot technologies can offer personalized support and education to research candidates and participants. Offering individuals an accessible means to get their questions answered and receive detailed study information can enhance patient engagement and improve retention rates.


Clinical researchers are curious about adopting LLMs, but they also hesitate because of several valid concerns.

  • Data Privacy and Security
    Researchers are extremely careful about maintaining patient privacy when using clinical data. This caution can translate into concern about LLMs, since in certain cases it is difficult to ensure that an LLM ingests no personally identifiable information (PII). As an industry, this is one of our biggest daily concerns, and for good reason.
  • Output Reasoning
    LLMs often provide answers without clearly explaining how they arrived at their conclusions, although this can differ across models. In clinical research, where transparency of data and processes is crucial, this lack of reasoning could pose challenges to user adoption and trust.
  • Limited Domain Knowledge
    LLMs may lack specific domain knowledge, potentially leading to inaccurate responses. Researchers may need to validate results with domain experts to ensure accuracy and reliability, and to confirm that the models are trained with domain-specific content.

These concerns are not unwarranted, but these models are extremely unlikely to disappear anytime soon. In fact, their use is just getting started and is likely to grow rapidly. As with any new tool, learning how and when to use it correctly and compliantly can yield substantial benefits.


What are the current best practices for using large language models in clinical research?

Education and Training
You’re already there by reading this article!  Continue to familiarize yourself with large language models by learning more about their capabilities, limitations, and potential benefits. This will help dispel misconceptions and build more confidence in the use of LLM tools.

Data Privacy and Security Assurance
While LLMs may not be certified as HIPAA compliant, they can still be used in ways that are HIPAA compliant. Each organization should assess how to implement strong data privacy and security measures to safeguard patient information used by LLMs. Researchers should support HIPAA compliance by creating processes and procedures specific to the use of LLMs and by preventing any PII from reaching the models.
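
One way to picture keeping PII away from an LLM is a redaction step that runs before any text leaves the organization. The Python sketch below is only an illustration under stated assumptions: the function name redact_pii and the three patterns are hypothetical, they cover only email addresses, US-style Social Security numbers, and US-style phone numbers, and pattern matching of this kind is not on its own sufficient for HIPAA de-identification.

```python
import re

# Hypothetical, illustrative patterns only. Real de-identification needs far
# broader coverage (names, addresses, dates, record numbers, and so on).
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[ .-]\d{3}[ .-]\d{4}\b"),
}


def redact_pii(text):
    """Replace each pattern match with a [LABEL] placeholder.

    Returns the redacted text and a per-label count of replacements,
    which can be logged for audit purposes.
    """
    counts = {}
    for label, pattern in PII_PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        counts[label] = n
    return text, counts


if __name__ == "__main__":
    note = "Contact jane.doe@example.org or 555-123-4567. SSN 123-45-6789."
    clean, counts = redact_pii(note)
    print(clean)   # Contact [EMAIL] or [PHONE]. SSN [SSN].
    print(counts)
```

In practice, a step like this would sit alongside the organization's documented procedures and human review rather than replace them, since simple patterns will miss many forms of identifying information.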

Collaboration and Expert Validation
This is one of the more important things to note!  The outputs from large language models should be thoroughly reviewed for accuracy.  Encourage collaboration between LLM experts and clinical research experts to ensure the accuracy and reliability of LLM answers. This collaboration will help address concerns about limited domain knowledge and improve the trustworthiness of LLM-generated information.


Large language models have the potential to revolutionize clinical research. While there are valid concerns and hesitations, addressing these concerns through education, data privacy and security measures, collaboration, and ethical consideration can pave the way for clinical researchers to further adopt LLMs as a powerful tool within their toolbox. By embracing this innovative technology, researchers can enhance data analysis, protocol reviews, patient engagement, and much more, leading to improved outcomes in clinical research and enhanced patient care.

[The first draft of this article was written with the LLM, ChatGPT, and then rewritten by humans]

Originally published in the Society for Clinical Research Sites 2023 InSite Journal Vol. 2 pgs. 20-21
