Ignite Creation Date: 2025-12-17 @ 1:42 PM
Ignite Modification Date: 2025-12-23 @ 4:53 PM
Study NCT ID: NCT04432961
Status: COMPLETED
Last Update Posted: 2021-07-28
First Post: 2020-06-15
Is Possible Gene Therapy: False
Has Adverse Events: False

Brief Title: Natural Language Processing (NLP) Analysis of Free Text Notes to Investigate Coronavirus (COVID-19)
Sponsor: Cambridge University Hospitals NHS Foundation Trust
Organization:

Study Overview

Official Title: A Database and Analytics Study of Free Text Clinical Notes and Structured Data to Investigate Phenotype Associations With Outcomes in Patients With COVID-19
Status: COMPLETED
Status Verified Date: 2021-07
Last Known Status: None
Delayed Posting: No
If Stopped, Why?: Not Stopped
Has Expanded Access: False
If Expanded Access, NCT#: N/A
Has Expanded Access, NCT# Status: N/A
Acronym: None
Brief Summary: A retrospective cohort study investigating clinical notes using Natural Language Processing in combination with structured data from the Electronic Health Record (EHR) to create a database for analytics to identify features associated with outcomes.
Detailed Description: Patients admitted to Cambridge University Hospitals (CUH) with COVID-19 have undergone routine clinical documentation and specific investigation and testing for COVID-19. The pathway for these patients ranges from supportive measures on the ward to deterioration requiring Intensive Therapy Unit (ITU) admission and ventilatory support. Patients are also at risk of developing complications such as acute kidney injury and thromboembolism. Identifying the risk factors for these and other outcomes, such as the requirement for ventilation, remains a challenge, and reviewing the clinical data for these patients is critical to understanding the relationship between patient characteristics and outcomes.

Data are available in structured fields in the EHR; however, these are sometimes incomplete and inaccurate. An assessment of the free text clinical notes provides an opportunity to fill in the gaps and provide a much richer dataset for evaluation. We plan to use Natural Language Processing (NLP), a field of machine learning that allows computers to analyse human language, to review the Discharge Summaries of patients admitted to hospital with COVID-19 and convert free text data into structured data for analysis.

The NLP techniques developed by Dr Collier's team include methods for coding free text to SNOMED CT and other biomedical ontologies. These methods, based on statistical machine learning from human-annotated texts, have been benchmarked on scientific texts and social media. In this project we intend to adapt these techniques to patient records, which will require a number of human-annotated patient records. The NLP output will be combined with structured data from the EHR and undergo statistical analysis to identify the rates of complications in patients with COVID-19 and the risk factors associated with them. This may help to guide management decisions by enabling earlier intervention to prevent poor outcomes in these patients.
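
The flow described above (free text converted to structured flags, combined with structured EHR fields, then summarised as complication rates) can be illustrated with a minimal Python sketch. This is not the study's method: the record describes statistical machine learning models that code text to SNOMED CT, whereas the sketch substitutes simple keyword matching so that it stays self-contained. The pattern list, field names, and function names are all hypothetical, and no real SNOMED CT codes are used.

```python
import re

# Hypothetical keyword patterns standing in for a trained concept-coding model.
# A real system would map mentions to SNOMED CT concepts and handle negation
# ("no evidence of AKI"), which this sketch does not attempt.
COMPLICATION_PATTERNS = {
    "acute_kidney_injury": re.compile(
        r"\b(acute kidney injury|aki)\b", re.IGNORECASE
    ),
    "thromboembolism": re.compile(
        r"\b(thromboembolism|pulmonary embolism|deep vein thrombosis)\b",
        re.IGNORECASE,
    ),
}


def extract_complications(note):
    """Return the set of complication labels mentioned in a free-text note."""
    return {
        label
        for label, pattern in COMPLICATION_PATTERNS.items()
        if pattern.search(note)
    }


def combine_records(structured, notes):
    """Merge structured fields with flags derived from free-text notes.

    structured: dict mapping patient ID -> dict of structured EHR fields
    notes:      dict mapping patient ID -> discharge summary text
    """
    rows = []
    for patient_id, fields in structured.items():
        row = dict(fields, patient_id=patient_id)
        found = extract_complications(notes.get(patient_id, ""))
        for label in COMPLICATION_PATTERNS:
            row[label] = label in found
        rows.append(row)
    return rows


def complication_rates(rows):
    """Fraction of patients flagged with each complication."""
    if not rows:
        return {}
    total = len(rows)
    return {
        label: sum(row[label] for row in rows) / total
        for label in COMPLICATION_PATTERNS
    }
```

Calling `combine_records` on a dictionary of structured fields and a dictionary of note texts yields one row per patient, which `complication_rates` then summarises. Keeping the text-derived flags in the same row as the structured fields is what would allow a later statistical analysis to relate patient characteristics to outcomes.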

Study Oversight

Has Oversight DMC: False
Is a FDA Regulated Drug?: False
Is a FDA Regulated Device?: False
Is an Unapproved Device?: None
Is a PPSD?: None
Is a US Export?: None
Is an FDA AA801 Violation?: