Post-marketing surveillance of FDA-regulated products is critical for identifying potential adverse events (AEs) in the real-world population. An infrastructure that leverages near-real-time data to facilitate early signal detection may aid FDA’s mission of continually assessing products’ risk profiles.
Our goal is to establish an active surveillance system that collects, annotates, standardises, evaluates and presents data from public open sources, using advanced data science technology to enhance FDA’s ability to detect safety signals early.
We propose to use an artificial intelligence (AI)-based approach to detect early safety signals from social media sites, such as Reddit, Twitter, or WebMD, as AI techniques excel at extracting meaningful patterns from large volumes of ambiguous data. To augment the AI-based detection system, signals detected from social media data can be evaluated in the context of other data sources, such as the FDA reporting systems FAERS and VAERS.
We will establish an infrastructure to mine social media data for safety signals in near real time through the following steps:
– Select social media sites and collect data where potential AEs are reported
– Extract language fragments from the sample data using natural language processing techniques
– Annotate key words from the sample data with standardised AE terminologies, supplemented with key words from FAERS and VAERS
– Expand the initial list into an extensive list of AE key words by applying word embedding techniques to a large sample of social media data
– Build a supervised machine learning (ML) model for determining potential safety signals
– Aggregate and present the model results in a dynamic, interpretable dashboard with geographic and demographic information
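As an illustration of the key word expansion step, the sketch below grows a seed list of AE terms by adding vocabulary words whose embedding is close to a seed term under cosine similarity. It is a minimal sketch under stated assumptions, not the system's implementation: the three-dimensional vectors are made-up toy values standing in for embeddings learned from a large social media sample, and the names `EMBEDDINGS`, `cosine`, `expand_keywords` and the 0.9 threshold are hypothetical choices made for this example.

```python
import math

# Hypothetical toy embeddings (made-up values for illustration only).
# In practice these vectors would be learned from a large sample of
# social media posts, for example with a word2vec-style model.
EMBEDDINGS = {
    "headache": [0.9, 0.1, 0.0],
    "migraine": [0.8, 0.2, 0.1],
    "nausea": [0.1, 0.9, 0.0],
    "queasy": [0.2, 0.8, 0.1],
    "pizza": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_keywords(seeds, embeddings, threshold=0.9):
    """Return the seeds plus every vocabulary word whose similarity
    to at least one seed meets the (assumed) threshold."""
    known_seeds = [s for s in seeds if s in embeddings]
    expanded = set(seeds)
    for word, vec in embeddings.items():
        if word in expanded:
            continue
        if any(cosine(vec, embeddings[s]) >= threshold for s in known_seeds):
            expanded.add(word)
    return sorted(expanded)


# Seed terms as might come from the annotated sample, FAERS and VAERS.
expanded = expand_keywords(["headache", "nausea"], EMBEDDINGS)
print(expanded)
```

With these toy vectors, the informal terms "migraine" and "queasy" join the seed list while the unrelated word "pizza" is left out. In a real deployment the threshold would need to be tuned, and the expanded terms reviewed, before they feed the supervised model.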
This infrastructure will likely complement FDA’s current surveillance networks, enhancing the early detection of safety signals warranting further investigation and systematic examination.