This project extracts, transforms, and analyzes FDA medical device recall data to uncover trends and patterns. Using data from the FDA Open API, it seeks an evidence-based understanding of the factors that contribute to recalls, their impact on patient safety, and the effectiveness of regulatory measures. The goal is to identify key factors and trends that support informed decision-making and broader awareness of recall dynamics.
The project follows a structured methodology, documented on GitHub, for data handling and analysis. The repository includes scripts that extract raw data from the FDA Open API, store it in Amazon S3, and transform it with Dockerized scripts. Apache Airflow orchestrates the workflow, and Tableau, connected to Amazon Redshift, handles visualization and reporting. Documentation, collaboration, and version control are managed through Git and GitHub, which keeps the approach to analyzing medical device recalls systematic and reproducible.
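The extraction step described above might look roughly like the following. This is a minimal sketch, not the repository's actual script: it assumes the openFDA device recall endpoint at `https://api.fda.gov/device/recall.json` with its `limit`, `skip`, and `search` query parameters, and the helper names (`build_recall_url`, `fetch_recall_page`, `count_by_field`) as well as the `root_cause_description` field used in the usage note are illustrative assumptions.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for device recall records on openFDA.
BASE_URL = "https://api.fda.gov/device/recall.json"


def build_recall_url(limit=100, skip=0, search=None):
    """Build a paginated query URL (hypothetical helper)."""
    params = {"limit": limit, "skip": skip}
    if search:
        params["search"] = search
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_recall_page(limit=100, skip=0, search=None, timeout=30):
    """Fetch one page and return its list of recall records.

    Requires network access; the records are expected under the
    top-level "results" key of the JSON response.
    """
    url = build_recall_url(limit, skip, search)
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp).get("results", [])


def count_by_field(records, field):
    """Tally records by one field, grouping missing values as 'Unknown'."""
    return Counter(r.get(field, "Unknown") for r in records)
```

A call such as `count_by_field(fetch_recall_page(limit=100), "root_cause_description")` would then give a quick tally of recall causes for one page. In the pipeline described here, the raw pages would be written to Amazon S3 before any transformation rather than tallied in memory.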
A presentation for the project is also available.