Doctors study and encounter thousands of symptoms and thousands of illnesses in practice. However, even the best doctors cannot know every symptom or every illness. Often, they rely on a caricature of the illness: chicken pox is an outbreak of itchy spots all over the body, and shingles presents as a red, textured rash with blisters around the torso. However, they do not usually know the symptoms leading up to the full breakout, and in many cases they cannot offer a diagnosis until an obvious symptom appears. A headache, light sensitivity, itching, and tingling may appear days or weeks ahead of the rash.
Instead of playing a passive role, patients could be empowered to help identify and diagnose their problems. Patients know their symptoms, but they need an effective resource that helps put those symptoms together and leads to actionable information.
What it does
Symptomatic gives you the tools to diagnose yourself accurately and without undue fear that you’ve come down with an incredibly rare and untreatable form of stomach cancer…because there is a 99% chance that you have a problem that is far easier to treat.
Symptomatic focuses on your symptoms and the most likely diagnosis for your illness, so you can go to your doctor better informed and take charge of your care.
Symptomatic. Power to the patient.
How it was built
Python, BeautifulSoup, and Pandas were used to mine data from clinical websites such as WebMD. Symptom, disease, and treatment data are naturally interconnected, so a graph database is an intuitive way to store and analyze them. The Neo4J graph database was therefore used to store the data. Additional graph-based algorithms were designed and implemented for further features, such as showing how one disease could evolve into another when certain symptoms start to appear, along with the associated probability.
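As a rough illustration of why a graph fits this data, the sketch below stores disease-to-symptom links and ranks diseases by how many of their symptoms a patient reports. It uses a plain Python dictionary in place of Neo4J, the sample edges are made up for illustration (not clinical data), and the names `EDGES`, `build_graph`, and `rank_diseases` are hypothetical; the scoring rule is an assumption, not the project's actual algorithm.

```python
from collections import defaultdict

# Hypothetical sample edges (disease, symptom); illustrative only, not clinical data.
EDGES = [
    ("chickenpox", "itchy rash"),
    ("chickenpox", "fever"),
    ("chickenpox", "headache"),
    ("shingles", "headache"),
    ("shingles", "light sensitivity"),
    ("shingles", "tingling"),
    ("shingles", "blistering rash"),
]


def build_graph(edges):
    """Map each disease to the set of symptoms linked to it."""
    graph = defaultdict(set)
    for disease, symptom in edges:
        graph[disease].add(symptom)
    return dict(graph)


def rank_diseases(graph, observed):
    """Score each disease by the fraction of its symptoms that were observed."""
    observed = set(observed)
    scores = {
        disease: len(symptoms & observed) / len(symptoms)
        for disease, symptoms in graph.items()
    }
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


graph = build_graph(EDGES)
ranking = rank_diseases(graph, ["headache", "tingling", "light sensitivity"])
for disease, score in ranking:
    print(f"{disease}: {score:.2f}")
```

With these sample edges, the early symptoms from the shingles example (headache, tingling, light sensitivity) match three of the four shingles symptoms but only one of the three chickenpox symptoms, so shingles ranks first.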
A massive amount of clinical data is available online. Finding trustworthy sources among all the available options is a challenge, since the information has to be accurate or people’s lives could be negatively impacted. Designing a graph-based algorithm that infers the probability of a future graph state from its pre-existing connections involved a long process of research and testing, and it was the greatest challenge for this application.
The right kind of data structure (a graph) was used to store the data. Most of the data acquisition was done by the Python program. Finding a good third-party graph database (Neo4J) significantly reduced the workload.
Neo4J is definitely an amazing development. It supports I/O to many Python network analysis libraries such as NetworkX. The query language for Neo4J (Cypher) was learned during the data acquisition phase.
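The write-up does not describe the inference algorithm itself. Purely as an illustration of the general idea, the sketch below swaps in a much simpler technique: first-order transition counts over ordered symptom histories, which estimate how likely one symptom is to be followed by another. The histories are invented, and `HISTORIES` and `transition_probabilities` are hypothetical names.

```python
from collections import Counter, defaultdict

# Hypothetical symptom progressions, ordered by onset; not real clinical data.
HISTORIES = [
    ["headache", "tingling", "rash"],
    ["headache", "tingling", "rash"],
    ["headache", "fever", "rash"],
    ["tingling", "rash"],
]


def transition_probabilities(histories):
    """Estimate P(next symptom | current symptom) from observed sequences."""
    counts = defaultdict(Counter)
    for history in histories:
        # Pair each symptom with the one that directly follows it.
        for current, following in zip(history, history[1:]):
            counts[current][following] += 1
    return {
        current: {
            nxt: n / sum(followers.values()) for nxt, n in followers.items()
        }
        for current, followers in counts.items()
    }


probs = transition_probabilities(HISTORIES)
print(probs["headache"])
```

In this toy data, a headache is followed by tingling in two of three histories, so the estimate for that transition is 2/3. A real system would need far more data and a model that accounts for the wider graph structure.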
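For a sense of what Cypher looks like in this setting, here is a minimal sketch that prepares parameterized `MERGE` statements for symptom-disease pairs. The node labels (`Symptom`, `Disease`), the relationship type (`INDICATES`), and the helper `edge_statements` are assumptions made for illustration; the project's actual schema is not given in this write-up.

```python
# Hypothetical schema: Symptom and Disease nodes joined by INDICATES.
# MERGE creates a node or relationship only if it does not already exist.
MERGE_EDGE = (
    "MERGE (s:Symptom {name: $symptom}) "
    "MERGE (d:Disease {name: $disease}) "
    "MERGE (s)-[:INDICATES]->(d)"
)


def edge_statements(pairs):
    """Pair the Cypher template with one parameter dict per (symptom, disease)."""
    return [
        (MERGE_EDGE, {"symptom": symptom, "disease": disease})
        for symptom, disease in pairs
    ]


statements = edge_statements([("tingling", "shingles"), ("fever", "chickenpox")])
for query, params in statements:
    print(query, params)
```

Each `(query, params)` pair could then be handed to a Neo4J client; with the official Python driver that would typically be `session.run(query, params)`. Passing values as parameters rather than formatting them into the query string avoids quoting problems in scraped text.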
What’s next for Symptomatic
More advanced graph-based algorithms will be designed and implemented so that users will not only know what disease they have, but also how their disease might develop when certain symptoms start to appear.
Personal Highlights and Accomplishments
For this project, I am especially proud of several significant accomplishments, which include:
- I conducted a comprehensive literature search on patient self-diagnosis tools and symptoms of common diseases.
- I performed market and competitor analysis in patient self-diagnosis tools and developed a business plan for market entry.
- I implemented parallel Python programs to mine clinical data from the Internet efficiently.
- I adopted solutions to store data efficiently in graph format and learned Cypher, a query language for graph databases.
- I conducted research on inferring the probability of a future dynamic graph state using historical graph data and significant temporal pattern changes.
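The parallel data mining mentioned in the list above is not described in detail, so the following is only one plausible shape for it: a thread pool that fetches several pages concurrently. The `fetch_page` function is a stand-in that makes no network request, the URLs are placeholders, and `fetch_all` is a hypothetical helper name.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_page(url):
    """Stand-in for a real HTTP request; returns fake HTML for the URL."""
    return f"<html><title>{url}</title></html>"


def fetch_all(urls, max_workers=4):
    """Fetch pages concurrently; results keep the order of `urls`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_page, urls))


# Hypothetical placeholder URLs.
urls = [f"https://example.com/condition/{i}" for i in range(5)]
pages = fetch_all(urls)
print(len(pages))
```

Threads suit this kind of work because scraping is mostly spent waiting on the network. In a real scraper, `fetch_page` would issue the request and hand the response to a parser such as BeautifulSoup, ideally with rate limiting to respect the source site.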