Standard Terminology for Analytics and Data Science

Welcome to my terminology project!

Any software application that tries to make sense of medical information runs into an "impedance mismatch" between the format in which most medical information is entered (i.e. free text) and the format databases require (modeled, discrete data).

Most medical informatics data and analytics projects need two parts to solve this problem. The first is a way to represent medical concepts as data, and the second is a data model that relates these concepts to each other, to patients, to time, and so forth. Each project may have its own needs with respect to a data model, so many different models are created, almost on a per-project basis.

The representation of medical concepts, however, is ideally standardized. To represent the concept of "diabetes mellitus", for example, we can assign the unique concept ID 73211009, and every software project and data model that needs to find patterns related to the concept of "diabetes" can use that same concept ID. In fact, having every software project everywhere use the same concept ID for "diabetes" has tremendous advantages when the time comes to expand projects and combine data sets.
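To make the benefit concrete, here is a minimal Python sketch of two projects with different data models that can still be queried together because they share a concept ID. Only the ID 73211009 comes from the text above; the project names, patient identifiers, record layouts, the placeholder ID, and the function name are all made up for illustration.

```python
# Concept ID for "diabetes mellitus", as quoted in the text above.
DIABETES_MELLITUS = 73211009
# Made-up placeholder ID standing in for any other concept (not a real code).
PLACEHOLDER_OTHER = 999000001

# Hypothetical project A: a problem list stored as dictionaries.
project_a_problems = [
    {"patient": "A-001", "concept_id": DIABETES_MELLITUS},
    {"patient": "A-002", "concept_id": PLACEHOLDER_OTHER},
]

# Hypothetical project B: encounters stored as (patient, date, concept_id) tuples.
project_b_encounters = [
    ("B-117", "2020-03-02", DIABETES_MELLITUS),
    ("B-118", "2020-03-05", PLACEHOLDER_OTHER),
    ("B-117", "2021-01-15", DIABETES_MELLITUS),
]


def patients_with_concept(concept_id):
    """Return the sorted, de-duplicated patients recorded with a concept in either project."""
    from_a = {row["patient"] for row in project_a_problems
              if row["concept_id"] == concept_id}
    from_b = {patient for patient, _date, cid in project_b_encounters
              if cid == concept_id}
    return sorted(from_a | from_b)


print(patients_with_concept(DIABETES_MELLITUS))  # ['A-001', 'B-117']
```

The two data models differ in shape, yet no mapping table is needed to combine them, because both refer to the concept by the same ID.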

Medical informaticists working on these problems soon discover that a huge amount of the necessary foundation work has already been done. For decades, medical experts have been creating standard terminology systems to represent medical concepts in consistent, data-ready ways. These systems have names like SNOMED, LOINC, and RxNorm, and they are kept up to date and consistent by groups of dedicated professionals. They are used in academia and in the practice of medicine. They find application in clinical support, billing, research, machine learning, guidelines, and many other areas.

In my long career in informatics I have never been satisfied with the available tools for searching these terminology systems. There were plenty of tools, and all have their merits, but as a typical programmer who often needed to search terminology systems for my work, I always wanted something better. So of course I made my own tools. I have used them for years, in several forms, for my own work.

Recently I put them up on a public web server to help out a colleague, and she reacted so positively to the tools, and to how they compared with what she had been using, that I opened them up to others at work. The response was so positive that I decided to open them up to the public. This web site has two parts: static web pages and an application. The static pages contain the codes from the major systems that I have used in my career, and they are navigable by hyperlink.

The static pages also contain links called "tree" and "search" that take you to the application (dubbed The Taxonator by a friend and colleague), which lets you search the terminologies in a very polished manner and even compile value sets from your search. I hope you enjoy using it.

Please let me know what you think.

Go straight to The Taxonator!