cTAKES can detect many surface forms for each "identified annotation", and it can be trained, with further dictionary development, to handle more acronyms, more idiosyncratic speech, and so on, but it is not perfect. For example, it is currently not very good at detecting temporal context, that is, at distinguishing medical history from current observation or from family history. It is better, though still not perfect, at detecting negated concepts, which can occur in many different forms depending on the linguistic patterns of the specific physicians whose notes you are reading. What makes clinical notes particularly tricky as an NLP task is that physicians are rushed: they abbreviate, they misspell, they write staccato phrases instead of sentences, etc. It is not like parsing well-formed published text.
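To make the negation point concrete, here is a minimal sketch of a trigger-word negation check. It is not cTAKES code and makes no claim about how the cTAKES negation annotator works; the trigger list, the `MAX_GAP_WORDS` window, and the function name `is_negated` are all assumptions chosen for illustration.

```python
import re

# Hypothetical, deliberately tiny trigger list. Real notes need far more,
# which is exactly why negation detection is hard in practice.
NEGATION_TRIGGERS = ("no evidence of", "negative for", "denies", "without", "no")

# Assumed window: how many words may sit between a trigger and the concept.
MAX_GAP_WORDS = 4


def is_negated(sentence: str, concept: str) -> bool:
    """Return True if a known trigger precedes the concept within the window."""
    triggers = "|".join(re.escape(t) for t in NEGATION_TRIGGERS)
    pattern = (
        rf"\b(?:{triggers})\b"                 # a negation trigger
        rf"(?:\s+\w+){{0,{MAX_GAP_WORDS}}}?"   # up to MAX_GAP_WORDS words in between
        rf"\s+{re.escape(concept)}\b"          # the concept mention itself
    )
    return re.search(pattern, sentence, flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    print(is_negated("Pt denies chest pain or SOB", "chest pain"))  # True
    print(is_negated("No hx of diabetes", "diabetes"))              # True
    print(is_negated("Chest pain since yesterday", "chest pain"))   # False
    # The shorthand "w/o" is not in the trigger list, so this negation is missed.
    print(is_negated("pt w/o fever", "fever"))                      # False
```

The last call shows the failure mode described above: one physician's shorthand that is absent from the trigger list is enough to turn a negated finding into an apparent positive.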
I have not tried all the available annotators, so you may want to experiment and see what works best for you. I hope you were joking about "a couple of algorithms". Prediction is one of the problems that has been addressed by thousands of highly trained experts in diagnostics and clinical informatics. I have found interesting work that was done years ago by some people working on inference engines using deontic logic. Prediction is only partially an information handling problem -- it is also a capture problem. It is something that only highly trained observers can get right part of the time, and through observations that do not always become part of the clinical record. If you want to know more about what I'm talking about, try reading "Cutting for Stone". It is written by one of the world's most distinguished diagnosticians, Abraham Verghese, who now teaches at Stanford University.

Peter

On Wed, May 8, 2019 at 2:10 PM Hari, Sekhar <sekhar.h...@cgi.com> wrote:

> Thanks Peter for your insights. Agree, this kind of predictions will need
> a couple of algorithms to be trained and work together to get to level of
> acceptable accuracy. I'm familiar with the RXNORM and SNOMED contents; but
> will dig deeper.
>
> Do you know if cTAKES can identify events such as "cardiac arrest",
> "diabetes" and "pre-term birth"? Likely these are mentioned with different
> text representations in the clinical notes.
>
> Thanks
> Sekhar Hari | AI Program Lead | Health Sciences R&D | Asia Pacific
> Solutions Delivery Center
> +91 814 7027 779 (C)
>
> -----Original Message-----
> From: Peter Abramowitsch <pabramowit...@gmail.com>
> Sent: Wednesday, May 8, 2019 2:50 PM
> To: dev@ctakes.apache.org
> Subject: Re: Reading clinical notes for specific predictions
>
> Hi Sekhar
>
> The predictions item in your list of objectives is very tricky and cTakes,
> or indeed any software system will only get you part of the way there. CDS
> (clinical decision support) researchers have been on this path for many
> years and it is clear that even an hybrid human/computational system is
> limited in its accuracy & predictive ability. And with medicine, a miss is
> as good as a mile - as the saying goes.
>
> As to your vocabularies question - if you don't already know the SNOMED
> clinical ontology, and RxNorm resources I suggest you have a look. cTakes
> can fish out the appropriate CUIs and SNOMED term ids, and the ontologies
> will help you draw the lateral links through common parents - or in your
> specific example, therapeutic classes.
>
> - Peter
>
> On Tue, May 7, 2019 at 6:47 PM Hari, Sekhar <sekhar.h...@cgi.com> wrote:
>
> > Hi there -
> >
> > I'm trying to predict a few things from clinical notes as follows:
> >
> > 1. Look at the notes and discharge summaries, and predict the
> > re-admissions data, cardiac arrests, diabetes, and pre-term birth.
> >
> > 2. Understand the vocabulary of doctors and pharmacies. For example,
> > recognize that Tylenol and Acetaminophen refer to the same item. Have
> > a good understanding of body parts and diseases. The vocabulary is
> > domain-specific.
> >
> > 3. The data is loaded from Cerner and EPIC.
> >
> > Can somebody help with suggestions on the list of pipelines that can
> > be used to achieve (1) and (2) above? Should I also develop a
> > machine-learning model along with cTAKES to get the desired results?
> >
> > Thanks
> > Sekhar Hari | AI Program Lead | Health Sciences R&D | Asia Pacific
> > Solutions Delivery Center
> > +91 814 7027 779 (C)
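
As a rough illustration of the vocabulary point in this thread (Tylenol and acetaminophen resolving to one concept, and ontologies linking drugs through common parents), here is a small sketch over toy data. Everything in it is an assumption made for illustration: the identifiers are invented placeholders, not real CUIs or SNOMED ids, the hierarchy is heavily simplified, and nothing here uses the cTAKES, UMLS, or RxNorm APIs.

```python
# Toy synonym table: surface form -> concept id.
# The ids are made-up placeholders, NOT real CUIs or SNOMED term ids.
SYNONYM_TO_CONCEPT = {
    "tylenol": "X_ACETAMINOPHEN",
    "acetaminophen": "X_ACETAMINOPHEN",
    "paracetamol": "X_ACETAMINOPHEN",
    "advil": "X_IBUPROFEN",
    "ibuprofen": "X_IBUPROFEN",
}

# Toy is-a hierarchy: concept id -> direct parents (simplified for illustration).
PARENTS = {
    "X_ACETAMINOPHEN": {"X_ANALGESIC"},
    "X_IBUPROFEN": {"X_NSAID"},
    "X_NSAID": {"X_ANALGESIC"},
    "X_ANALGESIC": set(),
}


def normalize(term):
    """Map a surface form to its concept id, or None if it is unknown."""
    return SYNONYM_TO_CONCEPT.get(term.strip().lower())


def ancestors(concept):
    """Collect every ancestor of a concept by walking the parent links."""
    seen = set()
    stack = [concept]
    while stack:
        for parent in PARENTS.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def common_ancestors(a, b):
    """Ancestors shared by two concepts, e.g. a shared therapeutic class."""
    return ancestors(a) & ancestors(b)


if __name__ == "__main__":
    print(normalize("Tylenol") == normalize("Acetaminophen"))  # True
    print(common_ancestors(normalize("Tylenol"), normalize("Advil")))  # {'X_ANALGESIC'}
```

In a real pipeline the synonym table and the parent links would come from the terminology resources themselves rather than hand-written dictionaries; the sketch only shows the shape of the lookup.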