Peter,

You'll like this example, then, from a manuscript we submitted to MedInfo:

"It is important to point out that while some system annotation types
scored very well under the geometric mean method for identifying
best-at-task annotation systems, on examination there were some
misclassifications, since our method was unable to provide lexical
disambiguation of terms. An example was the entity Speed of Vehicle,
where the system cTAKES performed very well with the MedicationMention
annotation type. On further examination, the terms that provided a
match were “speed” and “mph,” which have contextual meanings quite
different from those of physical measurement of velocity: in this case,
“speed” and “mph” are common street drugs..."

Greg--
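To make the geometric-mean scoring concrete: a minimal sketch of ranking
systems per annotation task by the geometric mean of their per-metric
scores. The systems, metrics, and numbers here are invented for
illustration; the manuscript's actual evaluation may differ.

    # Hypothetical illustration of geometric-mean ranking: the systems,
    # metrics, and scores below are invented, not the manuscript's data.
    from statistics import geometric_mean

    # per-system scores for one entity/task, e.g. (precision, recall, F1)
    scores = {
        "cTAKES":     [0.91, 0.72, 0.80],
        "MetaMap":    [0.85, 0.78, 0.81],
        "BioMedICUS": [0.88, 0.70, 0.78],
    }

    # the geometric mean rewards balanced performance: one near-zero
    # metric drags the product down far more than an arithmetic mean would
    for system, vals in sorted(scores.items(),
                               key=lambda kv: geometric_mean(kv[1]),
                               reverse=True):
        print(f"{system}\t{geometric_mean(vals):.3f}")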
On Sat, May 18, 2019 at 3:12 AM Peter Abramowitsch <[email protected]> wrote:

> Greg,
>
> Thanks for these links. I really enjoy discussions of this kind and am
> glad to see that someone is trying these knowledge-based approaches and
> reporting back. I've played with the WordNet APIs and believe that it
> is possible to use the hypernym/hyponym constructs to help score
> different interpretations of ambiguous terms. Additionally, I think
> n-gram fitting can be used to help rate the relevance of one definition
> over another. But I'd bet that the effectiveness of these approaches is
> highly dependent on grammatically complete and correct text. Clinical
> notes are another thing.
>
> I had a perfect example of this problem the other day: a note stating
> something like "nursing care resumed after 12pm". cTAKES had tagged
> this with both lactation-related and nursing-service-related CUIs. But
> the patient was an elderly man. Clearly the context was not to be found
> in the grammar but in the clinical setting. Thus there is a kind of
> meta context (patient's age, gender, disease state) that could also
> contribute to disambiguation. This could be achieved by ML methods
> trained on marked-up notes (very labor intensive), or by some kind of
> rules mechanism, but that would also be labor intensive - a
> never-to-be-finished effort. These might require the creation of an
> instant/lightweight VMR to structure the contextual elements from the
> note that the scoring mechanism would reason over. But I'd prefer a
> Campari and soda.
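A minimal sketch of the WordNet idea Peter describes above: score each
sense of an ambiguous term by lexical overlap between its gloss (and its
hypernyms' glosses and lemmas) and context words drawn from the note,
including his "meta context" (age, gender). The scoring scheme, context
words, and function name are assumptions, not an existing cTAKES
component.

    from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet") once

    def sense_scores(term, context_words):
        """Rank WordNet senses of `term` by overlap with context words."""
        context = {w.lower() for w in context_words}
        ranked = []
        for sense in wn.synsets(term):
            # pool the sense's gloss with its hypernyms' glosses and lemmas
            pool = set(sense.definition().lower().split())
            for hyper in sense.closure(lambda s: s.hypernyms(), depth=2):
                pool.update(hyper.definition().lower().split())
                pool.update(w for name in hyper.lemma_names()
                              for w in name.lower().split("_"))
            ranked.append((sense.name(), len(pool & context)))
        return sorted(ranked, key=lambda r: r[1], reverse=True)

    # context from the note plus the "meta context": elderly male patient
    note_context = ["care", "resumed", "patient", "elderly", "male",
                    "hospital", "service"]
    print(sense_scores("nursing", note_context))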
> On Sat, May 18, 2019 at 3:24 AM Greg Silverman <[email protected]> wrote:
>
> > https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3111590/
> >
> > On Fri, May 17, 2019 at 8:23 PM Greg Silverman <[email protected]> wrote:
> >
> > > Yes, and regarding your last paragraph: this is where disambiguation
> > > comes into play. Here is one method:
> > >
> > > https://www.cs.cmu.edu/afs/cs/project/jair/pub/volume23/montoyo05a-html/node9.html
> > >
> > > I'm not sure how either MetaMap or BioMedICUS do disambiguation,
> > > but since they are both open source, they would be potential
> > > resources.
> > >
> > > Greg--
> > >
> > > On Fri, May 17, 2019 at 2:17 AM Peter Abramowitsch <[email protected]> wrote:
> > >
> > > > Seems like some kind of simple heuristic should work: isn't it
> > > > just a case of looking at the in/out text offsets of the source
> > > > text for an identified annotation and then comparing that with
> > > > the canonical text of the CUI or SNOMED ID? If the source text is
> > > > just a few characters (say, fewer than 5) and the Levenshtein
> > > > distance between it and the canonical text is greater than the
> > > > length of the source text, you're pretty sure to have an acronym.
> > > >
> > > > For instance, if cTAKES finds "MI" and assigns SNOMED 22298006 or
> > > > CUI C0027051 with canonical text "Myocardial Infarction", then
> > > > with the in/out offsets into the text you should be able to run
> > > > this heuristic.
> > > >
> > > > The problem (and I see this in my work) is that many acronyms
> > > > have multiple meanings. Thus, you may accurately be able to tell
> > > > that your identified concept came from an acronym, but it was the
> > > > wrong concept!!
> > > >
> > > > Peter
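Peter's heuristic translates almost directly into code. A sketch, with
the length threshold and function names as assumptions:

    # Minimal sketch of the heuristic above: flag a discovered concept
    # as acronym-derived when its covered text is short and lexically
    # far from the concept's canonical (preferred) text. Thresholds and
    # helper names here are assumptions, not a cTAKES API.

    def levenshtein(a: str, b: str) -> int:
        """Classic dynamic-programming edit distance (case-insensitive)."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,      # deletion
                                curr[j - 1] + 1,  # insertion
                                prev[j - 1] + (ca.lower() != cb.lower())))
            prev = curr
        return prev[-1]

    def looks_like_acronym(covered_text: str, canonical_text: str,
                           max_len: int = 5) -> bool:
        return (len(covered_text) < max_len
                and levenshtein(covered_text, canonical_text) > len(covered_text))

    # e.g. "MI" annotated with CUI C0027051 / SNOMED 22298006
    print(looks_like_acronym("MI", "Myocardial Infarction"))  # True
    print(looks_like_acronym("pain", "Pain"))                 # False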
> > > > On Thu, May 16, 2019 at 4:31 AM Greg Silverman <[email protected]> wrote:
> > > >
> > > > > Got it!
> > > > >
> > > > > Yes, I understand the formidability, given the need for
> > > > > disambiguation, etc. Was just curious if this existed.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > On Wed, May 15, 2019 at 9:11 PM Finan, Sean <[email protected]> wrote:
> > > > >
> > > > > > Hi Greg,
> > > > > >
> > > > > > OK, that gives me a great vector toward addressing your needs.
> > > > > >
> > > > > > I don't know of any cTAKES components that indicate whether
> > > > > > or not discovered concepts come from acronyms, abbreviations
> > > > > > or -replete- text mentions.
> > > > > >
> > > > > > There should be something that does that. Open source ---->
> > > > > > any champions available?
> > > > > >
> > > > > > Right now no abbreviation or metonym information is provided
> > > > > > in the standard components. If it can be extruded from source
> > > > > > then it should be provided.
> > > > > >
> > > > > > If anybody has such a component, please let us know! This is
> > > > > > a formidable (imo) NLP problem, so claim your kudos with a
> > > > > > solution!
> > > > > >
> > > > > > Sean
> > > > > >
> > > > > > ________________________________________
> > > > > > From: Greg Silverman <[email protected]>
> > > > > > Sent: Wednesday, May 15, 2019 9:21 PM
> > > > > > To: [email protected]
> > > > > > Subject: Re: acronyms/abbreviations [EXTERNAL]
> > > > > >
> > > > > > I'm just wondering how acronyms are identified as acronyms in
> > > > > > cTAKES (for example, in MetaMap, there is an attribute in the
> > > > > > Document annotation with ids of where they are in the
> > > > > > Utterance annotation; and in BioMedICUS, there is an acronym
> > > > > > annotation type, etc.). From examining the XMI CAS, it is not
> > > > > > obvious.
> > > > > >
> > > > > > We're extracting the desired annotations from the XMI CAS
> > > > > > using a custom Groovy client.
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > On Wed, May 15, 2019 at 7:43 PM Finan, Sean <[email protected]> wrote:
> > > > > >
> > > > > > > Hi Greg,
> > > > > > >
> > > > > > > What exactly do you need?
> > > > > > >
> > > > > > > There are a lot of output components that can produce
> > > > > > > different formats containing various types of information.
> > > > > > >
> > > > > > > Do you prefer to parse XML? Or is columnized text output
> > > > > > > OK? Does this go to a post-processing engine or a human
> > > > > > > user?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Sean
> > > > > > > ________________________________________
> > > > > > > From: Greg Silverman <[email protected]>
> > > > > > > Sent: Wednesday, May 15, 2019 7:09 PM
> > > > > > > To: [email protected]
> > > > > > > Subject: acronyms/abbreviations [EXTERNAL]
> > > > > > >
> > > > > > > How can I get these from the XMI annotations?
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > Greg--

--
Greg M. Silverman
Senior Systems Developer
NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
University of Minnesota
[email protected]

› evaluate-it.org ‹
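For anyone reproducing Greg's XMI extraction without the Groovy client:
a rough Python sketch that walks a UIMA XMI CAS and prints each
annotation's type and covered text. The Sofa/sofaString attribute names
follow UIMA's standard XMI serialization from memory; exact element
names can vary by pipeline, so treat this as a sketch, not a supported
cTAKES client.

    # Walk a UIMA XMI CAS: UIMA serializes annotations as XML elements
    # whose namespace encodes the type's package; the begin/end
    # attributes index into the Sofa (document) text.
    import xml.etree.ElementTree as ET

    def dump_annotations(xmi_path: str):
        root = ET.parse(xmi_path).getroot()

        # find the document text: the cas:Sofa element's sofaString attribute
        sofa_text = ""
        for elem in root.iter():
            if elem.tag.endswith("}Sofa"):
                sofa_text = elem.get("sofaString", "")

        for elem in root.iter():
            begin, end = elem.get("begin"), elem.get("end")
            if begin is None or end is None:
                continue
            type_name = elem.tag.split("}")[-1]  # e.g. MedicationMention
            covered = sofa_text[int(begin):int(end)]
            print(f"{type_name}\t{begin}-{end}\t{covered}")

    dump_annotations("note_0001.xmi")  # hypothetical file name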
