OMG, I hadn't even thought of "ephemeral vocabulary". Great example!
Peter

On Sun, May 19, 2019 at 6:05 PM Greg Silverman <[email protected]> wrote:

> Peter,
> You'll like this example then from a manuscript we submitted to MedInfo:
> "It is important to point out that while some system annotation types
> scored really well using the geometric mean method to identify best-at-task
> annotation systems, on examination, since our method was unable to provide
> lexical disambiguation of terms, there were some misclassifications. An
> example was for the entity Speed of Vehicle where the system cTAKES perform
> very well with the MedicationsMention annotation type. On further
> examination, the terms that provided a match were “speed” and “mph,” which
> have different contextual meanings from those having to do with physical
> measurement with respect to velocity. In this case, “speed” and “mph” are
> common street drugs..."
>
> Greg--
>
>
> On Sat, May 18, 2019 at 3:12 AM Peter Abramowitsch <[email protected]>
> wrote:
>
> > Greg, Thanks for these links. I really enjoy discussions of this kind
> > and am glad to see that someone is trying these knowledge-based
> > approaches and reporting back. I've played with the WordNet APIs and
> > believe that it is possible to use the hyper/hypo-nym constructs to help
> > score different interpretations of ambiguous terms. Additionally, I
> > think n-gram fitting can be used to help rate the relevance of one
> > definition over another. But I'd bet that the effectiveness of these
> > approaches is highly dependent on grammatically complete and correct
> > text. Clinical notes are another thing.
> >
> > I had a perfect example of this problem the other day. A note stated
> > something like "nursing care resumed after 12pm". cTAKES had tagged this
> > with both lactation-related and nursing-service-related CUIs. But the
> > patient was an elderly man.
> > Clearly the context was not to be found in the grammar but in the
> > clinical setting. Thus there is a kind of meta-context (patient's age,
> > gender, disease state) that could also contribute to disambiguation.
> > This could be achieved by ML methods trained on marked-up notes (very
> > labor intensive), or by some kind of rules mechanism, but that would
> > also be labor intensive: a never-to-be-finished effort. These might
> > require the creation of an instant/lightweight VMR to structure the
> > contextual elements from the note that the scoring mechanism would
> > reason over. But I'd prefer a Campari and soda.
> >
> > On Sat, May 18, 2019 at 3:24 AM Greg Silverman <[email protected]> wrote:
> >
> > > https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3111590/
> > >
> > > On Fri, May 17, 2019 at 8:23 PM Greg Silverman <[email protected]> wrote:
> > >
> > > > Yes, and regarding your last paragraph: This is where
> > > > disambiguation comes into play. Here is one method:
> > > >
> > > > https://www.cs.cmu.edu/afs/cs/project/jair/pub/volume23/montoyo05a-html/node9.html
> > > >
> > > > I'm not sure how either MetaMap or BioMedICUS do disambiguation,
> > > > but since both are open source, they would be potential resources.
> > > >
> > > > Greg--
> > > >
> > > > On Fri, May 17, 2019 at 2:17 AM Peter Abramowitsch <[email protected]> wrote:
> > > >
> > > >> Seems like some kind of simple heuristic should work: Isn't it
> > > >> just a case of looking at the in/out text offsets of the source
> > > >> text for an identified annotation and then comparing that with the
> > > >> canonical text of the CUI or SnomedID? If the source text is just a
> > > >> few characters (say less than 5) and the Levenshtein distance
> > > >> between it and the canonical text is greater than the length of the
> > > >> source text, you're pretty sure to have an acronym.
> > > >>
> > > >> For instance, if cTAKES finds "MI" and assigns SNOMED 22298006 or
> > > >> CUI C0027051 with canonical text "Myocardial Infarction", then with
> > > >> the in/out offsets into the text you should be able to run this
> > > >> heuristic.
> > > >>
> > > >> The problem (and I see this in my work) is that many acronyms have
> > > >> multiple meanings. Thus, you may accurately be able to tell that
> > > >> your identified concept came from an acronym, but it was the wrong
> > > >> concept!!
> > > >>
> > > >> Peter
> > > >>
> > > >> On Thu, May 16, 2019 at 4:31 AM Greg Silverman <[email protected]> wrote:
> > > >>
> > > >> > Got it!
> > > >> >
> > > >> > Yes, I understand the formidability, given the need for
> > > >> > disambiguation, etc. Was just curious if this existed.
> > > >> >
> > > >> > Thanks!
> > > >> >
> > > >> >
> > > >> > On Wed, May 15, 2019 at 9:11 PM Finan, Sean <[email protected]> wrote:
> > > >> >
> > > >> > > Hi Greg,
> > > >> > >
> > > >> > > Ok, that gives me a great vector toward addressing your needs.
> > > >> > >
> > > >> > > I don't know of any ctakes components that indicate whether or
> > > >> > > not discovered concepts come from acronyms, abbreviations or
> > > >> > > -replete- text mentions.
> > > >> > >
> > > >> > > There should be something that does that. Open source ----> Any
> > > >> > > champions available?
> > > >> > >
> > > >> > > Right now no abbreviation or metonym information is provided in
> > > >> > > the standard components. If it can be extruded from source then
> > > >> > > it should be provided.
> > > >> > >
> > > >> > > If anybody has such a component, please let us know! This is a
> > > >> > > formidable (imio) nlp problem, so call your kudos with a
> > > >> > > solution!
> > > >> > >
> > > >> > > Sean
> > > >> > >
> > > >> > > ________________________________________
> > > >> > > From: Greg Silverman <[email protected]>
> > > >> > > Sent: Wednesday, May 15, 2019 9:21 PM
> > > >> > > To: [email protected]
> > > >> > > Subject: Re: acronyms/abbreviations [EXTERNAL]
> > > >> > >
> > > >> > > I'm just wondering how acronyms are identified as acronyms in
> > > >> > > cTAKES (for example, in MetaMap, there is an attribute in the
> > > >> > > Document annotation with ids of where they are in the Utterance
> > > >> > > annotation; and in BioMedICUS, there is an acronym annotation
> > > >> > > type, etc.). From examining the XMI CAS, it is not obvious.
> > > >> > >
> > > >> > > We're extracting the desired annotations from the XMI CAS using
> > > >> > > a custom Groovy client.
> > > >> > >
> > > >> > > Thanks!
> > > >> > >
> > > >> > > On Wed, May 15, 2019 at 7:43 PM Finan, Sean <[email protected]> wrote:
> > > >> > >
> > > >> > > > Hi Greg,
> > > >> > > >
> > > >> > > > What exactly do you need?
> > > >> > > >
> > > >> > > > There are a lot of output components that can produce
> > > >> > > > different formats containing various types of information.
> > > >> > > >
> > > >> > > > Do you prefer to parse ml? Or is columnized text output ok?
> > > >> > > > Does this go to a post-processing engine or a human user?
> > > >> > > >
> > > >> > > > Thanks,
> > > >> > > >
> > > >> > > > Sean
> > > >> > > > ________________________________________
> > > >> > > > From: Greg Silverman <[email protected]>
> > > >> > > > Sent: Wednesday, May 15, 2019 7:09 PM
> > > >> > > > To: [email protected]
> > > >> > > > Subject: acronyms/abbreviations [EXTERNAL]
> > > >> > > >
> > > >> > > > How can I get these from the XMI annotations?
> > > >> > > >
> > > >> > > > Thanks!
> > > >> > > >
> > > >> > > > Greg--
> > > >> > > >
> > > >> > > > --
> > > >> > > > Greg M. Silverman
> > > >> > > > Senior Systems Developer
> > > >> > > > NLP/IE <https://urldefense.proofpoint.com/v2/url?u=https-3A__healthinformatics.umn.edu_research_nlpie-2Dgroup&d=DwIFaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=Fj9pHse59o_GfrCnR_sqZ7ibEmMju2GDRj6hmEg5s9U&s=taqRUWLVp4l5699x1GSXNfIK6WkZXiAgKnA3CPmlfWk&e=>
> > > >> > > > University of Minnesota
> > > >> > > > [email protected]
> > > >> > > >
> > > >> > > > › evaluate-it.org ‹
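
The short-span heuristic Peter describes in his May 17 message could be sketched roughly as follows. This is only a minimal Python illustration, not anything that ships with cTAKES: the names `levenshtein` and `looks_like_acronym`, the `max_len` parameter, and the lowercasing before comparison are assumptions made for the example, and the begin/end offsets and canonical text are passed in as plain arguments rather than read from an XMI CAS.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def looks_like_acronym(note_text: str, begin: int, end: int,
                       canonical_text: str, max_len: int = 5) -> bool:
    """Hypothetical helper: flag a span as a probable acronym.

    Follows the rule from the thread: the covered text is shorter than
    max_len characters AND its edit distance to the canonical text is
    greater than its own length. Lowercasing is an added assumption so
    that "pain" vs "Pain" is not counted as a difference.
    """
    covered = note_text[begin:end]
    if not covered or len(covered) >= max_len:
        return False
    distance = levenshtein(covered.lower(), canonical_text.lower())
    return distance > len(covered)


if __name__ == "__main__":
    note = "Pt admitted with MI last year."
    begin = note.index("MI")
    print(looks_like_acronym(note, begin, begin + 2, "Myocardial Infarction"))
```

As noted in the thread, a check like this can only suggest that a span is an acronym; it says nothing about whether the concept assigned to it is the right one.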
