Re: acronyms/abbreviations [EXTERNAL]

Peter Abramowitsch Sun, 19 May 2019 09:41:52 -0700

OMG,  I hadn't even thought of "ephemeral vocabulary".  Great example!


Peter

On Sun, May 19, 2019 at 6:05 PM Greg Silverman <[email protected]> wrote:

> Peter,
> You'll like this example then from a manuscript we submitted to MedInfo:
> "It is important to point out that while some system annotation types
> scored really well using the geometric mean method to identify best-at-task
> annotation systems,  on examination, since our method was unable to provide
> lexical disambiguation of terms, there were some misclassifications. An
> example was for the entity Speed of Vehicle where the system cTAKES performed
> very well with the MedicationsMention annotation type. On further
> examination, the terms that provided a match were “speed” and “mph,” which
> have different contextual meanings from those having to do with physical
> measurement with respect to velocity.  In this case, “speed” and “mph” are
> common street drugs..."
>
> Greg--
>
>
> On Sat, May 18, 2019 at 3:12 AM Peter Abramowitsch <
> [email protected]>
> wrote:
>
> > Greg,  Thanks for these links.  I really enjoy discussions of this kind and
> > am glad to see that someone is trying these knowledge-based approaches and
> > reporting back.  I've played with the WordNet APIs and believe that it is
> > possible to use the hyper/hypo-nym constructs to help score different
> > interpretations of ambiguous terms.  Additionally, I think n-gram fitting
> > can be used to help rate the relevance of one definition over another.
> > But I'd bet that the effectiveness of these approaches is highly dependent on
> > grammatically complete and correct text.   Clinical notes are another
> > thing.
> >
> > I had a perfect example of this problem the other day.   A note stating
> > something like "nursing care resumed after 12pm".  cTAKES had tagged this
> > with both lactation-related and nursing-service-related CUIs.  But the
> > patient was an elderly man.  Clearly the context was not to be found in the
> > grammar but in the clinical setting.  Thus there is a kind of meta-context
> > (patient's age, gender, disease state) that could also contribute to
> > disambiguation.  This could be achieved by ML methods trained on marked-up
> > notes (very labor intensive), or by some kind of rules mechanism, but that
> > would also be labor intensive - a never-to-be-finished effort.  These might
> > require the creation of an instant/lightweight VMR to structure the
> > contextual elements from the note that the scoring mechanism would reason
> > over.    But I'd prefer a Campari and soda.
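
For what it's worth, the "rules mechanism over meta-context" idea quoted above could be sketched roughly as below. This is only an illustration in Python, not anything that exists in cTAKES: PatientContext, Candidate, RULES, rescore, the candidate labels and the age range are all hypothetical names and values chosen for the example, and the thresholds are not clinical guidance.

```python
from dataclasses import dataclass


@dataclass
class PatientContext:
    # Hypothetical, minimal stand-in for the "lightweight VMR" of contextual facts.
    age: int
    sex: str  # "F" or "M"


@dataclass
class Candidate:
    label: str    # illustrative sense label, not a real CUI
    score: float  # score assigned by some upstream disambiguator


def lactation_plausible(ctx: PatientContext) -> bool:
    # Illustrative assumption only: breastfeeding senses apply to female
    # patients in a rough childbearing age range.
    return ctx.sex == "F" and 12 <= ctx.age <= 55


# One plausibility rule per sense label; labels without a rule are left untouched.
RULES = {"lactation": lactation_plausible}


def rescore(candidates, ctx, penalty=0.1):
    """Down-weight candidates whose rule is violated by the patient context."""
    rescored = []
    for cand in candidates:
        rule = RULES.get(cand.label)
        factor = 1.0 if rule is None or rule(ctx) else penalty
        rescored.append(Candidate(cand.label, cand.score * factor))
    return sorted(rescored, key=lambda c: c.score, reverse=True)


elderly_man = PatientContext(age=82, sex="M")
senses = [Candidate("lactation", 0.6), Candidate("nursing-service", 0.5)]
best = rescore(senses, elderly_man)[0]
print(best.label)
```

Penalizing rather than deleting the implausible sense keeps it available for review, at the cost of one more tunable number. The never-to-be-finished part is of course the RULES table.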
> >
> >
> >
> > On Sat, May 18, 2019 at 3:24 AM Greg Silverman <[email protected]> wrote:
> >
> > > https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3111590/
> > >
> > > On Fri, May 17, 2019 at 8:23 PM Greg Silverman <[email protected]> wrote:
> > >
> > > > Yes, and regarding your last paragraph: This is where disambiguation comes
> > > > into play. Here is one method:
> > > >
> > > > https://www.cs.cmu.edu/afs/cs/project/jair/pub/volume23/montoyo05a-html/node9.html
> > > >
> > > > I'm not sure how either MetaMap or BioMedICUS do disambiguation, but since
> > > > they are both open source, they would be potential resources.
> > > >
> > > > Greg--
> > > >
> > > > On Fri, May 17, 2019 at 2:17 AM Peter Abramowitsch <
> > > > [email protected]> wrote:
> > > >
> > > > >> Seems like some kind of simple heuristic should work:    Isn't it just a
> > > > >> case of looking at the in/out text offsets of the source text for an
> > > > >> identified annotation and then comparing that with the canonical text of
> > > > >> the CUI or SnomedID?   If the source text is just a few characters (say
> > > > >> less than 5) and the Levenshtein distance between it and the canonical
> > > > >> text is greater than the length of the source text,  you're pretty sure
> > > > >> to have an acronym.
> > > > >>
> > > > >> For instance if cTAKES finds   "MI" and assigns SNOMED  22298006 or CUI
> > > > >> C0027051 with canonical text "Myocardial Infarction", then with the
> > > > >> in/out offsets into the text you should be able to run this heuristic.
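
A minimal Python sketch of the heuristic quoted above, in case it helps anyone try it. The function names (levenshtein, looks_like_acronym) and the max_len parameter are made up for illustration, and lowercasing both strings before comparing is an added assumption. In practice the begin/end offsets and the canonical (preferred) text would come from the annotation and its concept, which is not shown here.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) by dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def looks_like_acronym(note_text, begin, end, canonical_text, max_len=5):
    """Short covered text whose edit distance to the canonical text exceeds its own length."""
    covered = note_text[begin:end]
    if len(covered) >= max_len:
        return False
    # Assumption: compare case-insensitively so "pain" vs "Pain" is not flagged.
    distance = levenshtein(covered.lower(), canonical_text.lower())
    return distance > len(covered)


note = "Pt admitted with MI, reports chest pain."
mi = note.index("MI")
pain = note.index("pain")
print(looks_like_acronym(note, mi, mi + 2, "Myocardial Infarction"))  # True
print(looks_like_acronym(note, pain, pain + 4, "Pain"))               # False
```

Note this only flags that the covered text is probably an acronym or abbreviation. It says nothing about whether the assigned concept is the right expansion.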
> > > >>
> > > > >> The problem (and I see this in my work) is that many acronyms have
> > > > >> multiple meanings.  Thus, you may accurately be able to tell that your
> > > > >> identified concept came from an acronym, but it was the wrong concept!!
> > > >>
> > > >> Peter
> > > >>
> > > >> On Thu, May 16, 2019 at 4:31 AM Greg Silverman <[email protected]> wrote:
> > > >>
> > > >> > Got it!
> > > >> >
> > > > >> > Yes, I understand the formidability, given the need for disambiguation,
> > > > >> > etc. Was just curious if this existed.
> > > >> >
> > > >> > Thanks!
> > > >> >
> > > >> >
> > > >> > On Wed, May 15, 2019 at 9:11 PM Finan, Sean <
> > > >> > [email protected]> wrote:
> > > >> >
> > > >> > > Hi Greg,
> > > >> > >
> > > >> > > Ok, that gives me a great vector toward addressing your needs.
> > > >> > >
> > > > >> > > I don't know of any cTAKES components that indicate whether or not
> > > > >> > > discovered concepts come from acronyms, abbreviations or -replete- text
> > > > >> > > mentions.
> > > > >> > >
> > > > >> > > There should be something that does that.   Open source ----> Any
> > > > >> > > champions available?
> > > > >> > >
> > > > >> > > Right now no abbreviation or metonym information is provided in the
> > > > >> > > standard components.    If it can be extruded from source then it
> > > > >> > > should be provided.
> > > > >> > >
> > > > >> > > If anybody has such a component, please let us know !   This is a
> > > > >> > > formidable (imio) nlp problem, so call your kudos with a solution!
> > > >> > >
> > > >> > > Sean
> > > >> > >
> > > >> > > ________________________________________
> > > >> > > From: Greg Silverman <[email protected]>
> > > >> > > Sent: Wednesday, May 15, 2019 9:21 PM
> > > >> > > To: [email protected]
> > > >> > > Subject: Re: acronyms/abbreviations [EXTERNAL]
> > > >> > >
> > > > >> > > I'm just wondering how acronyms are identified as acronyms in cTAKES
> > > > >> > > (for example, in MetaMap, there is an attribute in the Document
> > > > >> > > annotation with ids of where they are in the Utterance annotation; and
> > > > >> > > in BioMedICUS, there is an acronym annotation type, etc.). From
> > > > >> > > examining the XMI CAS, it is not obvious.
> > > > >> > >
> > > > >> > > We're extracting the desired annotations from the XMI CAS using a
> > > > >> > > custom Groovy client.
> > > >> > >
> > > >> > > Thanks!
> > > >> > >
> > > >> > > On Wed, May 15, 2019 at 7:43 PM Finan, Sean <
> > > >> > > [email protected]> wrote:
> > > >> > >
> > > >> > > > Hi Greg,
> > > >> > > >
> > > >> > > > What exactly do you need ?
> > > >> > > >
> > > > >> > > > There are a lot of output components that can produce different
> > > > >> > > > formats containing various types of information.
> > > > >> > > >
> > > > >> > > > Do you prefer to parse ml ?  Or is columnized text output ok?  Does
> > > > >> > > > this go to a post-processing engine or a human user?
> > > >> > > >
> > > >> > > > Thanks,
> > > >> > > >
> > > >> > > > Sean
> > > >> > > > ________________________________________
> > > >> > > > From: Greg Silverman <[email protected]>
> > > >> > > > Sent: Wednesday, May 15, 2019 7:09 PM
> > > >> > > > To: [email protected]
> > > >> > > > Subject: acronyms/abbreviations [EXTERNAL]
> > > >> > > >
> > > >> > > > How can I get these from the XMI annotations?
> > > >> > > >
> > > >> > > > Thanks!
> > > >> > > >
> > > >> > > > Greg--
> > > >> > > >
> > > >> > > > --
> > > >> > > > Greg M. Silverman
> > > >> > > > Senior Systems Developer
> > > > >> > > > NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> > > >> > > > University of Minnesota
> > > >> > > > [email protected]
> > > >> > > >
> > > >> > > >  ›  evaluate-it.org  ‹
> > > >> > > >
> > > >> > >
> > > >> > >
> > > >> > > --
> > > >> > > Greg M. Silverman
> > > >> > > Senior Systems Developer
> > > >> > > NLP/IE <
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__healthinformatics.umn.edu_research_nlpie-2Dgroup&d=DwIFaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=DSQkibRULBYY2ijgCfGWGPmrKD7gdrLjBbvnTbXozsA&s=pTRmMExWf-ju3IjLOdTelulzu0JW399BumarcAx5tRw&e=
> > > >> > > >
> > > >> > > University of Minnesota
> > > >> > > [email protected]
> > > >> > >
> > > >> > >  ›  evaluate-it.org  ‹
> > > >> > >
> > > >> >
> > > >> >
> > > >> > --
> > > >> > Greg M. Silverman
> > > >> > Senior Systems Developer
> > > >> > NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> > > >> > University of Minnesota
> > > >> > [email protected]
> > > >> >
> > > >> >  ›  evaluate-it.org  ‹
> > > >> >
> > > >>
> > > >
> > > >
> > > > --
> > > > Greg M. Silverman
> > > > Senior Systems Developer
> > > > NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> > > > University of Minnesota
> > > > [email protected]
> > > >
> > > >  ›  evaluate-it.org  ‹
> > > >
> > >
> > >
> > > --
> > > Greg M. Silverman
> > > Senior Systems Developer
> > > NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> > > University of Minnesota
> > > [email protected]
> > >
> > >  ›  evaluate-it.org  ‹
> > >
> >
>
>
>
