Re: acronyms/abbreviations [EXTERNAL]

Peter Abramowitsch Fri, 17 May 2019 00:18:43 -0700

Seems like some kind of simple heuristic should work. Isn't it just a
case of looking at the in/out text offsets of the source text for an
identified annotation and then comparing that with the canonical text of
the CUI or SNOMED ID? If the source text is just a few characters (say
fewer than 5) and the Levenshtein distance between it and the canonical
text is greater than the length of the source text, you're pretty sure to
have an acronym.


For instance, if cTAKES finds "MI" and assigns SNOMED 22298006 or CUI
C0027051 with canonical text "Myocardial Infarction", then with the
in/out offsets into the text you should be able to run this heuristic.
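
A rough sketch of that heuristic in Python, just to make the idea concrete.
This is not a cTAKES component: the function names, the max_len cutoff of 5
and the sample sentence are my own assumptions, and the begin/end offsets
and canonical text would have to come from whatever you extract from the
XMI CAS.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def looks_like_acronym(document_text: str, begin: int, end: int,
                       canonical_text: str, max_len: int = 5) -> bool:
    """Heuristic only: a short covered span whose edit distance to the
    canonical concept text exceeds the span's own length.
    max_len is an assumed cutoff ("say fewer than 5")."""
    covered = document_text[begin:end]
    if not covered:
        return False
    return (len(covered) < max_len
            and levenshtein(covered, canonical_text) > len(covered))


# Hypothetical example sentence; offsets computed here rather than read from a CAS.
text = "Patient had an MI in 2015."
begin = text.index("MI")
end = begin + len("MI")
print(looks_like_acronym(text, begin, end, "Myocardial Infarction"))
```

The comparison is case-sensitive as written. A span that already spells out
the concept, or that is 5 or more characters long, is not flagged.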

The problem (and I see this in my work) is that many acronyms have multiple
meanings.  Thus, you may accurately be able to tell that your identified
concept came from an acronym, but it was the wrong concept!!

Peter

On Thu, May 16, 2019 at 4:31 AM Greg Silverman <[email protected]> wrote:

> Got it!
>
> Yes, I understand the formidability, given the need for disambiguation,
> etc. Was just curious if this existed.
>
> Thanks!
>
>
> On Wed, May 15, 2019 at 9:11 PM Finan, Sean <
> [email protected]> wrote:
>
> > Hi Greg,
> >
> > Ok, that gives me a great vector toward addressing your needs.
> >
> > I don't know of any ctakes components that indicate whether or not
> > discovered concepts come from acronyms, abbreviations or -replete- text
> > mentions.
> >
> > There should be something that does that.   Open source ---->  Any
> > champions available?
> >
> > Right now no abbreviation or metonym information is provided in the
> > standard components.    If it can be extruded from source then it should
> > be provided.
> >
> > If anybody has such a component, please let us know !   This is a
> > formidable (imio) nlp problem, so call your kudos with a solution!
> >
> > Sean
> >
> > ________________________________________
> > From: Greg Silverman <[email protected]>
> > Sent: Wednesday, May 15, 2019 9:21 PM
> > To: [email protected]
> > Subject: Re: acronyms/abbreviations [EXTERNAL]
> >
> > I'm just wondering how acronyms are identified as acronyms in cTAKES (for
> > example, in MetaMap, there is an attribute in the Document annotation
> > with ids of where they are in the Utterance annotation; and in BioMedICUS,
> > there is an acronym annotation type, etc.). From examining the XMI CAS, it
> > is not obvious.
> >
> > We're extracting the desired annotations from the XMI CAS using a custom
> > Groovy client.
> >
> > Thanks!
> >
> > On Wed, May 15, 2019 at 7:43 PM Finan, Sean <
> > [email protected]> wrote:
> >
> > > Hi Greg,
> > >
> > > What exactly do you need ?
> > >
> > > There are a lot of output components that can produce different formats
> > > containing various types of information.
> > >
> > > Do you prefer to parse ml ?  Or is columnized text output ok?  Does
> > > this go to a post-processing engine or a human user?
> > >
> > > Thanks,
> > >
> > > Sean
> > > ________________________________________
> > > From: Greg Silverman <[email protected]>
> > > Sent: Wednesday, May 15, 2019 7:09 PM
> > > To: [email protected]
> > > Subject: acronyms/abbreviations [EXTERNAL]
> > >
> > > How can I get these from the XMI annotations?
> > >
> > > Thanks!
> > >
> > > Greg--
> > >
> > > --
> > > Greg M. Silverman
> > > Senior Systems Developer
> > > NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> > > University of Minnesota
> > > [email protected]
> > >
> > >  ›  evaluate-it.org  ‹
> > >
> >
> >
> > --
> > Greg M. Silverman
> > Senior Systems Developer
> > NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> > University of Minnesota
> > [email protected]
> >
> >  ›  evaluate-it.org  ‹
> >
>
>
> --
> Greg M. Silverman
> Senior Systems Developer
> NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> University of Minnesota
> [email protected]
>
>  ›  evaluate-it.org  ‹
>
