Hi Sreejith,
Without seeing an example of text I can't say whether my next words will help
you or not.
If you are using trunk then you should have access to two 'new' annotation
engines in ctakes-core.
ListAnnotator - Annotates formatted List Sections by detecting them
using Regular Expressions provided in an input File.
ListEntryNegator - Checks List Entries for negation, which may be exhibited
differently from unstructured negation.
ListAnnotator can use any list of regular expressions in a file. The default
file, DefaultListRegex.bsv, is in ctakes-core-res.
The format for each line in the regex list is
NAME||LIST_REGEX||ENTRY_SEPARATOR_REGEX where
NAME - name of the list type. Can be anything.
LIST_REGEX - a regular expression that matches a block of text containing a
list in its entirety.
ENTRY_SEPARATOR_REGEX - a regular expression that matches a single list entry
within the text of the entire list.
For instance, the List
Smoker Status: N
Drinking Status: Y
Pregnant: N/A
A -simple- line in the regex file could be
Colonized List||(?:^(?:[^\r\n:]+:[^\r\n:]+)+\r?\n){2,}||(?:^(?:[^\r\n:]+:[^\r\n:]+)+\r?\n)
Notice that each item is separated by two bar characters "||".
The file of regular expressions can be changed using the LIST_TYPES_PATH
parameter.
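
As a rough illustration of the file format described above (this is not the ListAnnotator source), the sketch below splits one line on "||" and applies the two regular expressions to the example list. It treats "Colonized" and "List" as one NAME field on a single line. The class and method names are made up for the example, and compiling with Pattern.MULTILINE is an assumption: without it "^" only matches at the very start of the text, so the sample pattern could never match two or more lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of cTAKES.
final class ListRegexSketch {

    // The sample line from the message as a single line (backslashes doubled for Java).
    static final String BSV_LINE =
            "Colonized List||(?:^(?:[^\\r\\n:]+:[^\\r\\n:]+)+\\r?\\n){2,}"
            + "||(?:^(?:[^\\r\\n:]+:[^\\r\\n:]+)+\\r?\\n)";

    // The example list from the message, one entry per line.
    static final String NOTE =
            "Smoker Status: N\nDrinking Status: Y\nPregnant: N/A\n";

    /** Splits one NAME||LIST_REGEX||ENTRY_SEPARATOR_REGEX line into its three fields. */
    static String[] parseLine(String line) {
        String[] fields = line.split("\\|\\|");
        if (fields.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields but found " + fields.length);
        }
        return fields;
    }

    /** Returns the trimmed text of every entry in every list matched in the text. */
    static List<String> findEntries(String bsvLine, String text) {
        String[] fields = parseLine(bsvLine);
        // Assumption: MULTILINE, so that ^ matches at the start of each line.
        Pattern listPattern = Pattern.compile(fields[1], Pattern.MULTILINE);
        Pattern entryPattern = Pattern.compile(fields[2], Pattern.MULTILINE);
        List<String> entries = new ArrayList<>();
        Matcher listMatcher = listPattern.matcher(text);
        while (listMatcher.find()) {
            // Look for individual entries only inside the text of the matched list.
            Matcher entryMatcher = entryPattern.matcher(listMatcher.group());
            while (entryMatcher.find()) {
                entries.add(entryMatcher.group().trim());
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        for (String entry : findEntries(BSV_LINE, NOTE)) {
            System.out.println(entry);
        }
    }
}
```

One consequence of splitting on "||" this way is that a regular expression containing two consecutive bar characters could not be used in such a file.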
ListEntryNegator will iterate through each ListEntry in the CAS and use a
regular expression to determine whether or not items in the list should be
negated.
Right now that regex is hard-coded in the class. There should probably be a
mechanism to override it. ": N" is not in there. Also, only
Disease/Disorder and Sign/Symptom mentions in the ListEntry are negated. You
would need to add SmokingStatusAnnotation as a negatable type.
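
To make the idea concrete, here is a minimal sketch of the kind of check a list-entry negation regex could perform on entries such as "Smoker Status: N". The pattern is hypothetical and is not the one hard-coded in ListEntryNegator; the class and method names are also invented for the example. It uses the same polarity convention as this thread (-1 negated, 1 otherwise).

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of cTAKES.
final class ListEntryNegationSketch {

    // Assumed pattern: the entry ends with a colon followed by a bare negative value.
    // Anchoring at the end of the entry keeps "N/A" from being read as "N".
    static final Pattern NEGATED_VALUE = Pattern.compile(
            ":\\s*(?:N|No|None|Neg|Negative)\\s*$", Pattern.CASE_INSENSITIVE);

    /** Returns -1 if the entry's value looks negated, otherwise 1. */
    static int polarityOf(String entry) {
        return NEGATED_VALUE.matcher(entry.trim()).find() ? -1 : 1;
    }

    public static void main(String[] args) {
        String[] entries = {"Smoker Status: N", "Drinking Status: Y", "Pregnant: N/A"};
        for (String entry : entries) {
            System.out.println(entry + " -> " + polarityOf(entry));
        }
    }
}
```

In a real pipeline the resulting polarity would still have to be applied to the annotations inside the entry, which is where the limitation to particular annotation types mentioned above comes in.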
I don't know if any of this is helpful, but I thought that I would throw it out
there.
Sean
________________________________________
From: Sreejith Pk <[email protected]>
Sent: Friday, July 24, 2020 4:09 AM
To: [email protected]
Subject: Re: Clarification regarding NegationFSM [EXTERNAL]
Hi Peter, Thanks a lot for the reply.
Let me elaborate on the changes I have made so far. I have added
KuRuleBasedClassifierAnnotator to the pipeline in order to fetch
smoking-related keywords from the document. I have modified
KuRuleBasedClassifierAnnotator so that it iterates through the identified
tokens and checks whether each token matches any smoking-related word
configured in a keyword.txt file. The matching tokens are then set as
SmokerNamedEntityAnnotation and can be read from the output XMI.
In my scenario, the sentence I am passing to cTAKES is "Smoking
status: N". As "Smoking" is configured in keywords.txt, it appears
as the output node in SmokerNamedEntityAnnotation. My parser logic reads
only its polarity. The polarity of the SmokerNamedEntityAnnotation for the
"Smoking" token is coming out as 1 instead of the expected -1.
(NB: I have removed ":" from the boundary words set in
NamedEntityContextAnalyzer.java)
Thanks and Regards,
Sreejith
On Thu, Jul 23, 2020 at 11:20 PM Peter Abramowitsch <[email protected]>
wrote:
> Check and see if the identified annotation you get for "Smoking status: N"
> without your change is actually "Non Smoker" with polarity 1.
> Nonsmoker is a separate concept from a Smoker with polarity -1. Instead
> of looking at the range text, check the canonical text for the concept you
> have.
> Having said that, there are many issues with negation in all of the
> negation annotators. Some are too eager, others are too cautious.
>
> Peter
>
> On Thu, Jul 23, 2020 at 10:17 AM Sreejith Pk <[email protected]> wrote:
>
> > Hi Team,
> >
> > We are using cTAKES 4.0.0 as the NLP engine in our application. I have
> > added ContextAnnotator to the pipeline to get the correct polarity for the
> > tokens.
> > After analysing the ContextAnnotator code, I understand that the condition
> > that determines negation is written in the NegationFSM class.
> > In my requirement, I have a sentence "Smoking status: N" and I want to set
> > polarity -1 to the token "Smoking" because of the occurrence of "N". To
> > achieve the same, I have tried adding "N" to the existing HashSet
> > in the NegationFSM constructor, like iv_negVerbsSet.add("N"); but the
> > polarity of the word token "Smoking" is still coming as 1.
> > With the same configuration, if I pass "Smoking status: denies", I am
> > getting the polarity of token "Smoking" as -1. Kindly help.
> >
> > Thanks & Regards
> > Sreejith
> >
>