I am only interested in medication names, so I use cTAKES for that sole purpose
for now (the future plan is to use other parts of cTAKES). I don't believe I
am getting any annotations either.
If I only want to identify the medication/antibiotic names in a text like
this:
Urine culture MOS
Source/Body site Not Indicated . . Culture results . >100,000col/ml
Escherichia coli Final ID . . E.coli Ampicillin
S Ampicillin/sulbactam S Cefazolin S Ciprofloxacin
S Gentamicin S Nitrofurantoin S SXT
S
What is the minimum annotator flow that I have to use? I tried taking out some
of the flow members mentioned below, but it didn't work.
Nick
-----Original Message-----
From: Masanz, James J. [mailto:[email protected]]
Sent: Tuesday, September 09, 2014 3:05 PM
To: '[email protected]'
Subject: RE: Ctakes to process 5000K recoreds
I suspect that when you take out the simple segment annotator, nothing is getting
processed, and that is why it appears so fast. At least some of the annotators
loop through the list of sections/segments, which is why there is a simple
segment annotator - so that there is at least one section/segment identified.
Are you getting any annotations at all?
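
To illustrate the point about annotators looping over sections/segments, here
is a minimal sketch in plain Java. It does not use the cTAKES or UIMA APIs;
the class and method names (SegmentLoopSketch, simpleSegments,
findMedications) and the tiny dictionary are hypothetical and exist only for
this example. It assumes a downstream annotator that scans text only inside
segments, so an empty segment list means the loop body never runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class SegmentLoopSketch {

    /** Hypothetical stand-in for a segment annotation: an id plus character offsets. */
    public static final class Segment {
        final String id;
        final int begin;
        final int end;

        Segment(String id, int begin, int end) {
            this.id = id;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Mimics the behaviour described in the thread: one segment covering the whole document. */
    public static List<Segment> simpleSegments(String text) {
        List<Segment> segments = new ArrayList<>();
        segments.add(new Segment("SIMPLE_SEGMENT", 0, text.length()));
        return segments;
    }

    /** Toy downstream annotator: matches dictionary terms, but only inside segments. */
    public static List<String> findMedications(String text, List<Segment> segments,
                                               List<String> dictionary) {
        List<String> found = new ArrayList<>();
        for (Segment s : segments) {
            String window = text.substring(s.begin, s.end).toLowerCase(Locale.ROOT);
            for (String term : dictionary) {
                if (window.contains(term.toLowerCase(Locale.ROOT))) {
                    found.add(term);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "E.coli Ampicillin S Cefazolin S Ciprofloxacin S Gentamicin S";
        List<String> dictionary = Arrays.asList("Ampicillin", "Cefazolin", "Gentamicin");

        // No segment annotator in the flow: there is nothing to iterate over.
        System.out.println(findMedications(text, new ArrayList<>(), dictionary));

        // With one whole-document segment, the lookup has text to scan.
        System.out.println(findMedications(text, simpleSegments(text), dictionary));
    }
}
```

In this toy version the first call returns immediately with an empty list,
which would look like a very fast run with no annotations, while the second
call finds the three dictionary terms.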
-----Original Message-----
From: Nick Nikandish [mailto:[email protected]]
Sent: Tuesday, September 09, 2014 2:02 PM
To: [email protected]
Subject: RE: Ctakes to process 5000K recoreds
Pei,
I need the names of the medications for an application that I wrote, which uses
cTAKES, so I cache the medications in DictionaryLookupAnnotator (in
performLookup()) and use them in my program. But when I have
SimpleSegmentAnnotator in the flow, it just takes forever. After taking
SimpleSegmentAnnotator out, no medication name is returned in
DictionaryLookupAnnotator in the code. So I was wondering if there is a way to
eliminate SimpleSegmentAnnotator but still be able to get the medication
names in that class?
Nick
-----Original Message-----
From: Pei Chen [mailto:[email protected]]
Sent: Tuesday, September 09, 2014 2:54 PM
To: [email protected]
Subject: Re: Ctakes to process 5000K recoreds
Nick,
When you say no medication is being annotated, I presume you mean the
medication attributes (i.e. dosage, frequency, etc.) are not being annotated?
I think the DrugNER needs a list of section names in the config; I think it
includes SIMPLE_SEGMENT. I am very surprised that SimpleSegmentAnnotator is
the bottleneck, though; all it does is assume the entire document is a single
section called SIMPLE_SEGMENT.
Have you tried commenting out the DependencyParser if you're not using those
features?
--Pei
On Tue, Sep 9, 2014 at 2:45 PM, Nick Nikandish <[email protected]>
wrote:
>
> Hi there,
>
> I am using cTAKES to process 5000K free-text records, where each record has
> several medications.
> This is the fixed flow that each record goes through:
>
>
> <node>SimpleSegmentAnnotator</node>
>
> <node>SentenceDetectorAnnotator</node>
>
> <node>TokenizerAnnotator</node>
>
> <node>LvgAnnotator</node>
>
> <node>ContextDependentTokenizerAnnotator</node>
>
> <node>POSTagger</node>
>
> <node>Chunker</node>
>
> <node>LookupWindowAnnotator</node>
>
> <node>DictionaryLookupAnnotatorDB</node>
>
> <node>DependencyParser</node>
>
> <node>AssertionAnnotator</node>
>
> <node>ExtractionPrepAnnotator</node>
>
> But it takes a very long time to process that much data (maybe a week or
> so) when I use SimpleSegmentAnnotator. By eliminating SimpleSegmentAnnotator
> the process is very fast, but no medication is being annotated. Do you guys
> have any suggestions?
>
> Thanks,
> Nick
>