cTAKES to process 5000K records

Nick Nikandish Tue, 09 Sep 2014 11:47:05 -0700

Hi there,

I am using cTAKES to process 5000K free-text records, where each record
contains several medications.
Every record goes through this fixed flow:


                                                               
<node>SimpleSegmentAnnotator</node>
<node>SentenceDetectorAnnotator</node>
<node>TokenizerAnnotator</node>
<node>LvgAnnotator</node>
<node>ContextDependentTokenizerAnnotator</node>
<node>POSTagger</node>
<node>Chunker</node>
<node>LookupWindowAnnotator</node>
<node>DictionaryLookupAnnotatorDB</node>
<node>DependencyParser</node>
<node>AssertionAnnotator</node>
<node>ExtractionPrepAnnotator</node>

But it takes a very long time to process that much data (maybe a week or so)
when I use SimpleSegmentAnnotator. If I remove SimpleSegmentAnnotator, the
process is very fast, but no medications are annotated. Do you have any
suggestions?
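
To put the numbers above in perspective, here is a rough back-of-the-envelope
calculation. It is only a sketch: it assumes "5000K" means 5,000,000 records
and that "a week or so" means exactly 7 days of continuous processing. The
function names are made up for illustration.

```python
# Rough throughput estimate for the run described above.
# Assumptions (not measured): 5,000,000 records, exactly 7 days of runtime.
RECORDS = 5_000_000
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def records_per_second(records: int, seconds: int) -> float:
    """Average number of records processed per second."""
    return records / seconds


def ms_per_record(records: int, seconds: int) -> float:
    """Average wall-clock time spent on one record, in milliseconds."""
    return 1000 * seconds / records


if __name__ == "__main__":
    print(f"{records_per_second(RECORDS, SECONDS_PER_WEEK):.2f} records/s")
    print(f"{ms_per_record(RECORDS, SECONDS_PER_WEEK):.2f} ms/record")
```

Under those assumptions the pipeline averages roughly 8.3 records per second,
or about 121 ms per record.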

Thanks,
Nick
