Thanks Sean. Not quite, sorry for the confusion. We keep the default
dictionary hsqldb. We just empty the CUI_TERMS, RXNORM, PREFTERM, and TUI
tables and move over data from a sql server db. I don't seem to recall
doing anything with a tcount column. I'll have to check our code tonight.
That could very well be it. So maybe the old ctakes had a bug and this
should not have been working to begin with. Is there anywhere I could read
about the tokenizing rules and how the tcount value is calculated? Or maybe
a Java class I could look at?
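
A minimal sketch of the consistency check Sean describes in the quoted message below: the tcount value should equal the number of whitespace-separated tokens in the text column. The class and method names here are hypothetical and not part of cTAKES, and the sketch only counts tokens in text that has already been tokenized. It does not reproduce cTAKES's own tokenizing rules (dash separation and its exceptions).

```java
// Hypothetical helper, not part of cTAKES: checks that a row's tcount
// matches the number of whitespace-separated tokens in its text value.
final class TcountCheck {

    private TcountCheck() {
    }

    /** Counts whitespace-separated tokens; null or blank text counts as 0. */
    static int countTokens(String text) {
        if (text == null) {
            return 0;
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        return trimmed.split("\\s+").length;
    }

    /** True when the stored tcount agrees with the token count of text. */
    static boolean isConsistent(String text, int tcount) {
        return countTokens(text) == tcount;
    }
}
```

For example, isConsistent("alpha - beta", 3) holds, while the untokenized "alpha-beta" counts as a single token and would not match a tcount of 3. Running a check like this over each (text, tcount) pair while loading the table would flag rows worth a closer look.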

Jeff


On Tue, Oct 3, 2017 at 9:07 AM, Finan, Sean <
[email protected]> wrote:

> Ok, let me see if I understand your current setup:
>
> Ctakes 4.0 fast lookup,
> Dictionary configuration file points to an sql server,
> Sql server uses cui_terms  (cui, rword, rindex, tcount, text) and perhaps
> other secondary tables
> ...
>
> Now that I write out the column names, I have a thought.  Is it possible
> that for some term the number in tcount does not match the number of
> non-whitespace 'words' in the text column?  If those numbers are off then
> you will have problems similar to the one that you are seeing.
> If you are populating your own table you need to make sure that the text
> is being properly tokenized.  For instance, the term "alpha-beta" should
> have text "alpha - beta" with tcount 3.  There are some exceptions to the
> dash-separation rule and a few oddities.
>
> Sean
>
> -----Original Message-----
> From: Jeff Headley [mailto:[email protected]]
> Sent: Tuesday, October 03, 2017 8:52 AM
> To: [email protected]
> Subject: Re: NPE after upgrade in DefaultJCASTermAnnotator [EXTERNAL]
>
> I updated our pom to use the same hsqldb version as what I saw in the
> ctakes lib folder. The data coming in is from a SQL Server database.
>
> On Tue, Oct 3, 2017 at 8:45 AM, Finan, Sean <
> [email protected]> wrote:
>
> > Hi Jeff,
> >
> > I don't think that a custom dictionary should cause a null pointer
> > exception on that line unless you have an odd null character in text
> > or something of that ilk.
> >
> > One thing that changed in ctakes 4.0 is the version of hsqldb that is
> > being used for the dictionary database.  I don’t know if that has
> > anything to do with your problem, but it may be causing others.
> > What is the source of your custom dictionary?  There may be a better
> > way to populate a database.
> >
> > Sean
> >
> > -----Original Message-----
> > From: Jeff Headley [mailto:[email protected]]
> > Sent: Tuesday, October 03, 2017 12:53 AM
> > To: [email protected]
> > Subject: Re: NPE after upgrade in DefaultJCASTermAnnotator [EXTERNAL]
> >
> > Thank you Sean. That helped to figure out what we did. Not quite sure
> > where we went wrong but now at least we know the cause. So a long time
> > ago in our project using ctakes, we emptied out the tables CUI_TERMS,
> > RXNORM, PREFTERM, and TUI and then loaded them with the values we
> > wanted. Worked great. Now in the new version the
> > /desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextFastUMLSProcessor.xml
> > engine seems to be using
> > /resources/org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab/sno_rx_16ab
> > and that seems to be where things went sideways. If I don't mess with
> > the db and keep the original, no errors.
> >
> > So somewhere in this if statement at line 102 in DefaultJCasTermAnnotator:
> > if ( hitTokens[ hit ].equals( allTokens.get( i ).getText() )
> >      || hitTokens[ hit ].equals( allTokens.get( i ).getVariant() ) ) {
> >
> > It's expecting to not ever have a null and I suspect we are leaving
> > something null somewhere that really shouldn't have nulls. If it's
> > obvious where I've gone wrong, the assistance would be appreciated.
> > Otherwise, I'll get it figured out eventually. I suspect it's possibly
> > because we never did anything with the SNOMEDCT_US in the prior version.
> >
> > On Mon, Oct 2, 2017 at 10:47 AM, Finan, Sean <
> > [email protected]> wrote:
> >
> > > Hi Jeff,
> > >
> > > I have no problem running on your example "DIDANOSINE, 250MG (PO
> > > Capsule Delayed Release)" or any other text.
> > >
> > > I don't know how you are running ctakes through
> > > com.clientproject.ctakes.processors.CommandLineProcessor, so I don't
> > > know how closely the standard pipeline approximates yours.
> > >
> > > Sean
> > >
> > > -----Original Message-----
> > > From: Jeff Headley [mailto:[email protected]]
> > > Sent: Sunday, October 01, 2017 11:31 PM
> > > To: [email protected]
> > > Subject: NPE after upgrade in DefaultJCASTermAnnotator [EXTERNAL]
> > >
> > > After upgrading our project to version 4, we are getting an NPE from
> > > cTAKES. The text that was being processed was DIDANOSINE, 250MG (PO
> > > Capsule Delayed Release), though it seems to be happening to us no
> > > matter what text we submit.  The stack trace is below. Any help would
> > > be appreciated as I'm at a loss as to what we might be doing wrong if
> > > this is not a bug in cTAKES.
> > >
> > > Thank you,
> > > Jeff
> > >
> > > Oct 01, 2017 11:10:16 PM org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl processAndOutputNewCASes(273)
> > > SEVERE: Exception occurred
> > > org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
> > > at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:412)
> > > at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:314)
> > > at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
> > > at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
> > > at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
> > > at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
> > > at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
> > > at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:284)
> > > at com.clientproject.ctakes.processors.CommandLineProcessor.processLine(CommandLineProcessor.java:163)
> > > at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
> > > at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
> > > at com.clientproject.ctakes.processors.CommandLineProcessor.run(CommandLineProcessor.java:114)
> > > at com.clientproject.ctakes.App.main(App.java:109)
> > > Caused by: java.lang.NullPointerException
> > > at org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator.isTermMatch(DefaultJCasTermAnnotator.java:102)
> > > at org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator.findTerms(DefaultJCasTermAnnotator.java:79)
> > > at org.apache.ctakes.dictionary.lookup2.ae.AbstractJCasTermAnnotator.findTerms(AbstractJCasTermAnnotator.java:236)
> > > at org.apache.ctakes.dictionary.lookup2.ae.AbstractJCasTermAnnotator.processWindow(AbstractJCasTermAnnotator.java:219)
> > > at org.apache.ctakes.dictionary.lookup2.ae.AbstractJCasTermAnnotator.process(AbstractJCasTermAnnotator.java:156)
> > > at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
> > > at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:396)
> > > ... 12 more
> > >
> >
>
