Hi Justin,

> To accomplish this I'm using a modified version of the FastPipeline.
>And in the sno_rx_2016ab.xml, instead of DefaultTermConsumer, I'm using the 
>PrecisionTermConsumer.

This is the correct approach for your stated goal.

> cTakes picks up "peripheral arterial disease" but it also picks up "arterial".

cTakes is doing exactly what it is supposed to do.  Overlapping terms of 
different semantic types are all kept:
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+-+Fast+Dictionary+Lookup#cTAKES3.2-FastDictionaryLookup-MostPreciseTermsPersistence
Check whether "arterial" is something other than a disease/disorder.  
According to my search it is an anatomical site.  It is important that ctakes 
picks up both the disease and the site.

> So is there an even more precise PrecisionTermConsumer or do I need to make 
> my own custom TermConsumer?

No, there is not.  However, you can fetch only the semantic types that you 
want, or ignore the ones that you don't.
You can create your own TermConsumer to perform filtering, but my advice is 
that you only perform filtering/fetching when writing output.  You can do this 
by changing your 
>for (IdentifiedAnnotation entity : 
>JCasUtil.select(jcas,IdentifiedAnnotation.class)) 
To:
for (DiseaseDisorderMention entity : 
JCasUtil.select(jcas,DiseaseDisorderMention.class))

Or if you want more than one semantic type, something like:
JCasUtil.select(jcas, IdentifiedAnnotation.class).stream().filter( 
a -> myTypes.contains( a.getClass() ) )
Where myTypes is a collection of the classes for your desired semantic types.
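
To make the multi-type filter concrete, here is a minimal sketch that runs 
without cTAKES on the classpath.  The three nested classes are simplified 
stand-ins that I made up for illustration; they only mimic the class hierarchy 
of the real cTAKES type system classes of the same names.  The class name 
SemanticTypeFilter and the method keepTypes are also hypothetical.  In a real 
pipeline the input collection would come from JCasUtil.select.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class SemanticTypeFilter {

    // Stand-in for the cTAKES IdentifiedAnnotation (assumption: simplified,
    // holds only the covered text).
    static class IdentifiedAnnotation {
        private final String coveredText;

        IdentifiedAnnotation(String coveredText) {
            this.coveredText = coveredText;
        }

        String getCoveredText() {
            return coveredText;
        }
    }

    // Stand-ins for two semantic type subclasses.
    static class DiseaseDisorderMention extends IdentifiedAnnotation {
        DiseaseDisorderMention(String coveredText) {
            super(coveredText);
        }
    }

    static class AnatomicalSiteMention extends IdentifiedAnnotation {
        AnatomicalSiteMention(String coveredText) {
            super(coveredText);
        }
    }

    // Keep only annotations whose runtime class is exactly one of wantedTypes.
    static List<IdentifiedAnnotation> keepTypes(List<IdentifiedAnnotation> annotations,
                                                Set<Class<?>> wantedTypes) {
        return annotations.stream()
                .filter(a -> wantedTypes.contains(a.getClass()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Simulates what the lookup found in the example note.
        List<IdentifiedAnnotation> found = new ArrayList<>();
        found.add(new DiseaseDisorderMention("peripheral arterial disease"));
        found.add(new AnatomicalSiteMention("arterial"));

        Set<Class<?>> myTypes = new HashSet<>();
        myTypes.add(DiseaseDisorderMention.class);

        for (IdentifiedAnnotation entity : keepTypes(found, myTypes)) {
            System.out.println(entity.getCoveredText());
        }
    }
}
```

Note that a.getClass() matches the exact class only, so an annotation whose 
class is a subclass of a listed type is not kept unless that subclass is in 
myTypes as well.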

Sean



-----Original Message-----
From: Justin Brown [mailto:jb613...@gmail.com] 
Sent: Friday, July 14, 2017 2:54 PM
To: dev@ctakes.apache.org
Cc: Daya Sharma
Subject: Problem with the PrecisionTermConsumer [EXTERNAL]

I'm using the dev version of ctakes 4.0. I'm trying to match a phrase, and 
ignore all subcomponents of the phrase. For instance, if the note is, "Very 
severe peripheral arterial disease." cTakes should pick up only "peripheral 
arterial disease" and not "arterial", "disease", "arterial disease" etc....

To accomplish this I'm using a modified version of the FastPipeline.
Instead of using the DefaultJCasTermAnnotator, I'm using 
OverlapJCasTermAnnotator. And in the sno_rx_2016ab.xml, instead of 
DefaultTermConsumer, I'm using the PrecisionTermConsumer. In theory, the 
overlap annotator and the PrecisionTermConsumer combined should solve the 
problem.

The problem is, when the note is "Very severe peripheral arterial disease."
cTakes picks up "peripheral arterial disease" but it also picks up "arterial".

So is there an even more precise PrecisionTermConsumer or do I need to make my 
own custom TermConsumer? I also tried the SemanticCleanupTermConsumer, but it 
gave the same results.

Here's the code I'm using to extract phrases:

JCas jcas = JCasFactory.createJCas();
jcas.setDocumentText(note);
AggregateBuilder builder = new AggregateBuilder(); 
builder.add(ClinicalPipelineFactory.getFastPipeline());
SimplePipeline.runPipeline(jcas, builder.createAggregateDescription());

for (IdentifiedAnnotation entity : JCasUtil.select(jcas,
IdentifiedAnnotation.class)) {
...
}


Any help is appreciated.

Thank you,
Justin
