I'm using the dev version of cTAKES 4.0. I'm trying to match a phrase and
ignore all of its subcomponents. For instance, if the note is "Very
severe peripheral arterial disease." cTAKES should pick up only "peripheral
arterial disease" and not "arterial", "disease", "arterial disease", etc.
To accomplish this I'm using a modified version of the FastPipeline.
Instead of DefaultJCasTermAnnotator, I'm using OverlapJCasTermAnnotator,
and in sno_rx_2016ab.xml I'm using PrecisionTermConsumer instead of
DefaultTermConsumer. In theory, the overlap annotator and the
PrecisionTermConsumer combined should solve the problem.
The problem is that when the note is "Very severe peripheral arterial
disease." cTAKES picks up "peripheral arterial disease" but also picks up
"arterial".
So is there an even more precise PrecisionTermConsumer or do I need to make
my own custom TermConsumer? I also tried the SemanticCleanupTermConsumer,
but it gave the same results.
Here's the code I'm using to extract phrases:

    JCas jcas = JCasFactory.createJCas();
    jcas.setDocumentText(note);
    AggregateBuilder builder = new AggregateBuilder();
    builder.add(ClinicalPipelineFactory.getFastPipeline());
    SimplePipeline.runPipeline(jcas, builder.createAggregateDescription());
    for (IdentifiedAnnotation entity : JCasUtil.select(jcas, IdentifiedAnnotation.class)) {
        ...
    }

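
In case a custom consumer or a post-processing step turns out to be the
answer, the filtering I have in mind is roughly the following. This is only
a sketch of the intended behavior, not cTAKES code: SpanFilter, Span, spanOf
and keepLongest are hypothetical names I made up for illustration, and Span
just stands in for the begin/end offsets and covered text of an annotation.
It drops any span that lies entirely inside a strictly longer span.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of cTAKES or UIMA.
class SpanFilter {

    /** Stand-in for an annotation: character offsets plus covered text. */
    static final class Span {
        final int begin;
        final int end;
        final String text;

        Span(int begin, int end, String text) {
            this.begin = begin;
            this.end = end;
            this.text = text;
        }
    }

    /** Builds a Span for the first occurrence of phrase in note. */
    static Span spanOf(String note, String phrase) {
        int begin = note.indexOf(phrase);
        if (begin < 0) {
            throw new IllegalArgumentException("phrase not found: " + phrase);
        }
        return new Span(begin, begin + phrase.length(), phrase);
    }

    /** True when inner lies entirely within outer and is strictly shorter. */
    static boolean isNestedIn(Span inner, Span outer) {
        boolean covered = outer.begin <= inner.begin && inner.end <= outer.end;
        boolean shorter = (inner.end - inner.begin) < (outer.end - outer.begin);
        return covered && shorter;
    }

    /** Keeps only the spans that are not nested inside a longer span. */
    static List<Span> keepLongest(List<Span> spans) {
        List<Span> kept = new ArrayList<>();
        for (Span candidate : spans) {
            boolean nested = false;
            for (Span other : spans) {
                if (isNestedIn(candidate, other)) {
                    nested = true;
                    break;
                }
            }
            if (!nested) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String note = "Very severe peripheral arterial disease.";
        List<Span> spans = new ArrayList<>();
        spans.add(spanOf(note, "peripheral arterial disease"));
        spans.add(spanOf(note, "arterial"));
        spans.add(spanOf(note, "arterial disease"));
        spans.add(spanOf(note, "disease"));
        for (Span s : keepLongest(spans)) {
            System.out.println(s.text); // prints "peripheral arterial disease"
        }
    }
}
```

The comparison is quadratic in the number of spans, which seems acceptable
for a single note. Spans with identical offsets are both kept, since neither
is strictly shorter than the other. I would still prefer a stock
TermConsumer that does this if one exists.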
Any help is appreciated.
Thank you,
Justin