Hi All

I've noticed that the HistoryCleartkAnalysisEngine misses many common forms
of subject history, including the obvious "h/o" prefix. Looking into the
distribution, there's a model.jar and what appears to be a weights file
containing trigger words:
resources/org/apache/ctakes/assertion/models/history.txt
in which "h", "o" and "/" are each given their own weight. I'm not sure
they're actually used in this way, though (see below). There's also a tiny file:
/org/apache/ctakes/assertion/semantic_classes/history.txt
which does contain a few entries, including "h/o". I assume it is used for
training, but it is never referred to anywhere.

Here's the behavior I'm seeing:
example input             | condition                                            | term found | history feature marked | range text
--------------------------+------------------------------------------------------+------------+------------------------+-----------------------
history of pregnancies    | "history of" included in the cu_term and prefterm    | yes        | no                     | history of pregnancies
history of adenopathy     | "history of" not included in the cu_term or prefterm | yes        | yes                    | adenopathy
H/O postpartum psychosis  | "h/o" not included in the prefterm or cu_term        | yes        | yes                    | postpartum psychosis
H/O: postpartum psychosis | "h/o" not included in the prefterm or cu_term        | yes        | no                     | postpartum psychosis
H/O pregnancies           | "h/o" included in the cu_term                        | yes        | no                     | h/o pregnancies

As you can see, the behavior is quite perverse: the pattern suggests that
if the matched concept text itself covers the history words, the history
annotation engine cannot see them.
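
To make that hypothesis concrete, here is a toy sketch in Python. It is
not cTAKES code and says nothing about how the real model works: the cue
list and the function names are made up, and the only rule it encodes is
"a history cue counts only if it sits before the concept's range text".
It leaves out the "H/O:" row, which this simple rule does not explain.

```python
# Toy model only: NOT cTAKES code. The cue list and names are hypothetical.
HISTORY_CUES = ("history of", "h/o")


def history_marked(text, span_start):
    """Return True if a history cue appears in the text before the concept span."""
    context = text[:span_start].lower()
    return any(cue in context for cue in HISTORY_CUES)


# (example input, range text) pairs taken from the observations above.
examples = [
    ("history of pregnancies", "history of pregnancies"),
    ("history of adenopathy", "adenopathy"),
    ("H/O postpartum psychosis", "postpartum psychosis"),
    ("H/O pregnancies", "h/o pregnancies"),
]


def run(pairs):
    """Apply the toy rule to each pair and collect (input, marked) results."""
    results = []
    for text, range_text in pairs:
        start = text.lower().find(range_text.lower())
        results.append((text, history_marked(text, start)))
    return results


for text, marked in run(examples):
    print(f"{text!r}: history marked = {marked}")
```

Under this rule the two inputs whose range text swallows the cue come out
unmarked, which matches what I observed for those four rows.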

Has anyone else noticed this, and if so, have you done anything about it?

Peter
