To save you a little trouble: in ctakes-temporal we rely heavily on an outside 
library called ClearTK, which has built-in evaluation APIs that work well 
with UIMA frameworks and typical NLP tasks. We use the following classes:

The simplest place to start looking in ctakes-temporal is probably the 
EventAnnotator and its evaluation, since events are simple one-word spans. The 
TimeAnnotator is slightly more complicated, with multi-word spans. If you are 
interested in evaluating relations, I would suggest switching over to 
ctakes-relation-extractor, which is more stable than the ctakes-temporal 
relation code. That code is an area of highly active (i.e., funded) research, 
so it has not been cleaned up as much.
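
As a rough illustration of what an exact-match span evaluation computes, here 
is a minimal Python sketch. It is not ClearTK or cTAKES code; the function 
name span_prf, the (document id, begin, end) tuple layout, and the sample 
offsets are all assumptions made up for the example.

```python
from typing import Iterable, Set, Tuple

# Assumed representation: (document id, begin offset, end offset)
Span = Tuple[str, int, int]


def span_prf(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over two collections of spans."""
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(predicted)

    # A prediction counts as correct only if document and both offsets match.
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical data: one boundary error (10 vs 11) and one spurious span.
gold = [("doc1", 0, 5), ("doc1", 10, 18), ("doc2", 3, 9)]
predicted = [("doc1", 0, 5), ("doc1", 11, 18), ("doc2", 3, 9), ("doc2", 20, 25)]

precision, recall, f1 = span_prf(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Exact matching is the strictest choice: a multi-word span that is off by one 
character counts as both a false positive and a false negative, so 
overlap-based matching may be worth considering for the multi-word case.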

From: Leander Melms <>
Sent: Friday, March 17, 2017 3:05 PM
Subject: Re: Evaluate cTAKES performance

Thanks! I'll have a look at it and will try to give something back to the 


> On 17 Mar 2017, at 19:42, Finan, Sean <> 
> wrote:
> Ah - you meant best way to test.  Sorry, I misread your inquiry as a best way 
> to write output.
> Yes, that is a great introduction document for ctakes and early tests.  There 
> are a few small test classes in ctakes that read anafora files, run ctakes 
> and run agreement numbers.  You can find some in the ctakes-temporal module.  
> I didn't write them, and I think that they are built-to-fit purpose-driven 
> classes, but you could try to adapt them to a general purpose case.  That 
> would be a great thing to have in ctakes!
> Sean
> -----Original Message-----
> From: Leander Melms []
> Sent: Friday, March 17, 2017 1:46 PM
> To:
> Subject: Re: Evaluate cTAKES performance
> Hi Sean,
> thank you (again) for your help and feedback! I'll give it a try! Seems like 
> the authors of the publication "Mayo clinical Text Analysis and Knowledge 
> Extraction System" did this as well.
> Thank you
> Leander
>> On 17 Mar 2017, at 18:33, Finan, Sean <> 
>> wrote:
>> Hi Leander,
>> There is no single correct way to do this, but a couple of similar
>> classes exist.  Well, one sat in my sandbox for two years until about 5 
>> seconds ago, as I only just checked it in.  Anyway, take a look at two 
>> classes in ctakes-core org.apache.ctakes.core.  They are TextSpanWriter and 
>> CuiCountFileWriter.
>> TextSpanWriter writes annotation name | span | covered text in a file, one 
>> per document.
>> CuiCountFileWriter writes a list of discovered CUIs and their counts.
>> It sounds like you are interested in a combination of both - basically 
>> TextSpanWriter with the added output of CUIs.
>> You can also have a look at EntityCollector in 
>> org.apache.ctakes.core.pipeline.  It has an annotation engine that keeps a 
>> running list of "entities" for the whole run: doc ids, spans, text and CUIs.
>> Sean
>> -----Original Message-----
>> From: Leander Melms []
>> Sent: Friday, March 17, 2017 1:09 PM
>> To:
>> Subject: Re: Evaluate cTAKES performance
>> Sorry for writing again. I just have a quick question: My idea is to parse 
>> the cTAKES output to a text file with a structure like this 
>> DocName|Spans|CUI|CoveredText|ConceptType and do the same with the gold 
>> standard (from anafora).
>> Is this a correct way to do this?
>> I'm new to the subject and happy about the tiniest information on the topic.
>> Thanks
>> Leander
>>> On 17 Mar 2017, at 12:05, Leander Melms <> 
>>> wrote:
>>> Hi,
>>> I've integrated a custom dictionary, retrained some of the OpenNLP models 
>>> and would like to evaluate the changes on a gold standard. I'd like to 
>>> calculate the precision, the recall and the f1-score to compare the results.
>>> My question is: Does cTAKES ship with some evaluation / test scripts? What 
>>> is the best strategy to do this? Has anyone dealt with this topic before?
>>> I'm happy to share the results afterwards if there is interest for it.
>>> Thanks
>>> Leander
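
For the pipe-delimited comparison proposed earlier in the thread 
(DocName|Spans|CUI|CoveredText|ConceptType), a minimal Python sketch might 
look like the following. It is only an illustration under several assumptions: 
the Spans field is treated as an opaque string such as "10,18", the covered 
text is assumed not to contain a "|" character, and the helper names, sample 
rows, type labels, and the placeholder CUI C0000000 are hypothetical. Nothing 
here is part of cTAKES or Anafora.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

# Comparison key: (DocName, Spans, CUI, ConceptType). CoveredText is dropped
# because it is implied by the document and the span.
Key = Tuple[str, str, str, str]


def parse_lines(lines: Iterable[str]) -> Set[Key]:
    """Parse DocName|Spans|CUI|CoveredText|ConceptType rows into keys."""
    records: Set[Key] = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumes exactly five fields, i.e. no '|' inside the covered text.
        doc, spans, cui, _covered_text, concept_type = line.split("|")
        records.add((doc, spans, cui, concept_type))
    return records


def counts_by_type(gold: Set[Key], system: Set[Key]) -> Dict[str, Counter]:
    """Count true positives, false positives and false negatives per type."""
    counts: Dict[str, Counter] = {}
    buckets = (("tp", gold & system), ("fp", system - gold), ("fn", gold - system))
    for label, keys in buckets:
        for key in keys:
            counts.setdefault(key[3], Counter())[label] += 1
    return counts


# Hypothetical rows: the second system row has the right span but a wrong CUI.
gold_lines = [
    "note1|10,18|C0011849|diabetes|DiseaseDisorder",
    "note1|30,37|C0004057|aspirin|Medication",
]
system_lines = [
    "note1|10,18|C0011849|diabetes|DiseaseDisorder",
    "note1|30,37|C0000000|aspirin|Medication",
]

counts = counts_by_type(parse_lines(gold_lines), parse_lines(system_lines))
for concept_type, c in sorted(counts.items()):
    print(concept_type, "tp:", c["tp"], "fp:", c["fp"], "fn:", c["fn"])
```

Including the CUI in the key means a correct span with the wrong concept 
counts as both a false positive and a false negative. Dropping the CUI from 
the key would evaluate span detection alone.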
