[ 
https://issues.apache.org/jira/browse/CTAKES-485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Finan closed CTAKES-485.
-----------------------------
    Resolution: Implemented

This implementation is thread safe, but not highly concurrent.  What does that 
mean?  A lot of thread blocking.  So, larger pipelines and longer notes will 
see greater performance gains, because threads are less likely to be trying to 
use the same annotation engine at the same time.  For instance, see the timings 
below.  The default clinical pipeline sees ~25% improvement in performance 
going from 1 to 2 threads.  Going to 3 threads sees no improvement over 2.  For 
a much longer "full" pipeline, adding a 3rd thread sees another 6-7% 
improvement.  Things like disk i/o further contribute to the diminishing gain, 
but it is mostly thread contention.  What we really need is to make each 
individual annotator more concurrent, reducing or removing the amount of code 
that needs to be in synchronized blocks.
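
To illustrate "thread safe, but not highly concurrent", here is a minimal 
sketch in plain Java.  It is not the cTAKES code: SharedEngine, process and 
runAll are hypothetical names, and the "engine" just upper-cases a string.  
It only shows how a synchronized method makes worker threads take turns on a 
shared engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SynchronizedEngineSketch {

    // Hypothetical stand-in for an annotation engine with unsafe internal state.
    static class SharedEngine {
        private int processedCount = 0;

        // synchronized: only one thread at a time may run the engine.
        // Any other thread that calls process() blocks until it is free.
        synchronized String process(String note) {
            processedCount++;
            return note.toUpperCase();
        }
    }

    // Runs every note through one shared engine using a fixed pool of threads.
    static List<String> runAll(List<String> notes, int threadCount) {
        SharedEngine engine = new SharedEngine();
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String note : notes) {
                futures.add(pool.submit(() -> engine.process(note)));
            }
            // Collect results in submission order.
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Processing failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAll(Arrays.asList("note one", "note two"), 2));
    }
}
```

The results are correct with any thread count, but while one thread is inside 
process() the others wait.  The less time a pipeline spends inside such 
blocks, the more an extra thread can help.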

If you want to test this, please do not expect to get your best performance 
by "using all of your cores."  Use your core count minus 1.
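
A small sketch of that rule of thumb in Java.  suggestedThreadCount is a 
hypothetical helper, not a cTAKES setting.  
Runtime.availableProcessors() reports the processors available to the JVM, 
which on a hyperthreaded machine is normally the logical core count.

```java
public class ThreadCountSketch {

    // Leave one core free, but never go below a single thread.
    static int suggestedThreadCount(int availableCores) {
        return Math.max(1, availableCores - 1);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores=" + cores
                + " suggested threads=" + suggestedThreadCount(cores));
    }
}
```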

On my old HP EliteBook 8440p: 64-bit, (2) 2.67 GHz proc, hyperthreaded 
(4 core), 6 GB RAM, Windows 7 (64-bit)

Processing time for notes in ctakes-examples, averaging over 3 runs each 
(time in m:ss, then percent of the single-thread time):

Default Clinical
single:  0:44   100%
2proc:   0:32    73%
3proc:   0:32    73%

Full Pipeline (sections, paragraphs, lists, [default clinical], degree, 
location, event, time, e-t, e-e links, coref)
single:  4:04   100%
2proc:   2:55    72%
3proc:   2:42    66%
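
For anyone repeating the measurement, the percentages above are each run's 
time relative to the single-thread time.  A minimal sketch of that 
arithmetic in Java (toSeconds and percentOfBaseline are hypothetical 
helpers, and rounding to the nearest whole percent is an assumption that 
happens to match the figures above):

```java
public class RelativeTimeSketch {

    // Converts a "m:ss" string such as "4:04" into seconds.
    static int toSeconds(String minutesSeconds) {
        String[] parts = minutesSeconds.split(":");
        return Integer.parseInt(parts[0]) * 60 + Integer.parseInt(parts[1]);
    }

    // Measured time as a whole-number percentage of the baseline time.
    static long percentOfBaseline(String baseline, String measured) {
        return Math.round(100.0 * toSeconds(measured) / toSeconds(baseline));
    }

    public static void main(String[] args) {
        System.out.println(percentOfBaseline("0:44", "0:32") + "%");
        System.out.println(percentOfBaseline("4:04", "2:55") + "%");
        System.out.println(percentOfBaseline("4:04", "2:42") + "%");
    }
}
```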


> Add Thread safe default clinical pipeline
> -----------------------------------------
>
>                 Key: CTAKES-485
>                 URL: https://issues.apache.org/jira/browse/CTAKES-485
>             Project: cTAKES
>          Issue Type: New Feature
>    Affects Versions: 4.0.1
>            Reporter: Sean Finan
>            Assignee: Sean Finan
>            Priority: Minor
>              Labels: performance
>             Fix For: 4.0.1
>
>
> cTAKES is not thread-safe.  This has been well established.  It would be nice 
> if at least the default clinical pipeline could be run with some thread 
> safety.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
