Re: performance report [EXTERNAL]

Finan, Sean Mon, 25 Jan 2021 06:48:51 -0800

Hi Greg, Peter,

I believe that the performance report comes from a CollectionProcessingEngine 
(CPE):
https://uima.apache.org/d/uimaj-current/apidocs/org/apache/uima/collection/CollectionProcessingEngine.html

I think that UIMA's CPE GUI runs the pipeline through a CPE - hence the tool's 
name, but that may have changed in recent years.

The PipelineBuilder class in ctakes.core used by the PiperFileRunner could be 
changed to use this style of running a single-threaded pipeline - right now it 
uses a simpler UIMAFit method.
The code changes are relatively minor, but obviously significant testing would 
be required.  The ctakes PipelineBuilder does use a CPE for multi-threaded 
pipelines, so there has already been some testing on that front.

You can look at the ctakes PipelineBuilder run() method.  If you get rid of the 
if (threadCount==1) {..} else { branch, the CPE will always be used.  Then just 
add a cpe.getPerformanceReport() call after cpe.process() and you should have a 
ProcessTrace object.  This is where my guessing ends, as I have never used a 
ProcessTrace and don't know exactly what to ask of it.
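
As a rough sketch of what one might ask of a ProcessTrace: as far as the UIMA 
Javadoc describes it, a ProcessTrace holds a list of events (getEvents()), and 
each ProcessTraceEvent exposes a component name, an event type, a duration in 
milliseconds and nested sub-events. The code below does not use UIMA at all. 
TraceEvent, the component names and the timings are hypothetical stand-ins made 
up for illustration, so that the traversal can be run on its own. Note also 
that, per the Javadoc, cpe.process() starts processing on another thread and 
returns immediately, so the report should probably be read only after the 
collectionProcessComplete() callback has fired; please verify that against 
your UIMA version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class TraceReport {

    /** Hypothetical stand-in for UIMA's ProcessTraceEvent (NOT the real class). */
    static final class TraceEvent {
        final String componentName;
        final String type;
        final int durationMs;
        final List<TraceEvent> subEvents;

        TraceEvent(String componentName, String type, int durationMs,
                   List<TraceEvent> subEvents) {
            this.componentName = componentName;
            this.type = type;
            this.durationMs = durationMs;
            this.subEvents = subEvents;
        }
    }

    /** Made-up sample data: one aggregate event with two child components. */
    static List<TraceEvent> sampleEvents() {
        TraceEvent sentences = new TraceEvent("SentenceDetector", "Analysis", 40,
                Collections.<TraceEvent>emptyList());
        TraceEvent tokens = new TraceEvent("Tokenizer", "Analysis", 110,
                Collections.<TraceEvent>emptyList());
        TraceEvent pipeline = new TraceEvent("Pipeline", "Analysis", 150,
                Arrays.asList(sentences, tokens));
        return Collections.singletonList(pipeline);
    }

    /** Sums top-level durations only; sub-events are already included in their parent. */
    static int totalMs(List<TraceEvent> events) {
        int total = 0;
        for (TraceEvent e : events) {
            total += e.durationMs;
        }
        return total;
    }

    /** Flattens the event tree into indented report lines, one per event. */
    static List<String> reportLines(List<TraceEvent> events) {
        List<String> lines = new ArrayList<>();
        appendLines(events, 0, lines);
        return lines;
    }

    private static void appendLines(List<TraceEvent> events, int depth, List<String> lines) {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        for (TraceEvent e : events) {
            lines.add(indent.toString() + e.componentName
                    + " [" + e.type + "] " + e.durationMs + " ms");
            appendLines(e.subEvents, depth + 1, lines);
        }
    }

    public static void main(String[] args) {
        List<TraceEvent> events = sampleEvents();
        for (String line : reportLines(events)) {
            System.out.println(line);
        }
        System.out.println("Total: " + totalMs(events) + " ms");
    }
}
```

With a real ProcessTrace, the same walk would presumably go through 
getEvents() and getSubEvents() rather than these fields. ProcessTrace.toString() 
is also documented as producing a readable summary, which may already be 
enough for a coarse report.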

I hope that is a decent start,
Sean
________________________________________
From: Greg Silverman <g...@umn.edu.INVALID>
Sent: Saturday, January 23, 2021 3:01 PM
To: dev@ctakes.apache.org
Subject: Re: performance report [EXTERNAL]

Hi Peter,
I have no doubt that performance varies with note style and pipeline
components.

We're looking for a way to benchmark the standard/non-customized pipeline
performance for processing a largish set of identical notes using several
clinical NLP annotators (specifically, ctakes, biomedicus, metamap and
clamp). At the command line, both metamap and biomedicus output a standard
performance report with total timings and the details for each specific
pipeline component. I assume there is a way to enable the performance
report output available in the GUI version of ctakes at the command line -
which is what I'm really interested in.

We're fine with information at a very coarse level, since we're interested
in a particular note type, so the aforementioned report should be
sufficient. I'm just wondering how to enable it using the standard pipeline
in cTAKES.

Thanks!

Greg--

On Sat, Jan 23, 2021 at 12:26 PM Peter Abramowitsch <pabramowit...@gmail.com>
wrote:

> Hi Greg,
>
> I’ve found that there’s so much difference between note styles that have
> performance implications and so many interactions between pipeline
> configurations which affect overall performance, that really the only way
> to get a sense of performance is either on a very coarse level, measuring
> process time across large collections of varied notes, or very granular
> using something like jvisualvm.   Using the latter I saw some surprising
> things, some of which I was able to tackle with minor software changes,
> while others are deep in UIMA utilities used by cTAKES.  The biggest
> factor in my experience, after processing millions of notes, is note size:
> once notes reach about 5k AND are missing punctuation, there begins a
> geometric rise in the complexity of internal structures that depend on
> sentences and a serious elevation of processing time.
>
> Peter
>
> Sent from my iPad
>
> > On Jan 23, 2021, at 18:09, Greg Silverman <g...@umn.edu.invalid> wrote:
> >
> > I found this:
> > https://medium.com/@felix_chan/install-apache-ctakes-924c40967ce2
> > , which
> > states: "A performance report is generated when the process is done."
> >
> > However, we are running this from the command line and no such report is
> > being generated.
> >
> > Thanks!
> >
> >> On Sat, Jan 23, 2021 at 11:05 AM Greg Silverman <g...@umn.edu> wrote:
> >>
> >> Hi all,
> >> Is there a way to easily generate a performance report similar to the
> >> one generated by MetaMap (with timings for each task, etc.)?
> >>
> >> Thanks in advance!
> >>
> >> Greg--
> >>
> >> --
> >> Greg M. Silverman
> >> Senior Systems Developer
> >> NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> >> Department of Surgery
> >> University of Minnesota
> >> g...@umn.edu
> >>
> >>
> >
> > --
> > Greg M. Silverman
> > Senior Systems Developer
> > NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
> > Department of Surgery
> > University of Minnesota
> > g...@umn.edu
>

--
Greg M. Silverman
Senior Systems Developer
NLP/IE <https://healthinformatics.umn.edu/research/nlpie-group>
Department of Surgery
University of Minnesota
g...@umn.edu
