Hi Greg, Peter,

I believe that the performance report comes from a CollectionProcessingEngine (CPE):
https://uima.apache.org/d/uimaj-current/apidocs/org/apache/uima/collection/CollectionProcessingEngine.html
I think that UIMA's CPE GUI runs the pipeline through a CPE (hence the tool's name), but that may have changed in recent years.

The PipelineBuilder class in ctakes.core, which the PiperFileRunner uses, could be changed to run a single-threaded pipeline this way. Right now it uses a simpler uimaFIT method. The code changes are relatively minor, but obviously significant testing would be required. The ctakes PipelineBuilder does use a CPE for multi-threaded pipelines, so there has already been some testing on that front.

You can look at the ctakes PipelineBuilder run() method. If you get rid of the if (threadCount==1) {..} else { branch, then the CPE will always be used. Then just add a cpe.getPerformanceReport() call after cpe.process() and you should have a ProcessTrace object. This is where my guessing ends, as I have never used a ProcessTrace and don't know exactly what to ask of it.

I hope that is a decent start,
Sean

________________________________________
From: Greg Silverman <g...@umn.edu.INVALID>
Sent: Saturday, January 23, 2021 3:01 PM
To: dev@ctakes.apache.org
Subject: Re: performance report [EXTERNAL]

Hi Peter,

I have no doubt about performance differences caused by variance between note styles and pipeline components. We're looking for a way to benchmark the performance of the standard, non-customized pipeline when processing a largish set of identical notes with several clinical NLP annotators (specifically ctakes, biomedicus, metamap and clamp). At the command line, both metamap and biomedicus output a standard performance report with total timings and details for each pipeline component. I assume there is a way to enable, at the command line, the performance report that the GUI version of ctakes produces, which is what I'm really interested in. We're fine with information at a very coarse level, since we're interested in a particular note type, so the aforementioned report should be sufficient.
I'm just wondering how to enable it using the standard pipeline in cTAKES.

Thanks!

Greg--

On Sat, Jan 23, 2021 at 12:26 PM Peter Abramowitsch <pabramowit...@gmail.com> wrote:

> Hi Greg,
>
> I’ve found that there’s so much difference between note styles that have
> performance implications and so many interactions between pipeline
> configurations which affect overall performance, that really the only way
> to get a sense of performance is either on a very coarse level, measuring
> process time across large collections of varied notes, or very granular
> using something like jvisualvm. Using the latter I saw some surprising
> things, some of which I was able to tackle with minor software changes,
> while others are deep in UIMA utilities used by cTakes. The biggest
> factor, in my experience after processing millions of notes, is when notes
> have reached about 5k AND are missing punctuation. At around this size
> begins a geometric rise in complexity of internal structures that depend on
> sentences and a serious elevation of processing time.
>
> Peter
>
> Sent from my iPad
>
> > On Jan 23, 2021, at 18:09, Greg Silverman <g...@umn.edu.invalid> wrote:
> >
> > I found this:
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__medium.com_-40felix-5Fchan_install-2Dapache-2Dctakes-2D924c40967ce2&d=DwIFaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=uuvD9Z5PgR1KUWZ1Dc80V19dfKcr2DTrMuBxe2OCbMc&s=s-jUaTKHh4ts1f2UzY5nHsKbjA27HDpqAchBF36juTI&e=
> > , which
> > states: "A performance report is generated when the process is done."
> >
> > However, we are running this from the command line and no such report is
> > being generated.
> >
> > Thanks!
> >
> >> On Sat, Jan 23, 2021 at 11:05 AM Greg Silverman <g...@umn.edu> wrote:
> >>
> >> Hi all,
> >> Is there a way to easily generate a performance report similar to the one
> >> generated by MetaMap (with timings for each task, etc.)?
> >>
> >> Thanks in advance!
> >>
> >> Greg--
> >>
> >> --
> >> Greg M. Silverman
> >> Senior Systems Developer
> >> NLP/IE
> >> <https://urldefense.proofpoint.com/v2/url?u=https-3A__healthinformatics.umn.edu_research_nlpie-2Dgroup&d=DwIFaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=uuvD9Z5PgR1KUWZ1Dc80V19dfKcr2DTrMuBxe2OCbMc&s=5Kgux8IKOmsj2xjj7DxAhKZf6anK7HF3ddsOhnI1VFM&e=>
> >> Department of Surgery
> >> University of Minnesota
> >> g...@umn.edu
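
For the coarse-level measurement discussed in this thread (process time across a collection of notes, broken down by pipeline component), a minimal sketch might look like the following. It is not cTAKES or UIMA code: the class CoarseTimingReport and all of its methods are hypothetical names made up for illustration, and each pipeline stage is simplified to a String-to-String function, whereas real annotators operate on a CAS. It only shows the idea of accumulating wall-clock time per named stage and printing a total.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical helper, NOT part of cTAKES or UIMA: accumulates wall-clock
// time per named pipeline stage across all notes that are run through it.
public class CoarseTimingReport {
    // Insertion-ordered, so the report lists stages in first-seen order.
    private final Map<String, Long> nanosByStage = new LinkedHashMap<>();
    private int noteCount = 0;

    // Runs every stage on one note, in map order, timing each stage separately.
    public String runNote(Map<String, UnaryOperator<String>> stages, String note) {
        String text = note;
        for (Map.Entry<String, UnaryOperator<String>> stage : stages.entrySet()) {
            long start = System.nanoTime();
            text = stage.getValue().apply(text);
            // Add this stage's elapsed time to its running total.
            nanosByStage.merge(stage.getKey(), System.nanoTime() - start, Long::sum);
        }
        noteCount++;
        return text;
    }

    public int noteCount() {
        return noteCount;
    }

    // Accumulated nanoseconds for one stage, or 0 if the stage never ran.
    public long nanosFor(String stageName) {
        return nanosByStage.getOrDefault(stageName, 0L);
    }

    // Sum of the accumulated time of all stages.
    public long totalNanos() {
        long total = 0;
        for (long nanos : nanosByStage.values()) {
            total += nanos;
        }
        return total;
    }

    // One line per stage plus a total, with times in milliseconds.
    public String format() {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format("notes processed: %d%n", noteCount));
        for (Map.Entry<String, Long> e : nanosByStage.entrySet()) {
            sb.append(String.format("%-12s %10.3f ms%n", e.getKey(), e.getValue() / 1e6));
        }
        sb.append(String.format("%-12s %10.3f ms%n", "total", totalNanos() / 1e6));
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in stages; a real benchmark would wrap actual annotators.
        Map<String, UnaryOperator<String>> stages = new LinkedHashMap<>();
        stages.put("trim", String::trim);
        stages.put("lowercase", String::toLowerCase);

        CoarseTimingReport report = new CoarseTimingReport();
        for (String note : List.of("  Note ONE ", "  Note TWO ")) {
            report.runNote(stages, note);
        }
        System.out.print(report.format());
    }
}
```

Timings from a harness like this are only meaningful over a large number of notes and after JVM warm-up; for anything finer-grained, a profiler such as jvisualvm (mentioned above) is the better tool.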