gold standard annotations for Apache cTAKES sample notes

2014-12-05 Thread Savova, Guergana
Thanks to John Green, we now have sample clinical notes in cTAKES. Many thanks, John, for your effort! We will take these notes and start generating gold annotations against which cTAKES output can then be compared. We are planning to include annotations for: 1. Entities with

RE: revamping the Apache cTAKES website

2014-12-05 Thread Savova, Guergana
Wonderful, thank you, Michelle! There will be a flurry of emails the week of Dec 15 followed by actual work, so book your calendar if possible... --Guergana -Original Message- From: Michelle Chen [mailto:michelle1919c...@gmail.com] Sent: Friday, December 05, 2014 11:48 AM To:

Re: gold standard annotations for Apache cTAKES sample notes

2014-12-05 Thread andy mcmurry
This is great, thanks John Green! On Dec 5, 2014 8:32 AM, Savova, Guergana guergana.sav...@childrens.harvard.edu wrote: Thanks to John Green, we now have sample clinical notes in cTAKES. Many thanks, John, for your effort! We will take these notes and will start generating gold annotations

Scaling cTakes

2014-12-05 Thread Geise, Brandon D.
Hi, I'm new to cTakes and the UIMA framework. I've read most of the UIMA documentation and was able to take the BagofCUIGenerator example and modify it to read notes from a DB, process them using the UMLS AE in the clinical-pipeline with a local DB version of UMLS, and output the CUIs to a DB.

RE: Scaling cTakes

2014-12-05 Thread Finan, Sean
Hi Brandon, It sounds like you've got a decent pipeline set up. To increase the speed you could try swapping out use of ctakes-dictionary-lookup with ctakes-dictionary-lookup-fast in the AE. Check ctakes-clinical-pipeline/desc/[ae]/AggregatePlaintextFastUMLSProcessor.xml for an example.

RE: Scaling cTakes

2014-12-05 Thread Savova, Guergana
Hi Brandon, Our estimate of how long it takes to process a document is, I believe, under a second with the fast dictionary lookup. Sean can provide more details. --Guergana -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Friday, December 05, 2014 1:21

RE: Scaling cTakes

2014-12-05 Thread Geise, Brandon D.
Thanks Sean. I'll take a look and see if this speeds the pipeline up. Thanks, Brandon -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Friday, December 05, 2014 1:14 PM To: dev@ctakes.apache.org Subject: RE: Scaling cTakes Hi Brandon, It sounds

Re: Scaling cTakes

2014-12-05 Thread jay vyas
On a tangential note, we do have an example of running cTAKES in a massively parallel system like Spark/Hadoop: https://svn.apache.org/repos/asf/ctakes/sandbox/ctakes-spark-streaming-twitter/ If your problem is embarrassingly parallelizable, you can use MapReduce/Spark to distribute your app using

RE: Scaling cTakes

2014-12-05 Thread Geise, Brandon D.
Thanks Jay, I'll have to take a look at this too. -Original Message- From: jay vyas [mailto:jayunit100.apa...@gmail.com] Sent: Friday, December 05, 2014 2:40 PM To: dev@ctakes.apache.org Subject: Re: Scaling cTakes on a tangential note, we do have example of running ctakes in a
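
For readers following the Scaling cTakes thread, the "embarrassingly parallel" idea can be illustrated with a minimal Python sketch: each note is annotated independently, so the work can be split across workers and the results collected in input order. This is not cTAKES code and not the Spark example linked in the thread. The function names are hypothetical, and extract_cuis_stub is only a stand-in that picks out strings shaped like UMLS CUIs ("C" followed by seven digits) so that the sketch runs without cTAKES installed. In a real setup the annotate callable would presumably hand the text to a cTAKES pipeline instance.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple

# CUI-shaped token: "C" followed by seven digits (e.g. C0011849).
CUI_PATTERN = re.compile(r"\bC\d{7}\b")


def extract_cuis_stub(note_text: str) -> List[str]:
    """Hypothetical stand-in for a call into a cTAKES pipeline.

    It only finds CUI-shaped strings already present in the text;
    it performs no real concept extraction.
    """
    return CUI_PATTERN.findall(note_text)


def process_notes(
    notes: Iterable[Tuple[int, str]],
    annotate: Callable[[str], List[str]] = extract_cuis_stub,
    workers: int = 4,
) -> List[Tuple[int, List[str]]]:
    """Annotate each (note_id, text) pair independently, keeping input order."""
    note_list = list(notes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # No note depends on any other, so the work can be split freely.
        # Executor.map yields results in the order the inputs were given.
        results = pool.map(lambda pair: annotate(pair[1]), note_list)
        return [(note_id, cuis) for (note_id, _), cuis in zip(note_list, results)]


if __name__ == "__main__":
    sample = [(1, "C0011849 and C0020538"), (2, "no concepts here")]
    for note_id, cuis in process_notes(sample, workers=2):
        print(note_id, cuis)
```

Threads are used here on the assumption that each call mostly waits on an external pipeline or database. For CPU-bound work inside Python itself, a process pool or a cluster framework such as the Spark setup Jay mentions would be the more natural fit.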