Re: Looking for cTakes deployment strategies [EXTERNAL]

2019-01-16 Thread Miller, Timothy
Hi Anusha,
I've been working on a project that hasn't been merged into cTAKES yet, but it has a 
GitHub page:
https://github.com/tmills/ctakes-docker

It is a work in progress, so the documentation is not great, but I've used it to 
do exactly what you're asking about -- set up a cTAKES cluster on AWS to process 
millions of notes.

See the README for a general introduction and then take a look at the script 
bin/launch_cluster.sh

Tim


-Original Message-
From: Anusha Balasubramaniam
Reply-to: 
To: dev@ctakes.apache.org
Subject: Looking for cTakes deployment strategies [EXTERNAL]
Date: Wed, 16 Jan 2019 10:40:55 -0800


Hello everyone,

I am looking for a strategy to use cTakes to asynchronously process
thousands of clinical notes by listening to a queue on AWS and maintaining
a hot process with all the dictionaries loaded in memory. So far I've had
some success using the REST server wrapper I found here:
https://github.com/dirkweissenborn/ctakes-server, but it's still a
synchronous call, which I found hard to scale.
Are there any other wrappers out there that could be used to enable cTakes
to listen to a port for input? Can anyone share some strategies they used
to implement cTakes on AWS to achieve similar requirements?

Thanks and Regards,
Anusha
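
The setup asked about above (one long-lived process that loads its resources once and pulls notes from a queue instead of answering synchronous calls) can be pictured with the minimal Python sketch below. Everything in it is an assumption for illustration: `HotPipeline` is a hypothetical stand-in for a loaded cTAKES pipeline, and the standard-library `queue.Queue` stands in for an AWS queue. None of these names come from cTAKES, ctakes-server, or ctakes-docker.

```python
import queue
import threading


class HotPipeline:
    """Hypothetical stand-in for a pipeline whose dictionaries are loaded once."""

    def __init__(self):
        # The expensive one-time setup (loading dictionaries, models) would go here.
        self.loaded = True

    def process(self, note_text):
        # Placeholder "analysis": a real pipeline would return annotations.
        return {"length": len(note_text)}


def worker(pipeline, inbox, results):
    """Pull (note_id, text) items until a None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:  # sentinel: no more work
            inbox.task_done()
            break
        note_id, text = item
        results[note_id] = pipeline.process(text)
        inbox.task_done()


def run_demo(notes):
    """Feed a dict of notes through one hot worker and collect the results."""
    pipeline = HotPipeline()  # created once, reused for every note
    inbox = queue.Queue()     # stand-in for the external queue
    results = {}
    thread = threading.Thread(target=worker, args=(pipeline, inbox, results))
    thread.start()
    for note_id, text in notes.items():
        inbox.put((note_id, text))  # the producer does not wait for processing
    inbox.put(None)
    thread.join()
    return results
```

Scaling out in this picture would mean running more workers against the same queue, each holding its own loaded pipeline; whether that matches how ctakes-docker arranges things is not something this sketch claims.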



Re: Question about negation [EXTERNAL]

2019-01-16 Thread Miller, Timothy
No, SHARPn was a later project. I'm not sure if there is any overlap in the 
datasets.

There are 2 ways to look at the features, one is to read this paper:
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0112774

and another is to look at the source:
http://svn.apache.org/viewvc/ctakes/trunk/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/cleartk/AssertionCleartkAnalysisEngine.java?view=markup

Tim

-Original Message-
From: ouyeyu panyu
Reply-to: 
To: u...@ctakes.apache.org
Cc: dev@ctakes.apache.org
Subject: Re: Question about negation [EXTERNAL]
Date: Wed, 16 Jan 2019 08:09:06 -0800

Hi Timothy,

Thank you very much for the quick response.

https://pdfs.semanticscholar.org/8f2c/a8b638d216a3e9ec10cd1c21bdaeaa74a229.pdf
 says
The Mayo-derived linguistically annotated corpus (Mayo) was developed in-house 
and consisted of 273 clinical notes (100 650 tokens; 7299 sentences; 61 
consult; 1 discharge summary; 4 educational visit; 4 general medical 
examination; 48 limited exam; 19 multi-system evaluation; 43 miscellaneous; 1 
preoperative medical evaluation; 3 report; 3 specialty evaluation; 5 dismissal 
summary; 73 subsequent visit; 5 therapy; 3 test-oriented miscellaneous).

Is SHARPn based on the aforementioned 273 clinical notes?
Also, is there a way for me to look into the trained SVM model? For example, what 
features does it use, and what are their weights?

Best,
Yu Pan


On Wed, Jan 16, 2019 at 7:58 AM Miller, Timothy wrote:
It uses an SVM model. The training data is from a project called SHARPn; it consists of 
notes from Mayo Clinic with a variety of note types and specialties represented.

As for the example, is it a real example that someone wrote "Deny hepatitis"? 
That sounds more like a command than documentation of a negated concept 
("denies" or "denied" would seem more common?). Even if that is a real example, 
I think it's unusual enough that there are probably not examples of "Deny X" in 
the training data.

Tim


-Original Message-
From: ouyeyu panyu
Reply-to: u...@ctakes.apache.org
To: u...@ctakes.apache.org, 
dev@ctakes.apache.org
Subject: Question about negation [EXTERNAL]
Date: Wed, 16 Jan 2019 07:51:20 -0800

Hi ctakes dev team,

I have one question and hope someone can help me with it.
For negation, "Denies hepatitis" returns polarity=-1, but "Deny hepatitis" 
returns polarity=1.
It is said that cTAKES uses ClearTK’s PolarityCleartkAnalysisEngine for negation, 
which is machine learning based.
It seems this issue is caused by the training data. Is this true? Also, what is 
the training data, and which machine learning algorithm is used: logistic regression, 
SVM, random forest, or something else?
Thanks.



Re: Question about negation [EXTERNAL]

2019-01-16 Thread Miller, Timothy
It uses an SVM model. The training data is from a project called SHARPn; it consists of 
notes from Mayo Clinic with a variety of note types and specialties represented.

As for the example, is it a real example that someone wrote "Deny hepatitis"? 
That sounds more like a command than documentation of a negated concept 
("denies" or "denied" would seem more common?). Even if that is a real example, 
I think it's unusual enough that there are probably not examples of "Deny X" in 
the training data.

Tim
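
To make the point about training data concrete, here is a toy Python sketch of a linear scorer over context words. It is not the cTAKES model: the feature set, the weights, and the bias are all invented for illustration, and the real feature extraction is considerably richer. It only shows why a cue word that never appeared in training ("deny") contributes nothing to the score, so the prediction falls back to the default polarity of 1.

```python
# Hypothetical weights, as if learned from data containing "denies"/"denied"
# but no imperative "deny". Negative weights push toward polarity -1.
WEIGHTS = {"denies": -2.0, "denied": -2.0, "no": -1.5, "without": -1.5}
BIAS = 0.5  # invented value; leans toward polarity 1 when no cue is present


def features(context_tokens):
    """Bag-of-words features: the lowercased tokens around the concept."""
    return {tok.lower() for tok in context_tokens}


def polarity(context_tokens):
    """Return -1 (negated) if the linear score is negative, else 1."""
    score = BIAS + sum(WEIGHTS.get(f, 0.0) for f in features(context_tokens))
    return -1 if score < 0 else 1


def explain(context_tokens):
    """Show which features fired and their weights for a given context."""
    return {f: WEIGHTS[f] for f in features(context_tokens) if f in WEIGHTS}


print(polarity(["Denies", "hepatitis"]))  # -1: "denies" has a learned weight
print(polarity(["Deny", "hepatitis"]))    # 1: "deny" was never seen, weight 0
print(explain(["Deny", "hepatitis"]))     # {}: no known feature fired
```

In a linear model like this toy one, reading the weight table is also how you would inspect what the model learned.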


-Original Message-
From: ouyeyu panyu
Reply-to: 
To: u...@ctakes.apache.org, 
dev@ctakes.apache.org
Subject: Question about negation [EXTERNAL]
Date: Wed, 16 Jan 2019 07:51:20 -0800

Hi ctakes dev team,

I have one question and hope someone can help me with it.
For negation, "Denies hepatitis" returns polarity=-1, but "Deny hepatitis" 
returns polarity=1.
It is said that cTAKES uses ClearTK’s PolarityCleartkAnalysisEngine for negation, 
which is machine learning based.
It seems this issue is caused by the training data. Is this true? Also, what is 
the training data, and which machine learning algorithm is used: logistic regression, 
SVM, random forest, or something else?
Thanks.