No, SHARPn was a later project. I'm not sure if there is any overlap in the 
datasets.

There are two ways to look at the features. One is to read this paper:
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0112774

and another is to look at the source:
http://svn.apache.org/viewvc/ctakes/trunk/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/cleartk/AssertionCleartkAnalysisEngine.java?view=markup

Tim

-----Original Message-----
From: ouyeyu panyu <ouy...@gmail.com>
Reply-to: <u...@ctakes.apache.org>
To: u...@ctakes.apache.org
Cc: dev@ctakes.apache.org
Subject: Re: Question about negation [EXTERNAL]
Date: Wed, 16 Jan 2019 08:09:06 -0800

Hi Timothy,

Thank you very much for the quick response.

https://pdfs.semanticscholar.org/8f2c/a8b638d216a3e9ec10cd1c21bdaeaa74a229.pdf
says:
The Mayo-derived linguistically annotated corpus (Mayo) was developed in-house 
and consisted of 273 clinical notes (100 650 tokens; 7299 sentences; 61 
consult; 1 discharge summary; 4 educational visit; 4 general medical 
examination; 48 limited exam; 19 multi-system evaluation; 43 miscellaneous; 1 
preoperative medical evaluation; 3 report; 3 specialty evaluation; 5 dismissal 
summary; 73 subsequent visit; 5 therapy; 3 test-oriented miscellaneous).

Is SHARPn based on the aforementioned 273 clinical notes?
Also, is there a way for me to look into the trained SVM model, for example to 
see which features it uses and their weights?

Best,
Yu Pan


On Wed, Jan 16, 2019 at 7:58 AM Miller, Timothy 
<timothy.mil...@childrens.harvard.edu> wrote:
It uses an SVM model. The training data is from a project called SHARPn; it 
consists of notes from Mayo Clinic, with a variety of note types and 
specialties represented.

As for the example, is it a real example that someone wrote "Deny hepatitis"? 
That sounds more like a command than documentation of a negated concept 
("denies" or "denied" would seem more common?). Even if that is a real example, 
I think it's unusual enough that there are probably not examples of "Deny X" in 
the training data.
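
To illustrate the point about missing training examples, here is a minimal toy 
sketch in Python. It is not the cTAKES model: it trains a tiny perceptron (a 
stand-in for the SVM) on a made-up list of context words, using only the word 
before the concept as a feature. The training list, the function names, and 
the feature choice are all assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (word before the concept, polarity label).
# -1 = negated, 1 = affirmed. Note that "deny" never appears.
TRAIN = [
    ("denies", -1),
    ("denied", -1),
    ("no", -1),
    ("has", 1),
    ("reports", 1),
    ("with", 1),
]


def train_perceptron(examples, epochs=10):
    """Learn one weight per context word plus a bias (toy linear model)."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for word, label in examples:
            score = weights[word] + bias
            predicted = 1 if score >= 0 else -1
            if predicted != label:
                # Standard perceptron update on a misclassified example.
                weights[word] += label
                bias += label
    return weights, bias


def predict(weights, bias, word):
    """An unseen word has no learned weight, so only the bias decides."""
    score = weights.get(word, 0.0) + bias
    return 1 if score >= 0 else -1


weights, bias = train_perceptron(TRAIN)
print(predict(weights, bias, "denies"))  # -1: seen in training as negated
print(predict(weights, bias, "deny"))    # 1: unseen, falls back to the default
```

In this sketch "deny" contributes nothing to the score because no weight was 
ever learned for it, so the prediction falls back to the affirmed default. The 
real model uses a much richer feature set (see the paper and source linked 
elsewhere in this thread), but the same kind of gap could plausibly explain 
the behavior.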

Tim


-----Original Message-----
From: ouyeyu panyu <ouy...@gmail.com>
Reply-to: <u...@ctakes.apache.org>
To: u...@ctakes.apache.org, dev@ctakes.apache.org
Subject: Question about negation [EXTERNAL]
Date: Wed, 16 Jan 2019 07:51:20 -0800

Hi ctakes dev team,

I have one question and hope someone can help me with it.
For negation, "Denies hepatitis" returns polarity=-1, but "Deny hepatitis" 
returns polarity=1.
It is said that cTAKES uses ClearTK's PolarityCleartkAnalysisEngine for 
negation, which is machine learning based.
It seems this issue is caused by the training data. Is this true? And what is 
the training data, and what machine learning algorithm is used: logistic 
regression, SVM, random forest, or something else?
Thanks.
