No, SHARPn was a later project. I'm not sure if there is any overlap in the
datasets.
There are two ways to look at the features. One is to read this paper:
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0112774
and another is to look at the source:
http://svn.apache.org/viewvc/ctakes/trunk/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/cleartk/AssertionCleartkAnalysisEngine.java?view=markup
Tim
-----Original Message-----
From: ouyeyu panyu <ouy...@gmail.com>
Reply-to:
To: u...@ctakes.apache.org
Cc: dev@ctakes.apache.org
Subject: Re: Question about negation [EXTERNAL]
Date: Wed, 16 Jan 2019 08:09:06 -0800
Hi Timothy,
Thank you very much for the quick response.
https://pdfs.semanticscholar.org/8f2c/a8b638d216a3e9ec10cd1c21bdaeaa74a229.pdf
says
The Mayo-derived linguistically annotated corpus (Mayo) was developed in-house
and consisted of 273 clinical notes (100 650 tokens; 7299 sentences; 61
consult; 1 discharge summary; 4 educational visit; 4 general medical
examination; 48 limited exam; 19 multi-system evaluation; 43 miscellaneous; 1
preoperative medical evaluation; 3 report; 3 specialty evaluation; 5 dismissal
summary; 73 subsequent visit; 5 therapy; 3 test-oriented miscellaneous).
Is SHARPn based on the aforementioned 273 clinical notes?
Also, is there a way for me to look into the trained SVM model, for example to
see which features it uses and their weights?
Best,
Yu Pan
On Wed, Jan 16, 2019 at 7:58 AM Miller, Timothy <timothy.mil...@childrens.harvard.edu> wrote:
It uses an SVM model. The training data comes from a project called SHARPn; it
consists of notes from Mayo Clinic, with a variety of note types and specialties
represented.
As for the example, is it a real example that someone wrote "Deny hepatitis"?
That sounds more like a command than documentation of a negated concept
("denies" or "denied" would seem more common?). Even if that is a real example,
I think it's unusual enough that there are probably not examples of "Deny X" in
the training data.
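
To make the point about unseen words concrete, here is a minimal sketch in
Python. It is not cTAKES code and not the actual model: it assumes a toy
perceptron (swapped in for the SVM to keep the example short) over lowercased
bag-of-words features, trained on six made-up phrases. The training phrases,
the function names, and the choice to default to polarity=1 when the score is
zero are all assumptions for illustration only.

```python
# Toy linear classifier (perceptron) over bag-of-words features.
# Illustration only: made-up data, not the cTAKES/SHARPn model or features.
TRAIN = [
    ("denies hepatitis", -1),
    ("denied chest pain", -1),
    ("no fever", -1),
    ("has hepatitis", 1),
    ("reports chest pain", 1),
    ("fever present", 1),
]


def tokens(text):
    """Lowercase whitespace tokenization (hypothetical feature extractor)."""
    return text.lower().split()


def score(text, weights, bias):
    """Linear score; tokens never seen in training contribute 0."""
    return bias + sum(weights.get(tok, 0.0) for tok in tokens(text))


def predict(text, weights, bias):
    """Return 1 (affirmed) when there is no negative evidence, else -1."""
    return 1 if score(text, weights, bias) >= 0 else -1


def train(examples, epochs=10):
    """Standard perceptron updates on misclassified examples."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            if predict(text, weights, bias) != label:
                for tok in tokens(text):
                    weights[tok] = weights.get(tok, 0.0) + label
                bias += label
    return weights, bias


weights, bias = train(TRAIN)

for phrase in ("Denies hepatitis", "Deny hepatitis"):
    print(phrase, "->", predict(phrase, weights, bias))

# Inspect the learned weights, most negative (negation cues) first.
for tok, w in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{tok:10s} {w:+.1f}")
```

In this toy setup "denies" ends up with a negative weight, while "deny" has no
weight at all because it never appears in the training phrases, so "Deny
hepatitis" falls back to polarity=1. Listing the sorted weights is also the
general idea behind inspecting a linear model's features.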
Tim
-----Original Message-----
From: ouyeyu panyu <ouy...@gmail.com>
Reply-to: u...@ctakes.apache.org
To: u...@ctakes.apache.org, dev@ctakes.apache.org
Subject: Question about negation [EXTERNAL]
Date: Wed, 16 Jan 2019 07:51:20 -0800
Hi ctakes dev team,
I have one question, hope someone can help me with it.
For negation, "Denies hepatitis" returns polarity=-1, but "Deny hepatitis"
returns polarity=1.
It is said that cTAKES uses ClearTK's PolarityCleartkAnalysisEngine for negation,
which is machine-learning based.
It seems this issue is caused by the training data. Is this true? Also, what is
the training data, and which machine learning algorithm is used: logistic
regression, SVM, random forest, or something else?
Thanks.