Re: Parse Medical Research Papers [EXTERNAL]

Miller, Timothy Mon, 18 Jun 2018 04:51:10 -0700

To get predicate-argument structure, the best method is probably to use the SRL 
(Semantic Role Labeling) annotator, which is part of the 
ctakes-dependency-parser module. Check the desc/ directory in that module 
for some sample pipelines to see its dependencies. Once you have that running, 
look for the types:
org.apache.ctakes.typesystem.type.textsem.Predicate
org.apache.ctakes.typesystem.type.textsem.SemanticArgument
org.apache.ctakes.typesystem.type.textsem.SemanticRoleRelation

in the CVD (CAS Visual Debugger) to get a feel for how predicate arguments are
represented in the CAS.
If you are not familiar with SRL, try this demo:
http://cogcomp.org/page/demo_view/SRL
and these slides (specifically the PropBank sections, since that is the style cTAKES uses):
https://nlp.stanford.edu/kristina/papers/SRL-Tutorial-post-HLT-NAACL-06.pdf
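
As a rough sketch of how PropBank-style output could be turned into the
Subject, Predicate, Object tuples asked about below: the classes here are
hypothetical stand-ins that only mirror the idea of a predicate with labeled
arguments. They are not the cTAKES types or API, and the label strings "ARG0"
(agent-like) and "ARG1" (patient-like) are assumptions; the labels the
annotator actually writes into the CAS may be spelled differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-ins for SRL output; NOT the real cTAKES types. */
final class SrlTriples {

    /** One labeled argument of a predicate, e.g. label "ARG0", text "A research team". */
    static final class Argument {
        final String label;
        final String text;

        Argument(String label, String text) {
            this.label = label;
            this.text = text;
        }
    }

    /** A predicate (usually a verb) together with its labeled arguments. */
    static final class Predicate {
        final String text;
        final List<Argument> arguments;

        Predicate(String text, List<Argument> arguments) {
            this.text = text;
            this.arguments = arguments;
        }
    }

    /** Returns the text of the first argument with the given label, or null if there is none. */
    static String firstWithLabel(Predicate predicate, String label) {
        for (Argument argument : predicate.arguments) {
            if (argument.label.equals(label)) {
                return argument.text;
            }
        }
        return null;
    }

    /** Builds "subject | predicate | object" strings, skipping predicates that lack ARG0 or ARG1. */
    static List<String> toTriples(List<Predicate> predicates) {
        List<String> triples = new ArrayList<>();
        for (Predicate predicate : predicates) {
            String subject = firstWithLabel(predicate, "ARG0"); // assumed label for the agent
            String object = firstWithLabel(predicate, "ARG1");  // assumed label for the patient
            if (subject != null && object != null) {
                triples.add(subject + " | " + predicate.text + " | " + object);
            }
        }
        return triples;
    }

    public static void main(String[] args) {
        // Hand-built example; a real pipeline would read these from the CAS instead.
        Predicate developed = new Predicate("developed", Arrays.asList(
                new Argument("ARG0", "A research team"),
                new Argument("ARG1", "a prototype")));
        for (String triple : toTriples(Arrays.asList(developed))) {
            System.out.println(triple);
        }
    }
}
```

Note that a passive sentence such as "The bionic heart has been transplanted
into animals" would typically carry only a patient-like argument and no agent,
so a sketch like this one would skip it rather than invent a subject.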

I believe Stanford NLP has a module to do this too, but of course it is not 
trained on clinical data and does not use the augmented set of verb senses that 
the PropBank team created for the clinical domain.

Tim


________________________________________
From: Don Flinn <fl...@alum.mit.edu>
Sent: Monday, June 18, 2018 5:40 AM
To: dev@ctakes.apache.org
Subject: Parse Medical Research Papers [EXTERNAL]

I want to parse medical research papers and am looking at using cTAKES.  I
do realize that cTAKES is aimed at clinical reports, but I would like to see
if I can use it for my purposes.  I'm initially looking to get a tuple of
Subject, Predicate, Object for each sentence, and later additional semantic
information.

I modified ClinicalPipelineFactory.java to use the following portion of a
research report -

"A research team based in Houston has developed a prototype for a
“bionic” heart replacement. Other designs all mimic the beating of
a heart, but due to many moving parts, the mechanical hearts
would quickly wear out. The heart developed by BiVACOR does not
beat, and instead has one moving part which propels the blood
throughout the body. The bionic heart has been safely and
successfully transplanted into animals leading to very promising
results."

I got the following result -
Entity: heart === Polarity: 1 === Uncertain? false === Subject: patient ===
Generic? false === Conditional? false === History? false
Entity: replacement === Polarity: 1 === Uncertain? false === Subject:
patient === Generic? false === Conditional? false === History? false
Entity: mimic === Polarity: 0 === Uncertain? false === Subject: null ===
Generic? false === Conditional? false === History? false
Entity: heart === Polarity: 1 === Uncertain? false === Subject: patient ===
Generic? false === Conditional? false === History? false
Entity: heart === Polarity: 1 === Uncertain? false === Subject: patient ===
Generic? false === Conditional? false === History? false
Entity: heart === Polarity: 1 === Uncertain? false === Subject: patient ===
Generic? false === Conditional? false === History? false

I assume my problem is related to the SNOMED database, which is not built
for what I want.

My questions -
Is my assumption correct?
Should I attempt to modify/extend SNOMED?
Is there a better/different way to query SNOMED to meet my needs?
Is there an existing database that I could use with cTAKES that would better
meet my needs?
Should I instead use the Stanford Java NLP system or Apache OpenNLP?
I'll still need a database.

Thank you for any suggestions
Don
