Also have a look at Shangridocs:
http://github.com/chrismattmann/shangridocs/

There are a lot of places we could take that software…

Cheers,
Chris

From: andy mcmurry <mcmurry.a...@gmail.com>
Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
Date: Monday, June 18, 2018 at 12:18 PM
To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
Subject: Re: Parse Medical Research Papers

UMLS SemRep and Semantic Medline are probably better resources for parsing papers with respect to Subject, Predicate, Object.

https://semrep.nlm.nih.gov/
https://skr3.nlm.nih.gov/SemMed

On Mon, Jun 18, 2018 at 2:40 AM, Don Flinn <fl...@alum.mit.edu> wrote:

I want to parse medical research papers and am looking at using Ctakes. I do realize that Ctakes is aimed at Clinical Reports, but I would like to see if I can use it for my purposes. I'm initially looking to get a tuple of Subject, Predicate, Object for each sentence, and later additional semantic information.

I modified ClinicalPipelineFactory.java to use the following portion of a research report -

"A research team based in Houston has developed a prototype for a “bionic” heart replacement. Other designs all mimic the beating of a heart, but due to many moving parts, the mechanical hearts would quickly wear out. The heart developed by BiVACOR does not beat, and instead has one moving part which propels the blood throughout the body. The bionic heart has been safely and successfully transplanted into animals leading to very promising results."

I got the following result -

Entity: heart === Polarity: 1 === Uncertain? false === Subject: patient === Generic? false === Conditional? false === History? false
Entity: replacement === Polarity: 1 === Uncertain? false === Subject: patient === Generic? false === Conditional? false === History? false
Entity: mimic === Polarity: 0 === Uncertain? false === Subject: null === Generic? false === Conditional? false === History? false
Entity: heart === Polarity: 1 === Uncertain? false === Subject: patient === Generic? false === Conditional? false === History? false
Entity: heart === Polarity: 1 === Uncertain? false === Subject: patient === Generic? false === Conditional? false === History? false
Entity: heart === Polarity: 1 === Uncertain? false === Subject: patient === Generic? false === Conditional? false === History? false

I assume my problem is related to the Snomed database, which is not trained for what I want.

My questions -

Is my assumption correct?
Should I attempt to modify/extend Snomed?
Is there a better/different way to query Snomed to meet my needs?
Is there an existing database that I could use with Ctakes that would better meet my needs?
Should I instead use the Stanford Java NLP system or the Apache OpenNLP? I'll still need a database.

Thank you for any suggestions.

Don
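
For anyone who wants to tally the attributes in a result like the one quoted above, here is a minimal sketch in plain Java. It assumes each record sits on its own line in the "Name: value === Name? value" layout shown in the post. The class ResultLineParser and its methods are hypothetical names made up for illustration and are not part of Ctakes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Ctakes class): parses the printed records
// quoted in this thread into name/value maps.
public class ResultLineParser {

    // Splits one record, e.g. "Entity: heart === Polarity: 1 === Uncertain? false",
    // into an ordered map of field name to value.
    static Map<String, String> parse(String record) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String part : record.split("===")) {
            String trimmed = part.trim();
            // Field names end in ':' or '?' ("Polarity:", "Uncertain?").
            int sep = -1;
            for (int i = 0; i < trimmed.length(); i++) {
                char c = trimmed.charAt(i);
                if (c == ':' || c == '?') {
                    sep = i;
                    break;
                }
            }
            if (sep < 0) {
                continue; // skip fragments that do not look like a field
            }
            String name = trimmed.substring(0, sep).trim();
            String value = trimmed.substring(sep + 1).trim();
            fields.put(name, value);
        }
        return fields;
    }

    // Counts the records whose "Subject" field equals the given value.
    static long countWithSubject(List<Map<String, String>> records, String subject) {
        return records.stream()
                .filter(r -> subject.equals(r.get("Subject")))
                .count();
    }

    public static void main(String[] args) {
        // Three records copied from the result in the original post.
        String[] lines = {
            "Entity: heart === Polarity: 1 === Uncertain? false === Subject: patient === Generic? false === Conditional? false === History? false",
            "Entity: replacement === Polarity: 1 === Uncertain? false === Subject: patient === Generic? false === Conditional? false === History? false",
            "Entity: mimic === Polarity: 0 === Uncertain? false === Subject: null === Generic? false === Conditional? false === History? false"
        };
        List<Map<String, String>> records = new ArrayList<>();
        for (String line : lines) {
            records.add(parse(line));
        }
        System.out.println("Records with Subject=patient: "
                + countWithSubject(records, "patient"));
    }
}
```

Note that the "Subject" field in these records describes who the mention is attributed to (here mostly "patient"), not the grammatical subject of the sentence, so a tally like this does not by itself yield Subject, Predicate, Object tuples.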