FWIW, I started taking a look at the patch. (It's in code that I'm not that familiar with, so a quick glance isn't sufficient for me.) I searched the UMLS for m2 in the terminologies commonly used by cTAKES to see whether adding m2 as a unit could result in marking something as a measurement when it's not, and I did find many UMLS terms that contain m2. Plenty of other measurement abbreviations also appear within other terms, so it's not a showstopper, but it is a consideration.
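To make the false-positive concern concrete, here is a minimal sketch in Java of the kind of context check I have in mind. It is not the code from the patch or from MeasurementFSM.java: the class name, the method name and the unit lists are all made up for illustration, and the input is assumed to be already tokenized the way the HTML output further down this thread shows it ("20 mg / m2"). The idea is that m2 is accepted as a unit only when it follows a number, a mass unit and a slash, so a bare "M2" inside some other term is left alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AreaUnitSketch {
    // Hypothetical unit lists for illustration only; the real FSM keeps its own sets.
    private static final Set<String> MASS_UNITS =
            new HashSet<>(Arrays.asList("mg", "g", "mcg", "kg"));
    private static final Set<String> AREA_UNITS =
            new HashSet<>(Arrays.asList("m2"));

    /** Returns every "number massUnit / areaUnit" sequence in a pre-tokenized sentence. */
    public static List<String> findMeasurements(List<String> tokens) {
        List<String> found = new ArrayList<>();
        // A match needs four consecutive tokens, so stop three short of the end.
        for (int i = 0; i + 3 < tokens.size(); i++) {
            boolean number = tokens.get(i).matches("\\d+(\\.\\d+)?");
            boolean mass = MASS_UNITS.contains(tokens.get(i + 1).toLowerCase());
            boolean slash = tokens.get(i + 2).equals("/");
            boolean area = AREA_UNITS.contains(tokens.get(i + 3).toLowerCase());
            if (number && mass && slash && area) {
                found.add(String.join(" ", tokens.subList(i, i + 4)));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Dose from the example sentence: matched as one measurement.
        System.out.println(findMeasurements(Arrays.asList(
                "Epirubicin", ",", "20", "mg", "/", "m2", "(", "days", "1", ")")));
        // A bare "M2" with no number, mass unit or slash before it: no match.
        System.out.println(findMeasurements(Arrays.asList(
                "protein", "M2", "binds", "the", "receptor")));
    }
}
```

Whether the patch's two-token check gives the same protection is exactly what I still need to test; the sketch only shows why requiring the surrounding tokens matters.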
I haven't tested the patch yet to see whether the way it is implemented - checking for 2 tokens - avoids that issue. I'm not sure I'll get a chance to look more this week. If you end up picking it up, Sean, at least you know what I've done.

-- James

On Tue, Oct 3, 2017 at 12:25 PM, Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:
> Hi Gandhi, > > Ctakes is a purely volunteer effort, so there are never any guarantees ... > If nobody looks at the value and unit jira and patch this week then I will > try to get to it asap. > > Thanks for letting us use your example note! > > Sean > > -----Original Message----- > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.com] > Sent: Tuesday, October 03, 2017 12:21 PM > To: dev@ctakes.apache.org > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL] > [SUSPICIOUS] > > Hi Sean, > > > > Will this JIRA issue - https://urldefense.proofpoint. > com/v2/url?u=https-3A__issues.apache.org_jira_browse_CTAKES- > 2D459&d=DwIGaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r= > fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=EPRi2YznX0T5F4yYV0y2OmCxU0Q_ > Gx24B_omGRWF8kg&s=fhwLqbd8Tgg6z-jFe9Z7t0baNz2YgNwM-SCSeTnrZes&e= be > looked up by someone as Tim mentioned? > > > > The paragraph we sent earlier can be in the example notes provided the > protocol number is masked/modified. > > > > Regards, > > Gandhi > > > > > > -----Original Message----- > > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] > > Sent: Tuesday, October 03, 2017 7:27 PM > > To: dev@ctakes.apache.org > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL] > [SUSPICIOUS] > > > > Hi Gandhi, > > > > Thank you for asking. There is no action item for you concerning the > coreference output that you see. 
However, if you would like to help the > community understand how the module works (input and output), maybe you > could do something like run the pipeline on your original sentence, then > that sentence plus another (before), then that sentence plus another > (after) ... and see how the output changes with the input. If you take > screenshots or something then we could put them on the wiki. Also, would > you mind if the paragraph you sent became one of the example notes in > ctakes? That means that it would be redistributed with the code. > > > > Sean > > > > -----Original Message----- > > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.com] > > Sent: Tuesday, October 03, 2017 4:26 AM > > To: dev@ctakes.apache.org > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL] > [SUSPICIOUS] > > > > Hi Tim/Sean, > > > > > > > > Is this an action item for us? If yes, could someone give us some valid > inputs to test it? Is someone else going to review this again? > > > > > > > > Regards, > > > > Gandhi > > > > > > > > > > > > -----Original Message----- > > > > From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] > > > > Sent: Monday, October 02, 2017 8:06 PM > > > > To: dev@ctakes.apache.org > > > > Subject: Re: Enabling drugner pipeline and identifying dates [EXTERNAL] > [SUSPICIOUS] > > > > > > > > My bad, I didn't read too closely and thought this was going to be a > coreference patch. I don't know this FSM code that well, so I am not an > expert. My biggest concern at a glance is that, while these additions help find > more true positives (as in your examples), can we verify that they won't > create false positives? > > > > Tim > > > > > > > > > > > > On Fri, 2017-09-29 at 06:25 +0000, Gandhi Rajan Natarajan wrote: > > > > > Hi Sean, > > > > > > > > > > Thanks again for the response. I guess it's a mistake on my side that I > > > > > didn't send the complete text. 
Did you mean that with the text I sent, > > > > > the co-reference superscript-1 will be lost? > > > > > > > > > > Also as per your advice, We have created an issue - > https://urldefense.proofpoint.com/v2/url?u=https-3A__urldefen&d=DwIGaQ&c= > qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r= > fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m= > sGlpzaOnKKPgjhHkkpfELXpFFGvJtj1Ib-9t3JrGbpQ&s= > STDKsvR9fK6KZuwRjRT3q1gZI8T7ptaKlVWVumKi5dc&e= > > > > > se.proofpoint.com/v2/url?u=https- > > > > > 3A__issues.apache.org_jira_browse_CTAKES- > > > > > 2D459&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=Heup- > > > > > IbsIg9Q1TPOylpP9FE4GTK- > > > > > OqdTDRRNQXipowRLRjx0ibQrHEo8uYx6674h&m=0kLxqu0Xu_2pjzCrVwxC4cd_1ubh_g > > > > > nqCIxz6hOzUUQ&s=Tihsi1dyNHsqsYbwyClGANfqk2Ov2nfQL2YuIV1L0CI&e= for > > > > > measurement FSM changes and attached the modified file changes. Could > > > > > someone have a look and know your thoughts please? > > > > > > > > > > Regards, > > > > > Gandhi > > > > > > > > > > > > > > > -----Original Message----- > > > > > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] > > > > > Sent: Thursday, September 28, 2017 8:21 PM > > > > > To: dev@ctakes.apache.org > > > > > Cc: Miller, Timothy <timothy.mil...@childrens.harvard.edu> > > > > > Subject: RE: Enabling drugner pipeline and identifying dates > > > > > [EXTERNAL] [SUSPICIOUS] > > > > > > > > > > Hi Gandhi, > > > > > > > > > > I don't recall you sending me that entire snippet of text. I think > > > > > that I only had your single example sentence. > > > > > You have discovered one of the quirks of software: "change the data, > > > > > change the result." > > > > > Ctakes is a system with many moving parts. Things that precede or > > > > > follow your original example sentence will change the evaluation of > > > > > that sentence. 
> > > > > With the pipeline you are using and the full note, you should see a > > > > > number (mine is 4) next to the first "thalomid" in the original > > > > > example sentence. If you click that number you should see (to the > > > > > right) 4 instances of "thalomid". > > > > > Tim can correct me here, but maybe the coreference module ranked the > > > > > links between "thalomid" as much higher than the rank between "study > > > > > treatment of thalomid 200mg" and "the treatment of hepatocellular > > > > > carcinoma" and discarded the encapsulating treatment texts from > > > > > markables? It is probably more complex than that. > > > > > > > > > > > > > > > > > we have also made some code changes in MeasurementFSM.java to > > > > > > identify certain measurements like '20 mg/m2' which was not > > > > > > identified out of the box. Should we send the code changes to you > > > > > > so that you can consider the same to be productized ? Please > > > > > > advise." > > > > > I don't know if you've noticed the recent emails on the dev list > > > > > involving Alexandru Zbarcea. Alex has been creating or commenting on > > > > > Jira items and attaching code for fixes and enhancements. This is a > > > > > widely used process and is fairly easy to follow. 
I think that the > > > > > following links are relevant: > > > > > Working with issues: https://urldefense.proofpoint.com/v2/url?u=http > > > > > s-3A__confluence.atlassian.com_jiracoreserver073_working-2Dwith- > > > > > 2Dissues- > > > > > 2D861257307.html&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxe > > > > > FU&r=Heup-IbsIg9Q1TPOylpP9FE4GTK- > > > > > OqdTDRRNQXipowRLRjx0ibQrHEo8uYx6674h&m=0kLxqu0Xu_2pjzCrVwxC4cd_1ubh_g > > > > > nqCIxz6hOzUUQ&s=Fo-LGlsEfYJpgYcWvrDmor0B3YGxx5brZLelntVMxrU&e= > > > > > Creating patches: https://urldefense.proofpoint.com/v2/url?u=https- > > > > > 3A__confluence.atlassian.com_crucible_creating-2Dpatch-2Dfiles-2Dfor- > > > > > 2Dpre-2Dcommit-2Dreviews- > > > > > 2D298977458.html&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxe > > > > > FU&r=Heup-IbsIg9Q1TPOylpP9FE4GTK- > > > > > OqdTDRRNQXipowRLRjx0ibQrHEo8uYx6674h&m=0kLxqu0Xu_2pjzCrVwxC4cd_1ubh_g > > > > > nqCIxz6hOzUUQ&s=wVhEQCU73iEplHm34bO2AtgaDUpjAvrFe4GFx5b6pYo&e= > > > > > Attaching files: https://urldefense.proofpoint.com/v2/url?u=https-3 > > > > > A__confluence.atlassian.com_jiracorecloud_attaching-2Dfiles-2Dand- > > > > > 2Dscreenshots-2Dto-2Dissues- > > > > > 2D765593805.html&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxe > > > > > FU&r=Heup-IbsIg9Q1TPOylpP9FE4GTK- > > > > > OqdTDRRNQXipowRLRjx0ibQrHEo8uYx6674h&m=0kLxqu0Xu_2pjzCrVwxC4cd_1ubh_g > > > > > nqCIxz6hOzUUQ&s=eO_HZCkkeOg8jF3CMYnMxttXRHSM16qdwPl5nTW48zQ&e= > > > > > > > > > > I don't know if you have a jira account and permissions for the ctakes > > > > > project. An administrator may need to set that up for you. 
> > > > > > > > > > Thanks, > > > > > Sean > > > > > > > > > > -----Original Message----- > > > > > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.com] > > > > > Sent: Thursday, September 28, 2017 4:09 AM > > > > > To: dev@ctakes.apache.org > > > > > Subject: RE: Enabling drugner pipeline and identifying dates > > > > > [EXTERNAL] [SUSPICIOUS] > > > > > > > > > > Hi Sean, > > > > > > > > > > Thanks for the response. I was able to see the co-reference > > > > > superscript using the html file that you sent. Interestingly even I > > > > > was able to generate the sample HTML using piper GUI by having only > > > > > that single line - " The patient started study treatment of Thalomid > > > > > 200mg (days 1-21), and Epirubicin, 20 mg/m2 (days 1, 8, and 15) on > > > > > 06/07/02 for the treatment of hepatocellular carcinoma. " in the input > > > > > file. > > > > > > > > > > But when I change the input file content with the following lines: > > > > > > > > > > "This patient is participating in a Non-IND study; Protocol CG- > > > > > 000424: "Phase I/II of Thalidomide and Epirubicin in Patients with > > > > > Unresectable or Metastatic Hepatocellular Carcinoma".Information has > > > > > been received from the investigator regarding an 82 year-old male > > > > > patient who had gastrointestinal bleeding while on Thalomid, > > > > > Epirubicin, and Coumadin. He had a past medical history of > > > > > diverticulosis in 03/02 and a right atrial clot from intraventricular > > > > > catheter (IVC) for which he was started on Coumadin. During the > > > > > hospitalization for a right atrial clot in 03/02 hepatocellular > > > > > carcinoma was first noted and he was referred to an oncologist. The > > > > > patient started study treatment of Thalomid 200mg (days 1-21), and > > > > > Epirubicin, 20 mg/m2 (days 1, 8, and 15) on 06/07/02 for the treatment > > > > > of hepatocellular carcinoma. 
He was concomitantly receiving Cardura, > > > > > Ambien (for insomnia), Megace, Coumadin, and Oxycodone. This patient > > > > > presented to the emergency room with the chief complaint of > > > > > hematochezia. He reported noticing bright red blood and small clots > > > > > mixed in with his stool. On 07/13/02, he was admitted due to > > > > > gastrointestinal bleed. The physician ordered 2 large bore > > > > > intravenous lines and planned to transfuse for hematocrit less than > > > > > 30%. Due to the INR (international normalized ratio) level of 3.0, > > > > > Coumadin was held. He was also noted to have bilateral lower extremity > > > > > edema with dyspnea on exertion. On 07/13/02, he had a chest X-ray PA > > > > > and lateral done that showed no evidence of acute pneumonia or > > > > > congestive heart failure. On 07/14/02, he underwent an ultrasound > > > > > which was negative for deep vein thrombosis. This patient did not take > > > > > Thalomid on the day of his admittance to the hospital, but resumed > > > > > treatment shortly after with no return of symptoms. On 07/15/02, he > > > > > was discharged in stable condition. There have been no further reports > > > > > of bleeding at this time. Thedoctor has assessed the hematochezia as > > > > > related to Coumadin treatment and previously diagnosed diverticulosis, > > > > > and not to protocol therapy with Thalomid and Epirubicin.Additional > > > > > information received from the investigator on 27Aug02 reveals that > > > > > this male patient began on 07Jun02 two cycles of therapy with > > > > > Thalidomide and Epirubicin. His post cycle two computed tomography > > > > > scans revealed increase in size of liver lesion with development of > > > > > multiple new satellite nodules. On 29Jul02, the investigator removed > > > > > this patient from protocol for progressive disease and recommended > > > > > hospice care. 
After seeking a second opinion from two other > > > > > institutions, this patient was admitted to hospice on 05Aug02. On > > > > > 20Aug02, the investigator noted that this patient was suffering > > > > > worsening fatigue and got tired getting out of his chair. On 25Aug02, > > > > > this patient died due to disease progression. The investigator > > > > > assessed the death as not related to study treatment and expected" > > > > > > > > > > The co-reference superscript is lost by then. Did you try the > > > > > complete text above in your piper GUI, by any chance? Also I guess you > > > > > did not notice the question in my last post - " Sean, we have also > > > > > made some code changes in MeasurementFSM.java to identify certain > > > > > measurements like '20 mg/m2' which was not identified out of the box. > > > > > Should we send the code changes to you so that you can consider the > > > > > same to be productized ? Please advise." > > > > > > > > > > > > > > > Regards, > > > > > Gandhi > > > > > > > > > > > > > > > -----Original Message----- > > > > > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] > > > > > Sent: Wednesday, September 27, 2017 5:53 PM > > > > > To: dev@ctakes.apache.org > > > > > Subject: RE: Enabling drugner pipeline and identifying dates > > > > > [EXTERNAL] [SUSPICIOUS] > > > > > > > > > > Hi Gandhi, > > > > > > > > > > I am glad that you are feeling better. > > > > > I don't understand why you aren't getting the same output as me. I > > > > > just ran your example sentence with your piper with a fresh checkout > > > > > and get the html below. The css follows. Copy and paste into a file > > > > > and see if you see the corefs. 
> > > > > > > > > > ///////////////////////////////////////////////////// html, copy into > > > > > file ///////////////////////////////////////////////// > > > > > > > > > > <!DOCTYPE html> > > > > > <html> > > > > > <head> > > > > > <title>OneLiner Output</title> > > > > > </head> > > > > > <body> > > > > > <link rel="stylesheet" href="ctakes.pretty.css" type="text/css" > > > > > media="screen"> <h2>OneLiner</h2> <i>Text processing finished on: 9 > > > > > 27 2017, 08:15:31</i> <hr> > > > > > > > > > > <div id="content"> > > > > > > > > > > <p> > > > > > The patient <span class="AFF_" > > > > > onClick="iaf('AFF_NL_EVTNL_startedNL_SPC_[before] doc timeNL_NL_')" > > > > > TIP="Event ">started</span> study <span class="AFF_" > > > > > onClick="iaf('AFF_NL_EVTNL_treatmentNL_SPC_[before] doc > > > > > timeNL_NL_PRCNL_treatmentNL_SPC_C0087111NL_SPC_[Therapeutic > > > > > procedure]NL_SPC_[before] doc timeNL_NL_')" TIP="Event Procedure > > > > > ">treatment</span><span class="PRC"><sup>•</sup></span> of <span > > > > > class="AFF_" > > > > > onClick="iaf('AFF_NL_DRGNL_ThalomidNL_SPC_C0723668NL_SPC_[before] doc > > > > > timeNL_NL_')" TIP="Drug ">Thalomid</span><span > > > > > class="DRG"><sup>•</sup></span> <span class="AFF_" > > > > > onClick="iaf('AFF_NL_EVTNL_200mgNL_SPC_[before] doc timeNL_NL_')" > > > > > TIP="Event ">200mg</span><span class="UNK" > > > > > onClick="crf1()"><sup>1</sup></span> ( <span class="GNR_" > > > > > onClick="iaf('GNR_NL_TMXNL_daysNL_NL_')" TIP="Time ">days</span> 1 - > > > > > 21 ) , and <span class="AFF_" > > > > > onClick="iaf('AFF_NL_DRGNL_EpirubicinNL_SPC_C0014582NL_SPC_[before] > > > > > doc timeNL_NL_')" TIP="Drug ">Epirubicin</span><span > > > > > class="DRG"><sup>•</sup></span> , 20 mg / m2 ( <span class="GNR_" > > > > > onClick="iaf('GNR_NL_TMXNL_days 1 , 8NL_NL_')" TIP="Time ">days 1 , > > > > > 8</span> , and 15 ) on <span class="GNR_" > > > > > onClick="iaf('GNR_NL_TMXNL_06 / 07 / 02NL_SPC_[CONTAINS] > > > > > 
treatmentNL_NL_')" TIP="Time ">06 / 07 / 02</span> for the <span > > > > > class="AFF_" onClick="iaf('AFF_NL_EVTNL_treatmentNL_SPC_[before] doc > > > > > timeNL_SPC_06 / 07 / 02 > > > > > [CONTAINS]NL_NL_PRCNL_treatmentNL_SPC_C0087111NL_SPC_[Therapeutic > > > > > procedure]NL_SPC_[before] doc timeNL_NL_')" TIP="Event Procedure > > > > > ">treatment</span><span class="PRC"><sup>•</sup></span> of <span > > > > > class="AFF_" onClick="iaf('AFF_NL_DISNL_hepatocellular > > > > > carcinomaNL_SPC_C2239176NL_SPC_[Liver carcinoma]NL_SPC_[before] doc > > > > > timeNL_NL_')" TIP="Disorder ">hepatocellular </span><span class="AFF_" > > > > > onClick="iaf('AFF_NL_DISNL_hepatocellular > > > > > carcinomaNL_SPC_C2239176NL_SPC_[Liver carcinoma]NL_SPC_[before] doc > > > > > timeNL_NL_EVTNL_carcinomaNL_SPC_[before] doc timeNL_NL_')" > > > > > TIP="Disorder Event ">carcinoma</span><span class="DIS" > > > > > onClick="crf1()"><sup>1</sup></span> . > > > > > <br> > > > > > > > > > > </p> > > > > > > > > > > </div> > > > > > > > > > > <div id="ia"> Annotation Information </div> <script > > > > > type="text/javascript"> > > > > > function iaf(txt) { > > > > > var aff=txt.replace( /AFF_/g,"<br><h3>Affirmed</h3>" ); > > > > > var neg=aff.replace( /NEG_/g,"<br><h3>Negated</h3>" ); > > > > > var unc=neg.replace( /UNC_/g,"<br><h3>Uncertain</h3>" ); > > > > > var unn=unc.replace( /UNN_/g,"<br><h3>Uncertain, Negated</h3>" ); > > > > > var ant=unn.replace( /ANT/g,"<b>Anatomical Site</b>" ); > > > > > var dis=ant.replace( /DIS/g,"<b>Disease/ Disorder</b>" ); > > > > > var fnd=dis.replace( /FND/g,"<b>Sign/ Symptom</b>" ); > > > > > var prc=fnd.replace( /PRC/g,"<b>Procedure</b>" ); > > > > > var drg=prc.replace( /DRG/g,"<b>Medication</b>" ); > > > > > var evt=drg.replace( /EVT/g,"<b>Event</b>" ); > > > > > var tmx=evt.replace( /TMX/g,"<b>Time</b>" ); > > > > > var unk=tmx.replace( /UNK/g,"<b>Unknown</b>" ); > > > > > var spc=unk.replace( > > > > > /SPC_/g," " ); > > > > > var prf1=spc.replace( 
/\[/g,"<i>" ); > > > > > var prf2=prf1.replace( /\]/g,"</i>" ); > > > > > var nl=prf2.replace( /NL_/g,"<br>" ); > > > > > document.getElementById("ia").innerHTML = nl; > > > > > } > > > > > function crf1() { > > > > > document.getElementById("ia").innerHTML = "<br><h3>Coreference > > > > > Chain</h3>study treatment of Thalomid 200mg<br>the treatment of > > > > > hepatocellular carcinoma"; > > > > > } > > > > > </script></body> > > > > > </html> > > > > > > > > > > > > > > > > > > > > ///////////////////////////////////////////////////// css, copy into > > > > > file named ctakes.pretty.css in same directory as html > > > > > ///////////////////////////////////////////////// > > > > > > > > > > > > > > > > > > > > .GNR_ { > > > > > position: relative; > > > > > display: inline-block gray; > > > > > border-bottom: 0.10em solid gray; > > > > > } > > > > > > > > > > .AFF_ { > > > > > position: relative; > > > > > display: inline-block green; > > > > > border-bottom: 0.15em solid green; > > > > > } > > > > > > > > > > .UNC_ { > > > > > position: relative; > > > > > display: inline-block gold; > > > > > border-bottom: 0.16em dotted gold; > > > > > } > > > > > > > > > > .NEG_ { > > > > > position: relative; > > > > > display: inline-block red; > > > > > border-bottom: 0.16em dashed red; > > > > > } > > > > > > > > > > .UNN_ { > > > > > position: relative; > > > > > display: inline-block orange; > > > > > border-bottom: 0.16em dashed orange; } > > > > > > > > > > .FND { > > > > > color: magenta; > > > > > } > > > > > > > > > > .DIS { > > > > > color: black; > > > > > } > > > > > > > > > > .DRG { > > > > > color: red; > > > > > } > > > > > > > > > > .PRC { > > > > > color: blue; > > > > > } > > > > > > > > > > .ANT { > > > > > color: gray; > > > > > } > > > > > > > > > > .UNK { > > > > > color: gray; > > > > > } > > > > > > > > > > [TIP] { > > > > > position: relative; > > > > > z-index: 2; > > > > > cursor: pointer; > > > > > } > > > > > [TIP]::before, > > > > > 
[TIP]::after { > > > > > visibility: hidden; > > > > > -ms-filter: "progid:DXImageTransform.Microsoft.Alpha(Opacity=0)"; > > > > > filter: progid: DXImageTransform.Microsoft.Alpha(Opacity=0); > > > > > opacity: 0; > > > > > pointer-events: none; > > > > > } > > > > > [TIP]::before { > > > > > position: absolute; > > > > > bottom: 0%; > > > > > left: 100%; > > > > > margin-bottom: 5px; > > > > > padding: 7px; > > > > > -webkit-border-radius: 3px; > > > > > -moz-border-radius: 3px; > > > > > border-radius: 3px; > > > > > background-color: #000; > > > > > background-color: hsla(0, 0%, 20%, 0.9); > > > > > color: #fff; > > > > > content: attr(TIP); > > > > > text-align: center; > > > > > font-size: 14px; > > > > > line-height: 1.2; > > > > > } > > > > > [TIP]:hover::before, > > > > > [TIP]:hover::after { > > > > > visibility: visible; > > > > > -ms-filter: "progid:DXImageTransform.Microsoft.Alpha(Opacity=100)"; > > > > > filter: progid: DXImageTransform.Microsoft.Alpha(Opacity=100); > > > > > opacity: 1; > > > > > } > > > > > > > > > > div#ia { > > > > > position: fixed; > > > > > top: 0; > > > > > right: 0; > > > > > width: 20%; > > > > > height: 100%; > > > > > padding: 10px; > > > > > overflow: auto; > > > > > background-color: lightgray; > > > > > } > > > > > > > > > > div#content { > > > > > width: 79%; > > > > > height: 100%; > > > > > padding: 10px; > > > > > overflow: auto; > > > > > } > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.com] > > > > > Sent: Wednesday, September 27, 2017 4:40 AM > > > > > To: dev@ctakes.apache.org > > > > > Subject: RE: Enabling drugner pipeline and identifying dates > > > > > [EXTERNAL] [SUSPICIOUS] > > > > > > > > > > Hi Sean, > > > > > > > > > > Sorry for the delayed response as I was out of office due to illness. 
> > > > > If I don't add BackwardsTimeAnnotator, I don't see any error related > > > > > to isTraining param. But still couldn't get the superscript co- > > > > > reference working. Please note that I am using the latest 4.0.1 jars. > > > > > The piper file and console log messages are as follows: > > > > > > > > > > PIPER FILE: > > > > > // Advanced Tokenization: Regex sectionization, BIO Sentence Detector > > > > > (lumper), Paragraphs,Lists load AdvancedTokenizerPipeline.piper add > > > > > ContextDependentTokenizerAnnotator > > > > > add POSTagger > > > > > // Chunkers > > > > > load ChunkerSubPipe.piper > > > > > // Default fast dictionary lookup > > > > > load DictionarySubPipe.piper > > > > > add org.apache.ctakes.drugner.ae.DrugMentionAnnotator > > > > > // Cleartk Entity Attributes > > > > > load AttributeCleartkSubPipe.piper > > > > > // Relations > > > > > load RelationSubPipe.piper > > > > > // Temporal > > > > > load TemporalSubPipe.piper > > > > > // Coreferences > > > > > load CorefSubPipe.piper > > > > > //add org.apache.ctakes.temporal.ae.BackwardsTimeAnnotator > > > > > // Html output > > > > > add pretty.html.HtmlTextWriter > > > > > // XMl writer > > > > > add FileTreeXmiWriter > > > > > > > > > > CONSOLE LOG: > > > > > > > > > > 22 Sep 2017 13:59:44 INFO ClearNLPSemanticRoleLabelerAE - Finished > > > > > initializing > > > > > 22 Sep 2017 13:59:44 INFO CleartkAnalysisEngine - Starting > > > > > initializing for Assigning Attributes > > > > > 22 Sep 2017 13:59:46 INFO CleartkAnalysisEngine - Finished > > > > > initializing > > > > > 22 Sep 2017 13:59:46 INFO ModifierExtractorAnnotator - Starting > > > > > initializing > > > > > 22 Sep 2017 13:59:46 INFO ModifierExtractorAnnotator - Finished > > > > > initializing > > > > > 22 Sep 2017 13:59:46 INFO DegreeOfRelationExtractorAnnotator - > > > > > Starting initializing > > > > > 22 Sep 2017 13:59:46 INFO DegreeOfRelationExtractorAnnotator - > > > > > Finished initializing > > > > > 22 Sep 2017 
13:59:46 INFO LocationOfRelationExtractorAnnotator - > > > > > Starting initializing > > > > > 22 Sep 2017 13:59:46 INFO LocationOfRelationExtractorAnnotator - > > > > > Finished initializing > > > > > 22 Sep 2017 13:59:46 INFO BackwardsTimeAnnotator - Starting > > > > > initializing > > > > > 22 Sep 2017 13:59:46 INFO BackwardsTimeAnnotator - Finished > > > > > initializing > > > > > 22 Sep 2017 13:59:46 INFO DocTimeRelAnnotator - Starting initializing > > > > > 22 Sep 2017 13:59:48 INFO DocTimeRelAnnotator - Finished initializing > > > > > 22 Sep 2017 13:59:48 INFO EventTimeRelationAnnotator - Starting > > > > > initializing > > > > > 22 Sep 2017 13:59:49 INFO EventTimeRelationAnnotator - Finished > > > > > initializing > > > > > 22 Sep 2017 13:59:49 INFO EventEventRelationAnnotator - Starting > > > > > initializing > > > > > 22 Sep 2017 13:59:51 INFO EventEventRelationAnnotator - Finished > > > > > initializing > > > > > 22 Sep 2017 13:59:51 INFO ConstituencyParser - Initializing parser... > > > > > 22 Sep 2017 13:59:54 INFO RegexSectionizer - Annotating Sections ... > > > > > 22 Sep 2017 13:59:55 INFO RegexSectionizer - Finished processing > > > > > 22 Sep 2017 13:59:55 INFO SentenceDetectorAnnotatorBIO - Starting > > > > > processing ... > > > > > 22 Sep 2017 13:59:55 INFO SentenceDetectorAnnotatorBIO - Finished > > > > > processing > > > > > 22 Sep 2017 13:59:55 INFO ParagraphAnnotator - Annotating Paragraphs > > > > > ... > > > > > 22 Sep 2017 13:59:55 INFO ParagraphAnnotator - Finished processing > > > > > 22 Sep 2017 13:59:55 INFO ParagraphSentenceFixer - Adjusting > > > > > Sentences overlapping Paragraphs ... > > > > > 22 Sep 2017 13:59:55 INFO ParagraphSentenceFixer - Finished > > > > > Processing > > > > > 22 Sep 2017 13:59:55 INFO ListAnnotator - Annotating Lists ... > > > > > 22 Sep 2017 13:59:55 INFO ListAnnotator - Finished processing > > > > > 22 Sep 2017 13:59:55 INFO ListSentenceFixer - Adjusting Sentences > > > > > overlapping Lists ... 
> > > > > 22 Sep 2017 13:59:55 INFO ListSentenceFixer - Finished Processing > > > > > 22 Sep 2017 13:59:55 INFO TokenizerAnnotatorPTB - process(JCas) in > > > > > org.apache.ctakes.core.ae.TokenizerAnnotatorPTB > > > > > 22 Sep 2017 13:59:55 INFO ContextDependentTokenizerAnnotator - > > > > > process(JCas) > > > > > 22 Sep 2017 13:59:55 INFO POSTagger - process(JCas) > > > > > 22 Sep 2017 13:59:55 INFO Chunker - process(JCas) > > > > > 22 Sep 2017 13:59:55 INFO ChunkAdjuster - process(JCas) > > > > > 22 Sep 2017 13:59:55 INFO ChunkAdjuster - process(JCas) > > > > > 22 Sep 2017 13:59:55 INFO AbstractJCasTermAnnotator - Finding Named > > > > > Entities ... > > > > > 22 Sep 2017 13:59:55 INFO AbstractJCasTermAnnotator - Finished > > > > > processing > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - process dev (JCas) > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:56 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:56 INFO 
DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:56 INFO DrugMentionAnnotator - -1 > > > > > 22 Sep 2017 13:59:56 INFO ClearNLPDependencyParserAE - Dependency > > > > > parser starting with thread:pool-2-thread-1 > > > > > 22 Sep 2017 13:59:56 INFO ClearNLPDependencyParserAE - Dependency > > > > > parser ending with thread:pool-2-thread-1 > > > > > 22 Sep 2017 13:59:56 INFO ClearNLPSemanticRoleLabelerAE - Starting > > > > > processing ... > > > > > 22 Sep 2017 13:59:56 INFO ClearNLPSemanticRoleLabelerAE - Finished > > > > > processing > > > > > 22 Sep 2017 13:59:56 INFO CleartkAnalysisEngine - Assigning > > > > > Attributes ... > > > > > 22 Sep 2017 13:59:56 INFO CleartkAnalysisEngine - Finished Assigning > > > > > Attributes > > > > > 22 Sep 2017 13:59:56 INFO ModifierExtractorAnnotator - Starting > > > > > processing ... > > > > > 22 Sep 2017 13:59:56 INFO ModifierExtractorAnnotator - Finished > > > > > processing > > > > > 22 Sep 2017 13:59:56 INFO DegreeOfRelationExtractorAnnotator - > > > > > Starting processing ... > > > > > 22 Sep 2017 13:59:56 INFO DegreeOfRelationExtractorAnnotator - > > > > > Finished processing > > > > > 22 Sep 2017 13:59:56 INFO LocationOfRelationExtractorAnnotator - > > > > > Starting processing ... > > > > > 22 Sep 2017 13:59:57 INFO LocationOfRelationExtractorAnnotator - > > > > > Finished processing > > > > > 22 Sep 2017 13:59:57 INFO BackwardsTimeAnnotator - Starting > > > > > processing ... > > > > > 22 Sep 2017 13:59:57 INFO BackwardsTimeAnnotator - Finished > > > > > processing > > > > > 22 Sep 2017 13:59:57 INFO DocTimeRelAnnotator - Starting processing > > > > > ... > > > > > 22 Sep 2017 13:59:58 INFO DocTimeRelAnnotator - Finished processing > > > > > 22 Sep 2017 13:59:58 INFO EventTimeRelationAnnotator - Starting > > > > > processing ... 
> > > > > 22 Sep 2017 13:59:59 INFO EventTimeRelationAnnotator - Finished processing
> > > > > 22 Sep 2017 13:59:59 INFO EventEventRelationAnnotator - Starting processing ...
> > > > > 22 Sep 2017 13:59:59 INFO EventEventRelationAnnotator - Finished processing
> > > > > 22 Sep 2017 13:59:59 INFO MaxentParserWrapper - Started processing: test
> > > > > 22 Sep 2017 14:00:02 INFO MaxentParserWrapper - Done parsing: test
> > > > > 22 Sep 2017 14:00:03 INFO MentionClusterCoreferenceAnnotator - Finding Coreferences ...
> > > > > 22 Sep 2017 14:00:03 INFO MentionClusterCoreferenceAnnotator - Finished.
> > > > > 22 Sep 2017 14:00:03 INFO HtmlTextWriter - Writing HTML to D:\Gandhi\ArisG\cTAKES\apache-ctakes-4.0.0\bin_old\test_output\test.txt.pretty.html ...
> > > > > 22 Sep 2017 14:00:03 INFO HtmlTextWriter - Finished Writing
> > > > > 22 Sep 2017 14:00:03 INFO FileTreeXmiWriter - Writing XMI to D:\Gandhi\ArisG\cTAKES\apache-ctakes-4.0.0\bin_old\test_output\test.txt.xmi ...
> > > > > Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport decreasingWithTrace(51)
> > > > > WARNING: Message count: 1; Feature org.apache.ctakes.typesystem.type.textsem.Predicate:relations is marked multipleReferencesAllowed=false, but it has multiple references. These will be serialized in duplicate. Message count indicates messages skipped to avoid potential flooding. Set FINE logging level for stacktrace.
> > > > > Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport decreasingWithTrace(51)
> > > > > WARNING: Message count: 2; Feature org.apache.ctakes.typesystem.type.textsem.Predicate:relations is marked multipleReferencesAllowed=false, but it has multiple references. These will be serialized in duplicate. Message count indicates messages skipped to avoid potential flooding. Set FINE logging level for stacktrace.
> > > > > Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport decreasingWithTrace(51)
> > > > > WARNING: Message count: 4; Feature org.apache.ctakes.typesystem.type.textsem.Predicate:relations is marked multipleReferencesAllowed=false, but it has multiple references. These will be serialized in duplicate. Message count indicates messages skipped to avoid potential flooding. Set FINE logging level for stacktrace.
> > > > > Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport decreasingWithTrace(51)
> > > > > WARNING: Message count: 8; Feature org.apache.ctakes.typesystem.type.textsem.Predicate:relations is marked multipleReferencesAllowed=false, but it has multiple references. These will be serialized in duplicate. Message count indicates messages skipped to avoid potential flooding. Set FINE logging level for stacktrace.
> > > > > Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport decreasingWithTrace(51)
> > > > > WARNING: Message count: 16; Feature org.apache.ctakes.typesystem.type.textsem.Predicate:relations is marked multipleReferencesAllowed=false, but it has multiple references. These will be serialized in duplicate. Message count indicates messages skipped to avoid potential flooding. Set FINE logging level for stacktrace.
> > > > > Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport decreasingWithTrace(51)
> > > > > WARNING: Message count: 32; Feature org.apache.ctakes.typesystem.type.textsem.Predicate:relations is marked multipleReferencesAllowed=false, but it has multiple references. These will be serialized in duplicate. Message count indicates messages skipped to avoid potential flooding. Set FINE logging level for stacktrace.
> > > > > 22 Sep 2017 14:00:03 INFO FileTreeXmiWriter - Finished Writing
> > > > >
> > > > > Sean, we have also made some code changes in MeasurementFSM.java to
> > > > > identify certain measurements like '20 mg/m2', which were not identified
> > > > > out of the box. Should we send the code changes to you so that you
> > > > > can consider including them in the product? Please advise.
> > > > >
> > > > > Regards,
> > > > > Gandhi
> > > > >
> > > > > -----Original Message-----
> > > > > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> > > > > Sent: Friday, September 22, 2017 6:54 PM
> > > > > To: dev@ctakes.apache.org
> > > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]
> > > > >
> > > > > Hi Gandhi,
> > > > >
> > > > > You don't need to add BackwardsTimeAnnotator to your piper. It is
> > > > > added by the TemporalSubPipe.piper. The error that you are seeing
> > > > > regarding training is very strange, but you can try adding this line
> > > > > to the top of the file:
> > > > > set isTraining=false
> > > > >
> > > > > Can you run a sample file with your piper and send me the log
> > > > > statements? It might help me figure out what is going on.
> > > > >
> > > > > > is there any doc or guide on how to start writing our own annotator.
> > > > > There are two example annotators in the ctakes-examples project under
> > > > > the ae/ directory.
> > > > > You can look at those, but I recommend that you
> > > > > look at some information on Uimafit, which can be used to create new
> > > > > annotators:
> > > > > https://uima.apache.org/d/uimafit-2.1.0/tools.uimafit.book.pdf
> > > > > An introduction to creating Analysis Engines (Annotators) is on page 5.
> > > > >
> > > > > Coding style is individualistic, but below is a rubberstamp that I use
> > > > > to get started:
> > > > >
> > > > > import org.apache.ctakes.core.pipeline.PipeBitInfo;
> > > > > import org.apache.log4j.Logger;
> > > > > import org.apache.uima.UimaContext;
> > > > > import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
> > > > > import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
> > > > > import org.apache.uima.jcas.JCas;
> > > > > import org.apache.uima.resource.ResourceInitializationException;
> > > > >
> > > > > /**
> > > > >  * @author SPF , chip-nlp
> > > > >  * @version %I%
> > > > >  * @since 9/22/2017
> > > > >  */
> > > > > @PipeBitInfo(
> > > > >       name = "Template",
> > > > >       description = "For Example.",
> > > > >       role = PipeBitInfo.Role.ANNOTATOR
> > > > > )
> > > > > final public class Template extends JCasAnnotator_ImplBase {
> > > > >
> > > > >    static private final Logger LOGGER = Logger.getLogger( "Template" );
> > > > >
> > > > >    /**
> > > > >     * {@inheritDoc}
> > > > >     */
> > > > >    @Override
> > > > >    public void initialize( final UimaContext context ) throws ResourceInitializationException {
> > > > >       // Always call the super first
> > > > >       super.initialize( context );
> > > > >       // place AE initialization code here
> > > > >    }
> > > > >
> > > > >    /**
> > > > >     * {@inheritDoc}
> > > > >     */
> > > > >    @Override
> > > > >    public void process( final JCas jCas ) throws AnalysisEngineProcessException {
> > > > >       LOGGER.info( "Processing ..." );
> > > > >       // Place AE processing code here
> > > > >       LOGGER.info( "Finished." );
> > > > >    }
> > > > > }
> > > > >
> > > > > If you use IntelliJ as your ide you can create a file template with
> > > > > these parameters:
> > > > >
> > > > > #if (${PACKAGE_NAME} && ${PACKAGE_NAME} != "")package ${PACKAGE_NAME};#end
> > > > >
> > > > > import org.apache.ctakes.core.pipeline.PipeBitInfo;
> > > > > import org.apache.log4j.Logger;
> > > > > import org.apache.uima.UimaContext;
> > > > > import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
> > > > > import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
> > > > > import org.apache.uima.jcas.JCas;
> > > > > import org.apache.uima.resource.ResourceInitializationException;
> > > > >
> > > > > #parse("File Header.java")
> > > > > @PipeBitInfo(
> > > > >       name = "${NAME}",
> > > > >       #if ( ${PROJECT_NAME} != "")description = "For ${PROJECT_NAME}.",#end
> > > > >       role = PipeBitInfo.Role.ANNOTATOR
> > > > > )
> > > > > final public class ${NAME} extends JCasAnnotator_ImplBase {
> > > > >
> > > > >    static private final Logger LOGGER = Logger.getLogger( "${NAME}" );
> > > > >
> > > > >    /**
> > > > >     * {@inheritDoc}
> > > > >     */
> > > > >    @Override
> > > > >    public void initialize( final UimaContext context ) throws ResourceInitializationException {
> > > > >       // Always call the super first
> > > > >       super.initialize( context );
> > > > >       // place AE initialization code here
> > > > >    }
> > > > >
> > > > >    /**
> > > > >     * {@inheritDoc}
> > > > >     */
> > > > >    @Override
> > > > >    public void process( final JCas jCas ) throws AnalysisEngineProcessException {
> > > > >       LOGGER.info( "Processing ..." );
> > > > >       // Place AE processing code here
> > > > >       LOGGER.info( "Finished." );
> > > > >    }
> > > > > }
> > > > >
> > > > > -----Original Message-----
> > > > > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.com]
> > > > > Sent: Friday, September 22, 2017 2:23 AM
> > > > > To: dev@ctakes.apache.org
> > > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]
> > > > >
> > > > > Hi Sean,
> > > > >
> > > > > Thanks again for the detailed response.
> > > > >
> > > > > I still couldn't manage to get superscript-1 co-reference in piper
> > > > > GUI. Also I'm not able to use "BackwardsTimeAnnotator" in piper GUI
> > > > > as it gives me the below error:
> > > > >
> > > > > org.apache.uima.resource.ResourceInitializationException: Initialization of annotator class "org.apache.ctakes.temporal.ae.BackwardsTimeAnnotator" failed. (Descriptor: <unknown>)
> > > > >    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:271)
> > > > >    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:170)
> > > > > Caused by: java.lang.IllegalArgumentException: Please specify PARAM_IS_TRAINING - unable to infer it from context
> > > > >    at org.cleartk.ml.CleartkAnnotator.initialize(CleartkAnnotator.java:109)
> > > > >
> > > > > Somewhere in old mails it's mentioned that it's because of missing
> > > > > dependencies so I tried adding ClearTkAnnotator with no luck yet.
> > > > > My piper file is as follows:
> > > > >
> > > > > load AdvancedTokenizerPipeline.piper
> > > > > add ContextDependentTokenizerAnnotator
> > > > > add POSTagger
> > > > > load ChunkerSubPipe.piper
> > > > > load DictionarySubPipe.piper
> > > > > add org.apache.ctakes.drugner.ae.DrugMentionAnnotator
> > > > > load AttributeCleartkSubPipe.piper
> > > > > load RelationSubPipe.piper
> > > > > load TemporalSubPipe.piper
> > > > > load CorefSubPipe.piper
> > > > > add org.apache.ctakes.temporal.ae.BackwardsTimeAnnotator
> > > > > add pretty.html.HtmlTextWriter
> > > > > add FileTreeXmiWriter
> > > > >
> > > > > Any suggestion on this? Also, I'm using all the latest 4.0.1 cTAKES
> > > > > jars. Regarding the identification of names, I will dig deeper into
> > > > > what you have mentioned.
> > > > >
> > > > > Sorry to ask this, as you already mentioned that there are no detailed
> > > > > docs for cTAKES. But is there any doc or guide on how to start writing
> > > > > our own annotator if required? If not, is there any simple annotator
> > > > > that you would suggest we look into to get a better understanding of
> > > > > annotators before we proceed further? Thanks in advance.
> > > > >
> > > > > Regards,
> > > > > Gandhi
> > > > >
> > > > > -----Original Message-----
> > > > > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> > > > > Sent: Thursday, September 21, 2017 7:59 AM
> > > > > To: dev@ctakes.apache.org
> > > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]
> > > > >
> > > > > Hi Gandhi,
> > > > >
> > > > > > We guess we are missing out on something as we could not find
> > > > > > co-references for "200mg". Should we add anymore piper for this?
> > > > > The piper commands that I sent have everything needed to obtain
> > > > > coreferences. I use them regularly - they are what I used on your
> > > > > example sentence to get the coreferences that I mentioned.
> > > > >
> > > > > > Also the change mentioned in the thread ...
> > > > > That is a very old thread and I don't think that it applies to what
> > > > > you are trying to do.
> > > > >
> > > > > > We also have a requirement to identify the patient names and sex
> > > > > As James said, ctakes isn't really meant to do this. Ctakes is
> > > > > geared toward extracting clinical data, and to this point names have
> > > > > not fallen into that category. It is more a task for general nlp.
> > > > > There is an opennlp model that can identify names and a few others (I
> > > > > used to see names using GATE). ctakes has wrapped opennlp for other
> > > > > tasks and you should be able to do the same to adapt an engine for
> > > > > names into ctakes.
> > > > >
> > > > > > cTAKES is unable to identify the dates like 20Aug02 or 20/Aug/02 or
> > > > > > 06 / 07 / 02 or 27Aug2002
> > > > > As Chen mentioned, the BackwardsTimeAnnotator module uses an ML model
> > > > > trained on gold data. It isn't perfect. You can add another time
> > > > > annotator on top of this to get some of the more simply formatted date
> > > > > mentions - there are a lot of them out there. Personally I have used
> > > > > jchronic as it can be easily tweaked to recognize medically-relevant
> > > > > temporal expressions relating to surgery, pharmacology, etc.
> > > > >
> > > > > Sean
> > > > >
> > > > > -----Original Message-----
> > > > > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> > > > > Sent: Wednesday, September 20, 2017 8:50 AM
> > > > > To: dev@ctakes.apache.org
> > > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]
> > > > >
> > > > > Hi Gandhi,
> > > > >
> > > > > I don't have time to go through all of this right now, but I will try
> > > > > to get to it soon.
> > > > >
> > > > > Make sure that you are running the latest version in trunk.
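
For the simply formatted dates raised in this thread (20Aug02, 20/Aug/02, 06 / 07 / 02, 27Aug2002), a rule-based pass could look roughly like the sketch below. This is plain java.util.regex, not jchronic and not any cTAKES API; the class name SimpleDateSpotter and the method findDates are hypothetical, and wiring the matches into an annotator that adds time mentions to the CAS is left out. It only reports matched strings and makes no attempt to decide whether 06 / 07 / 02 means June or July.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not part of cTAKES): finds simply formatted dates in raw text.
 */
final class SimpleDateSpotter {

   // Three-letter month abbreviations; matched case-insensitively below.
   private static final String MONTH = "(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)";
   // A four-digit year is tried before a two-digit year.
   private static final String YEAR = "(?:\\d{4}|\\d{2})";

   private static final Pattern DATE = Pattern.compile(
         "\\b(?:"
         // day + month name + year, separators optional: 20Aug02, 20/Aug/02, 27Aug2002
         + "\\d{1,2}\\s*[/-]?\\s*" + MONTH + "\\s*[/-]?\\s*" + YEAR
         + "|"
         // all-numeric form, slashes required so plain numbers are not matched: 06 / 07 / 02
         + "\\d{1,2}\\s*/\\s*\\d{1,2}\\s*/\\s*" + YEAR
         + ")\\b",
         Pattern.CASE_INSENSITIVE );

   private SimpleDateSpotter() {
   }

   /** Returns the matched date strings in order of appearance. */
   static List<String> findDates( final String text ) {
      final List<String> dates = new ArrayList<>();
      final Matcher matcher = DATE.matcher( text );
      while ( matcher.find() ) {
         dates.add( matcher.group() );
      }
      return dates;
   }
}
```

Calling SimpleDateSpotter.findDates on the example sentence from this thread should pick out "06 / 07 / 02" while leaving "20 mg / m2" and the day lists alone, because the numeric form requires two slashes between number groups. A second time annotator built this way would run after TemporalSubPipe.piper and would need to skip spans that the ML model has already marked.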
> > > > > > > > > > Sean > > > > > > > > > > -----Original Message----- > > > > > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.com] > > > > > Sent: Wednesday, September 20, 2017 7:03 AM > > > > > To: dev@ctakes.apache.org > > > > > Subject: RE: Enabling drugner pipeline and identifying dates > > > > > [EXTERNAL] > > > > > > > > > > Hi, Could someone help me out on the below queries please? > > > > > > > > > > Regards, > > > > > Gandhi > > > > > > > > > > -----Original Message----- > > > > > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.com] > > > > > Sent: Tuesday, September 19, 2017 8:51 PM > > > > > To: dev@ctakes.apache.org > > > > > Subject: RE: Enabling drugner pipeline and identifying dates > > > > > [EXTERNAL] > > > > > > > > > > Hi Sean, > > > > > > > > > > Thanks again for the detailed and prompt response. We were able to run > > > > > the piper GUI as per your advice. But in the output (The patient > > > > > started study treatment of Thalomid 200mg ( days 1 - 21 ) , and > > > > > Epirubicin ,20 mg / m2 ( days 1 , 8 , and 15 ) on 06 / 07 / 02 for the > > > > > treatment of hepatocellular carcinoma.), we were not able to find > > > > > superscript-1 as you mentioned earlier but could find superscript-2, > > > > > 3 etc. We guess we are missing out on something as we could not find > > > > > co-references for "200mg". Should we add anymore piper for this? > > > > > > > > > > Also the change mentioned in the thread - https://urldefense.proofpoint. 
> com/v2/url?u=https-3A__urldefense.proofpoi&d=DwIGaQ&c=qS4goWBT7poplM69zy_ > 3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m= > sGlpzaOnKKPgjhHkkpfELXpFFGvJtj1Ib-9t3JrGbpQ&s= > Z2KDsVD0pIlvt8WeDz3EYT5zPXaYFfRgH88z3hmXcSM&e= > > > > > nt.com/v2/url?u=http-3A__mail-2Darchives.apache.org_mod- > > > > > 5Fmbox_ctakes-2Duser_201403.mbox_-253CCAL6WimrJ-5Fmm1- > > > > > 2BXyggBZv62diYuWP0ScA9VEV8mNHGWe4hSNHQg-40mail.gmail.com- > > > > > 253E&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67Gvl > > > > > GZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=JoUDRZHu91gGMslwknPzTQC_UG2LEB > > > > > LyOfXR3ikwOL0&s=GzhvIkBu4cgyzYN9n6VLe2rz4sJhJzMxDcWyB0BkqAc&e= is > > > > > required for the drug-ner module to identify drug-ner annotations. > > > > > > > > > > 1) We also have a requirement to identify the patient names and sex > > > > > available in narrative texts. Please let us know how to achieve the > > > > > same as its not identifying the proper nouns and the relationship with > > > > > the patient? > > > > > Eg. "This male patient named Tom Hardy aged 35 years is participating > > > > > in a Non-IND study" > > > > > > > > > > 2) cTAKES is unable to identify the dates like 20Aug02 or 20/Aug/02 or > > > > > 06 / 07 / 02 or 27Aug2002 as in the below example. Please let us know > > > > > how to enhance the system to identify such date patterns. 
> > > > > E.g. "On 20Aug02, the investigator noted that this patient was
> > > > > suffering worsening fatigue and got tired getting out of his chair"
> > > > >
> > > > > Regards,
> > > > > Gandhi
> > > > >
> > > > > -----Original Message-----
> > > > > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> > > > > Sent: Monday, September 18, 2017 10:02 PM
> > > > > To: dev@ctakes.apache.org
> > > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL]
> > > > >
> > > > > Hi Gandhi,
> > > > >
> > > > > > So in this case will be able to see drug attributes in the output
> > > > > > XML?
> > > > > As long as you have the DrugMentionAnnotator in your pipeline you
> > > > > should be able to find drug attributes in the xml output file.
> > > > >
> > > > > > we also saw some code changes need to be done to use drug-ner
> > > > > > module. Is it still valid?
> > > > > As far as I know there aren't any necessary code changes to get drug
> > > > > ner running. However, I do not normally use drugner so I can't say
> > > > > for certain.
> > > > >
> > > > > > Also you mentioned that the drug-ner module is out of date
> > > > > It can still be used and will produce annotations. All that I meant
> > > > > was that there may not be many people out there using it. It is not
> > > > > part of the default pipeline.
> > > > >
> > > > > > You also mentioned that when you run the sentence, the date was
> > > > > > identified. Where and how exactly did you run it so that we can check
> > > > > > the same?
> > > > > I run the following in a piper file because I am interested in a lot
> > > > > of modules (I added drugner just for you):
> > > > >
> > > > > // Advanced Tokenization: Regex sectionization, BIO Sentence Detector (lumper), Paragraphs, Lists
> > > > > load AdvancedTokenizerPipeline.piper
> > > > > add ContextDependentTokenizerAnnotator
> > > > > add POSTagger
> > > > > // Chunkers
> > > > > load ChunkerSubPipe.piper
> > > > > // Default fast dictionary lookup
> > > > > load DictionarySubPipe.piper
> > > > > add org.apache.ctakes.drugner.ae.DrugMentionAnnotator
> > > > > // Cleartk Entity Attributes
> > > > > load AttributeCleartkSubPipe.piper
> > > > > // Relations
> > > > > load RelationSubPipe.piper
> > > > > // Temporal
> > > > > load TemporalSubPipe.piper
> > > > > // Coreferences
> > > > > load CorefSubPipe.piper
> > > > > // Html output
> > > > > add pretty.html.HtmlTextWriter
> > > > >
> > > > > For information on piper files, see
> > > > > https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
> > > > > I run it in my IDE with:
> > > > > org.apache.ctakes.core.pipeline.PiperFileRunner -Xmx3G -p <FileAsAbove>.piper -i org/apache/ctakes/examples/notes -o <OutputDir> --user <MyUmlsUser> --pass <MyUmlsPass>
> > > > > You can run it by command line by substituting
> > > > > "org.apache.ctakes.core.pipeline.PiperFileRunner -Xmx3G" with
> > > > > "bin/runPiperFile".
> > > > > You can also run it through a ctakes 4.0.1 (trunk) gui.
> > > > > See
> > > > > https://cwiki.apache.org/confluence/display/CTAKES/Piper+File+Submitter+GUI
> > > > >
> > > > > > I'm not able to see any clickable option in HTML output
> > > > > You must have the HtmlTextWriter at the end of your pipeline to
> > > > > produce html files. To keep the xml file output, place "add
> > > > > FileTreeXmiWriter" at the end of the piper.
> > > > >
> > > > > > Apologizes for too many
> > > > > No worries, we are happy to have your interest!
> > > > >
> > > > > Sean
> > > > >
> > > > > -----Original Message-----
> > > > > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.com]
> > > > > Sent: Saturday, September 16, 2017 7:01 AM
> > > > > To: dev@ctakes.apache.org
> > > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL]
> > > > >
> > > > > Hi Sean,
> > > > >
> > > > > Thanks again for the prompt response. Appreciate your input on adding
> > > > > DrugMentionAnnotator. Actually, we are relying on pretty printer
> > > > > output just to understand the analysis.
> > > > > Our logic to extract disorders
> > > > > and findings is based on the XML file generated by
> > > > > https://github.com/healthnlp/examples/blob/master/ctakes-temporal-demo/src/main/java/org/apache/ctakes/web/client/servlet/DemoServlet.java
> > > > > So in this case will we be able to see drug attributes in the output XML?
> > > > >
> > > > > In one of the old posts
> > > > > (http://mail-archives.apache.org/mod_mbox/ctakes-user/201403.mbox/%3CCAL6WimrJ_mm1+XyggBZv62diYuWP0ScA9VEV8mNHGWe4hSNHQg@mail.gmail.com%3E)
> > > > > we also saw some code changes need to be done to use drug-ner module.
> > > > > Is it still valid? Also you mentioned that the drug-ner module is out
> > > > > of date. Does that mean it cannot be used, or that it may not provide
> > > > > accurate analysis? Also, what changes need to be done to bring it up to
> > > > > date, so that we can try the same if you can assist?
> > > > >
> > > > > You also mentioned that when you run the sentence, the date was
> > > > > identified. Where and how exactly did you run it so that we can check
> > > > > the same?
> > > > > Also regarding your explanation on coreference, I'm not able
> > > > > to see any clickable option in HTML output. So wanted to understand
> > > > > how can we run and check that too.
> > > > >
> > > > > Apologies for too many questions as we are just a week old in NLP and
> > > > > cTAKES. Thanks in advance.
> > > > >
> > > > > Regards,
> > > > > Gandhi
> > > > >
> > > > > This email and any files transmitted with it are confidential and
> > > > > intended solely for the use of the individual or entity to whom they
> > > > > are addressed. If you are not the named addressee you should not
> > > > > disseminate, distribute or copy this e-mail. Please notify the sender
> > > > > or system manager by email immediately if you have received this
> > > > > e-mail by mistake and delete this e-mail from your system. If you are
> > > > > not the intended recipient you are notified that disclosing, copying,
> > > > > distributing or taking any action in reliance on the contents of this
> > > > > information is strictly prohibited and against the law.