Here's the most recent publication, which describes the system in ctakes 4.0 and later:
http://www.sciencedirect.com/science/article/pii/S1532046417300850

Tim
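As a rough illustration of the "best match" behaviour Sean describes in the quoted message below, here is a toy Python sketch. It is not the cTAKES coreference algorithm: the `best_antecedent` helper, the mention names A/B/C, and the scores are all invented for illustration. It only shows how adding candidates can replace a link that existed with less text.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical pairwise compatibility scores between mentions; numbers are invented.
Scores = Dict[Tuple[str, str], float]


def best_antecedent(mention: str, candidates: List[str], scores: Scores) -> Optional[str]:
    """Return the candidate with the highest positive score for `mention`, or None."""
    scored = [(scores.get((mention, c), 0.0), c) for c in candidates]
    scored = [(s, c) for s, c in scored if s > 0.0]
    if not scored:
        return None
    # max() on (score, name) tuples picks the highest score.
    return max(scored)[1]


scores: Scores = {("A", "B"): 0.6, ("A", "C"): 0.9}

# Single sentence: B is the only candidate, so A links to B.
short_doc = best_antecedent("A", ["B"], scores)

# Full note: C is now also a candidate and outranks B, so the A-B link is dropped.
full_doc = best_antecedent("A", ["B", "C"], scores)

print(short_doc, full_doc)
```

In this toy model the A-B link is never "wrong"; it simply stops being the best available match once more text (and therefore more candidate mentions) is added.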
On Tue, 2017-10-03 at 13:52 +0000, Finan, Sean wrote:
> > With the changes in Input, the co-reference between all the
> > entities should still be preserved right?
>
> No. One of the experts can better explain this, but the coreference
> module works with "best match" chains. With one sentence of text,
> term (Markable) A may have a best match with term B. As soon as you
> add more text, you introduce the possibility that term A will have a
> better best match with C and/or D, and the previous match to B will
> be deemed less accurate and dropped.
> In your case the coreference A - B seems to be lost in favor of one
> using internal term A', and that is a little strange. It could be
> that overlapping markables are being discarded? I will try to look
> into this really quickly.
>
> You can look at some publications on coref if you search the
> web. The one that probably best applies to the current coref module
> (Tim, Dima, is this true?) is
> https://www.aclweb.org/anthology/W12-2409
>
> Sean
>
> -----Original Message-----
> From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.com]
> Sent: Tuesday, October 03, 2017 4:18 AM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS]
>
> Hi Sean, I still have some doubts on this. If I run the piper file
> with the complete text I sent earlier, I could see only superscript - 4
> for Thalomid and the co-reference of this to "treatment of
> hepatocellular carcinoma" is still lost. Also I don’t see any
> superscript with number-1 too. With the changes in Input, the
> co-reference between all the entities should still be preserved right?
> Do we have any more info or doc on this co-reference module to
> understand its complexity better?
>
> Regards,
> Gandhi
>
>
> -----Original Message-----
> From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> Sent: Monday, October 02, 2017 8:36 PM
> To: dev@ctakes.apache.org
> Subject: RE: Enabling drugner pipeline and identifying dates
> [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS]
>
> Hi Tim,
>
> The coreference question (just a question) was for a different item
> altogether. Sorry for any confusion. The reason that I CC:d you ...
>
> From Gandhi:
> > Interestingly even I was able to generate [Sean's coref output]
> > using piper GUI by having only that single line - " The patient
> > started study treatment of Thalomid 200mg (days 1-21), and
> > Epirubicin, 20 mg/m2 (days 1, 8, and 15) on 06/07/02 for the
> > treatment of hepatocellular carcinoma. " in the input file.
> > But when I change the input file content with the following
> > lines: [Full paragraph (below), single-sentence in middle] The
> > co-reference superscript is lost by then.
>
> Sean's answer:
> > Ctakes is a system with many moving parts. Things that precede or
> > follow your original example sentence will change the evaluation of
> > that sentence.
> With the pipeline you are using and the full note, you should see a
> number (mine is 4) next to the first "thalomid" in the original
> example sentence. If you click that number you should see (to the
> right) 4 instances of "thalomid".
> > Tim can correct me here, but maybe the coreference module ranked
> > the links between "thalomid" as much higher than the rank between
> > "study treatment of thalomid 200mg" and "the treatment of
> > hepatocellular carcinoma" and discarded the encapsulating treatment
> > texts from markables? It is probably more complex than that.
> Sean > > "This patient is participating in a Non-IND study; Protocol CG- > 000424: "Phase I/II of Thalidomide and Epirubicin in Patients with > Unresectable or Metastatic Hepatocellular Carcinoma".Information has > been received from the investigator regarding an 82 year-old male > patient who had gastrointestinal bleeding while on Thalomid, > Epirubicin, and Coumadin. He had a past medical history of > diverticulosis in 03/02 and a right atrial clot from intraventricular > catheter (IVC) for which he was started on Coumadin. During the > hospitalization for a right atrial clot in 03/02 hepatocellular > carcinoma was first noted and he was referred to an oncologist. The > patient started study treatment of Thalomid 200mg (days 1-21), and > Epirubicin, 20 mg/m2 (days 1, 8, and 15) on 06/07/02 for the > treatment of hepatocellular carcinoma. He was concomitantly > receiving Cardura, Ambien (for insomnia), Megace, Coumadin, and > Oxycodone. This patient presented to the emergency room with the > chief complaint of hematochezia. He reported noticing bright red > blood and small clots mixed in with his stool. On 07/13/02, he was > admitted due to gastrointestinal bleed. The physician ordered 2 > large bore intravenous lines and planned to transfuse for hematocrit > less than 30%. Due to the INR (international normalized ratio) level > of 3.0, Coumadin was held. He was also noted to have bilateral lower > extremity edema with dyspnea on exertion. On 07/13/02, he had a > chest X-ray PA and lateral done that showed no evidence of acute > pneumonia or congestive heart failure. On 07/14/02, he underwent an > ultrasound which was negative for deep vein thrombosis. This patient > did not take Thalomid on the day of his admittance to the hospital, > but resumed treatment shortly after with no return of symptoms. On > 07/15/02, he was discharged in stable condition. There have been no > further reports of bleeding at this time. 
Thedoctor has assessed the > hematochezia as related to Coumadin treatment and previously > diagnosed diverticulosis, and not to protocol therapy with Thalomid > and Epirubicin.Additional information received from the investigator > on 27Aug02 reveals that this male patient began on 07Jun02 two cycles > of therapy with Thalidomide and Epirubicin. His post cycle two > computed tomography scans revealed increase in size of liver lesion > with development of multiple new satellite nodules. On 29Jul02, the > investigator removed this patient from protocol for progressive > disease and recommended hospice care. After seeking a second opinion > from two other institutions, this patient was admitted to hospice on > 05Aug02. On 20Aug02, the investigator noted that this patient was > suffering worsening fatigue and got tired getting out of his > chair. On 25Aug02, this patient died due to disease > progression. The investigator assessed the death as not related to > study treatment and expected" > > > > > -----Original Message----- > From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] > Sent: Monday, October 02, 2017 10:36 AM > To: dev@ctakes.apache.org > Subject: Re: Enabling drugner pipeline and identifying dates > [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS] > > My bad, I didn't read too closely and thought this was going to be a > > coreference patch. I don't know this FSM code that well, so I am not > an > > expert. My biggest concern at a glance is that these additions help > > find more true positives (as in your examples), can we verify that > they > > won't create false positives? > > Tim > > > > > > On Fri, 2017-09-29 at 06:25 +0000, Gandhi Rajan Natarajan wrote: > > > > > Hi Sean, > > > > > > > > Thanks again for the response. I guess its mistake from my side > > that > > > > I dint send the complete text. Did you mean that with the text I > > > > sent, the co-reference superscript-1 will be lost? 
> > > >
> > > > Also as per your advice, We have created an issue -
> > > > https://issues.apache.org/jira/browse/CTAKES-459 for
> > > > measurement FSM changes and attached the modified file changes. Could
> > > > someone have a look and know your thoughts please?
> > > >
> > > > Regards,
> > > > Gandhi
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> > > > Sent: Thursday, September 28, 2017 8:21 PM
> > > > To: dev@ctakes.apache.org
> > > > Cc: Miller, Timothy <timothy.mil...@childrens.harvard.edu>
> > > > Subject: RE: Enabling drugner pipeline and identifying dates
> > > > [EXTERNAL] [SUSPICIOUS]
> > > >
> > > > Hi Gandhi,
> > > >
> > > > I don't recall you sending me that entire snippet of text. I think
> > > > that I only had your single example sentence.
> > > > You have discovered one of the quirks of software: "change the data,
> > > > change the result."
> > > > Ctakes is a system with many moving parts. Things that precede or
> > > > follow your original example sentence will change the evaluation of
> > > > that sentence.
> > > > With the pipeline you are using and the full note, you should see a
> > > > number (mine is 4) next to the first "thalomid" in the original
> > > > example sentence. If you click that number you should see (to the
> > > > right) 4 instances of "thalomid".
> > > > Tim can correct me here, but maybe the coreference module ranked the
> > > > links between "thalomid" as much higher than the rank between "study
> > > > treatment of thalomid 200mg" and "the treatment of hepatocellular
> > > > carcinoma" and discarded the encapsulating treatment texts from
> > > > markables? It is probably more complex than that.
> > > >
> > > > > > > > we have also made some code changes in MeasurementFSM.java to
> > > > > > > > identify certain measurements like '20 mg/m2' which was not
> > > > > > > > identified out of the box. Should we send the code changes to you
> > > > > > > > so that you can consider the same to be productized ? Please
> > > > > > > > advise."
> > > >
> > > > I don't know if you've noticed the recent emails on the dev list
> > > > involving Alexandru Zbarcea. Alex has been creating or commenting on
> > > > Jira items and attaching code for fixes and enhancements. This is a
> > > > widely used process and is fairly easy to follow.
> > > > I think that the following links are relevant:
> > > > Working with issues: https://confluence.atlassian.com/jiracoreserver073/working-with-issues-861257307.html
> > > > Creating patches: https://confluence.atlassian.com/crucible/creating-patch-files-for-pre-commit-reviews-298977458.html
> > > > Attaching files: https://confluence.atlassian.com/jiracorecloud/attaching-files-and-screenshots-to-issues-765593805.html
> > > >
> > > > I don't know if you have a jira account and permissions for the
> > > > ctakes project. An administrator may need to set that up for you.
> > > > > > > > Thanks, > > > > Sean > > > > > > > > -----Original Message----- > > > > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.co > > m] > > > > Sent: Thursday, September 28, 2017 4:09 AM > > > > To: dev@ctakes.apache.org > > > > Subject: RE: Enabling drugner pipeline and identifying dates > > > > [EXTERNAL] [SUSPICIOUS] > > > > > > > > Hi Sean, > > > > > > > > Thanks for the response. I was able to see the co-reference > > > > superscript using the html file that you sent. Interestingly even I > > > > was able to generate the sample HTML using piper GUI by having > > only > > > > that single line - " The patient started study treatment of > > Thalomid > > > > 200mg (days 1-21), and Epirubicin, 20 mg/m2 (days 1, 8, and 15) on > > > > 06/07/02 for the treatment of hepatocellular carcinoma. " in the > > > > input file. > > > > > > > > But when I change the input file content with the following lines: > > > > > > > > "This patient is participating in a Non-IND study; Protocol CG- > > > > 000424: "Phase I/II of Thalidomide and Epirubicin in Patients with > > > > Unresectable or Metastatic Hepatocellular Carcinoma".Information > > has > > > > been received from the investigator regarding an 82 year-old male > > > > patient who had gastrointestinal bleeding while on Thalomid, > > > > Epirubicin, and Coumadin. He had a past medical history of > > > > diverticulosis in 03/02 and a right atrial clot from > > intraventricular > > > > catheter (IVC) for which he was started on Coumadin. During the > > > > hospitalization for a right atrial clot in 03/02 hepatocellular > > > > carcinoma was first noted and he was referred to an > > oncologist. The > > > > patient started study treatment of Thalomid 200mg (days 1-21), and > > > > Epirubicin, 20 mg/m2 (days 1, 8, and 15) on 06/07/02 for the > > > > treatment of hepatocellular carcinoma. He was concomitantly > > > > receiving Cardura, Ambien (for insomnia), Megace, Coumadin, and > > > > Oxycodone. 
This patient presented to the emergency room with the > > > > chief complaint of hematochezia. He reported noticing bright red > > > > blood and small clots mixed in with his stool. On 07/13/02, he was > > > > admitted due to gastrointestinal bleed. The physician ordered 2 > > > > large bore intravenous lines and planned to transfuse for > > hematocrit > > > > less than 30%. Due to the INR (international normalized ratio) > > level > > > > of 3.0, Coumadin was held. He was also noted to have bilateral > > lower > > > > extremity edema with dyspnea on exertion. On 07/13/02, he had a > > > > chest X-ray PA and lateral done that showed no evidence of acute > > > > pneumonia or congestive heart failure. On 07/14/02, he > > underwent an > > > > ultrasound which was negative for deep vein thrombosis. This > > patient > > > > did not take Thalomid on the day of his admittance to the hospital, > > > > but resumed treatment shortly after with no return of symptoms. On > > > > 07/15/02, he was discharged in stable condition. There have been no > > > > further reports of bleeding at this time. Thedoctor has assessed > > the > > > > hematochezia as related to Coumadin treatment and previously > > > > diagnosed diverticulosis, and not to protocol therapy with Thalomid > > > > and Epirubicin.Additional information received from the > > investigator > > > > on 27Aug02 reveals that this male patient began on 07Jun02 two > > cycles > > > > of therapy with Thalidomide and Epirubicin. His post cycle two > > > > computed tomography scans revealed increase in size of liver lesion > > > > with development of multiple new satellite nodules. On 29Jul02, > > the > > > > investigator removed this patient from protocol for progressive > > > > disease and recommended hospice care. After seeking a second > > opinion > > > > from two other institutions, this patient was admitted to hospice > > on > > > > 05Aug02. 
On 20Aug02, the investigator noted that this patient was > > > > suffering worsening fatigue and got tired getting out of his > > > > chair. On 25Aug02, this patient died due to disease > > > > progression. The investigator assessed the death as not related to > > > > study treatment and expected" > > > > > > > > The co-reference superscript is lost by then. Did you tried with > > the > > > > complete text above by any chance in your piper GUI? Also I guess > > you > > > > did not notice the question on my last post - " Sean, we have also > > > > made some code changes in MeasurementFSM.java to identify certain > > > > measurements like '20 mg/m2' which was not identified out of the > > > > box. Should we send the code changes to you so that you can > > consider > > > > the same to be productized ? Please advise." > > > > > > > > > > > > Regards, > > > > Gandhi > > > > > > > > > > > > -----Original Message----- > > > > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] > > > > Sent: Wednesday, September 27, 2017 5:53 PM > > > > To: dev@ctakes.apache.org > > > > Subject: RE: Enabling drugner pipeline and identifying dates > > > > [EXTERNAL] [SUSPICIOUS] > > > > > > > > Hi Gandhi, > > > > > > > > I am glad that you are feeling better. > > > > I don't understand why you aren't getting the same output as me. I > > > > just ran your example sentence with your piper with a fresh > > checkout > > > > and get the html below. The css follows. Copy and paste into a > > file > > > > and see if you see the corefs. 
> > > > > > > > ///////////////////////////////////////////////////// html, copy > > > > into file ///////////////////////////////////////////////// > > > > > > > > <!DOCTYPE html> > > > > <html> > > > > <head> > > > > <title>OneLiner Output</title> > > > > </head> > > > > <body> > > > > <link rel="stylesheet" href="ctakes.pretty.css" type="text/css" > > > > media="screen"> <h2>OneLiner</h2> <i>Text processing finished on: > > 9 > > > > 27 2017, 08:15:31</i> <hr> > > > > > > > > <div id="content"> > > > > > > > > <p> > > > > The patient <span class="AFF_" > > > > onClick="iaf('AFF_NL_EVTNL_startedNL_SPC_[before] doc timeNL_NL_')" > > > > TIP="Event ">started</span> study <span class="AFF_" > > > > onClick="iaf('AFF_NL_EVTNL_treatmentNL_SPC_[before] doc > > > > timeNL_NL_PRCNL_treatmentNL_SPC_C0087111NL_SPC_[Therapeutic > > > > procedure]NL_SPC_[before] doc timeNL_NL_')" TIP="Event Procedure > > > > ">treatment</span><span class="PRC"><sup>•</sup></span> of > > <span > > > > class="AFF_" > > > > onClick="iaf('AFF_NL_DRGNL_ThalomidNL_SPC_C0723668NL_SPC_[before] > > doc > > > > timeNL_NL_')" TIP="Drug ">Thalomid</span><span > > > > class="DRG"><sup>•</sup></span> <span class="AFF_" > > > > onClick="iaf('AFF_NL_EVTNL_200mgNL_SPC_[before] doc timeNL_NL_')" > > > > TIP="Event ">200mg</span><span class="UNK" > > > > onClick="crf1()"><sup>1</sup></span> ( <span class="GNR_" > > > > onClick="iaf('GNR_NL_TMXNL_daysNL_NL_')" TIP="Time ">days</span> 1 > > - > > > > 21 ) , and <span class="AFF_" > > > > onClick="iaf('AFF_NL_DRGNL_EpirubicinNL_SPC_C0014582NL_SPC_[before] > > > > doc timeNL_NL_')" TIP="Drug ">Epirubicin</span><span > > > > class="DRG"><sup>•</sup></span> , 20 mg / m2 ( <span > > > > class="GNR_" onClick="iaf('GNR_NL_TMXNL_days 1 , 8NL_NL_')" > > TIP="Time > > > > ">days 1 , 8</span> , and 15 ) on <span class="GNR_" > > > > onClick="iaf('GNR_NL_TMXNL_06 / 07 / 02NL_SPC_[CONTAINS] > > > > treatmentNL_NL_')" TIP="Time ">06 / 07 / 02</span> for the <span > > > > 
class="AFF_" onClick="iaf('AFF_NL_EVTNL_treatmentNL_SPC_[before] > > doc > > > > timeNL_SPC_06 / 07 / 02 > > > > [CONTAINS]NL_NL_PRCNL_treatmentNL_SPC_C0087111NL_SPC_[Therapeutic > > > > procedure]NL_SPC_[before] doc timeNL_NL_')" TIP="Event Procedure > > > > ">treatment</span><span class="PRC"><sup>•</sup></span> of > > <span > > > > class="AFF_" onClick="iaf('AFF_NL_DISNL_hepatocellular > > > > carcinomaNL_SPC_C2239176NL_SPC_[Liver carcinoma]NL_SPC_[before] doc > > > > timeNL_NL_')" TIP="Disorder ">hepatocellular </span><span > > > > class="AFF_" onClick="iaf('AFF_NL_DISNL_hepatocellular > > > > carcinomaNL_SPC_C2239176NL_SPC_[Liver carcinoma]NL_SPC_[before] doc > > > > timeNL_NL_EVTNL_carcinomaNL_SPC_[before] doc timeNL_NL_')" > > > > TIP="Disorder Event ">carcinoma</span><span class="DIS" > > > > onClick="crf1()"><sup>1</sup></span> . > > > > <br> > > > > > > > > </p> > > > > > > > > </div> > > > > > > > > <div id="ia"> Annotation Information </div> <script > > > > type="text/javascript"> > > > > function iaf(txt) { > > > > var aff=txt.replace( /AFF_/g,"<br><h3>Affirmed</h3>" ); > > > > var neg=aff.replace( /NEG_/g,"<br><h3>Negated</h3>" ); > > > > var unc=neg.replace( /UNC_/g,"<br><h3>Uncertain</h3>" ); > > > > var unn=unc.replace( /UNN_/g,"<br><h3>Uncertain, Negated</h3>" > > ); > > > > var ant=unn.replace( /ANT/g,"<b>Anatomical Site</b>" ); > > > > var dis=ant.replace( /DIS/g,"<b>Disease/ Disorder</b>" ); > > > > var fnd=dis.replace( /FND/g,"<b>Sign/ Symptom</b>" ); > > > > var prc=fnd.replace( /PRC/g,"<b>Procedure</b>" ); > > > > var drg=prc.replace( /DRG/g,"<b>Medication</b>" ); > > > > var evt=drg.replace( /EVT/g,"<b>Event</b>" ); > > > > var tmx=evt.replace( /TMX/g,"<b>Time</b>" ); > > > > var unk=tmx.replace( /UNK/g,"<b>Unknown</b>" ); > > > > var spc=unk.replace( > > > > /SPC_/g," " ); > > > > var prf1=spc.replace( /\[/g,"<i>" ); > > > > var prf2=prf1.replace( /\]/g,"</i>" ); > > > > var nl=prf2.replace( /NL_/g,"<br>" ); > > > > 
document.getElementById("ia").innerHTML = nl; > > > > } > > > > function crf1() { > > > > document.getElementById("ia").innerHTML = "<br><h3>Coreference > > > > Chain</h3>study treatment of Thalomid 200mg<br>the treatment of > > > > hepatocellular carcinoma"; > > > > } > > > > </script></body> > > > > </html> > > > > > > > > > > > > > > > > ///////////////////////////////////////////////////// css, copy > > into > > > > file named ctakes.pretty.css in same directory as > > > > html ///////////////////////////////////////////////// > > > > > > > > > > > > > > > > .GNR_ { > > > > position: relative; > > > > display: inline-block gray; > > > > border-bottom: 0.10em solid gray; > > > > } > > > > > > > > .AFF_ { > > > > position: relative; > > > > display: inline-block green; > > > > border-bottom: 0.15em solid green; > > > > } > > > > > > > > .UNC_ { > > > > position: relative; > > > > display: inline-block gold; > > > > border-bottom: 0.16em dotted gold; > > > > } > > > > > > > > .NEG_ { > > > > position: relative; > > > > display: inline-block red; > > > > border-bottom: 0.16em dashed red; > > > > } > > > > > > > > .UNN_ { > > > > position: relative; > > > > display: inline-block orange; > > > > border-bottom: 0.16em dashed orange; > > > > } > > > > > > > > .FND { > > > > color: magenta; > > > > } > > > > > > > > .DIS { > > > > color: black; > > > > } > > > > > > > > .DRG { > > > > color: red; > > > > } > > > > > > > > .PRC { > > > > color: blue; > > > > } > > > > > > > > .ANT { > > > > color: gray; > > > > } > > > > > > > > .UNK { > > > > color: gray; > > > > } > > > > > > > > [TIP] { > > > > position: relative; > > > > z-index: 2; > > > > cursor: pointer; > > > > } > > > > [TIP]::before, > > > > [TIP]::after { > > > > visibility: hidden; > > > > -ms-filter: "progid:DXImageTransform.Microsoft.Alpha(Opacity=0)"; > > > > filter: progid: DXImageTransform.Microsoft.Alpha(Opacity=0); > > > > opacity: 0; > > > > pointer-events: none; > > > > } > > > > [TIP]::before { > > 
> > position: absolute; > > > > bottom: 0%; > > > > left: 100%; > > > > margin-bottom: 5px; > > > > padding: 7px; > > > > -webkit-border-radius: 3px; > > > > -moz-border-radius: 3px; > > > > border-radius: 3px; > > > > background-color: #000; > > > > background-color: hsla(0, 0%, 20%, 0.9); > > > > color: #fff; > > > > content: attr(TIP); > > > > text-align: center; > > > > font-size: 14px; > > > > line-height: 1.2; > > > > } > > > > [TIP]:hover::before, > > > > [TIP]:hover::after { > > > > visibility: visible; > > > > -ms-filter: > > "progid:DXImageTransform.Microsoft.Alpha(Opacity=100)"; > > > > filter: progid: DXImageTransform.Microsoft.Alpha(Opacity=100); > > > > opacity: 1; > > > > } > > > > > > > > div#ia { > > > > position: fixed; > > > > top: 0; > > > > right: 0; > > > > width: 20%; > > > > height: 100%; > > > > padding: 10px; > > > > overflow: auto; > > > > background-color: lightgray; > > > > } > > > > > > > > div#content { > > > > width: 79%; > > > > height: 100%; > > > > padding: 10px; > > > > overflow: auto; > > > > } > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.co > > m] > > > > Sent: Wednesday, September 27, 2017 4:40 AM > > > > To: dev@ctakes.apache.org > > > > Subject: RE: Enabling drugner pipeline and identifying dates > > > > [EXTERNAL] [SUSPICIOUS] > > > > > > > > Hi Sean, > > > > > > > > Sorry for the delayed response as I was out of office due to > > illness. > > > > If I don't add BackwardsTimeAnnotator, I don't see any error > > related > > > > to isTraining param. But still couldn't get the superscript co- > > > > reference working. Please note that I am using the latest 4.0.1 > > jars. 
> > > > The piper file and console log messages are as follows: > > > > > > > > PIPER FILE: > > > > // Advanced Tokenization: Regex sectionization, BIO Sentence > > Detector > > > > (lumper), Paragraphs,Lists load AdvancedTokenizerPipeline.piper add > > > > ContextDependentTokenizerAnnotator > > > > add POSTagger > > > > // Chunkers > > > > load ChunkerSubPipe.piper > > > > // Default fast dictionary lookup > > > > load DictionarySubPipe.piper > > > > add org.apache.ctakes.drugner.ae.DrugMentionAnnotator > > > > // Cleartk Entity Attributes > > > > load AttributeCleartkSubPipe.piper > > > > // Relations > > > > load RelationSubPipe.piper > > > > // Temporal > > > > load TemporalSubPipe.piper > > > > // Coreferences > > > > load CorefSubPipe.piper > > > > //add org.apache.ctakes.temporal.ae.BackwardsTimeAnnotator > > > > // Html output > > > > add pretty.html.HtmlTextWriter > > > > // XMl writer > > > > add FileTreeXmiWriter > > > > > > > > CONSOLE LOG: > > > > > > > > 22 Sep 2017 13:59:44 INFO ClearNLPSemanticRoleLabelerAE - Finished > > > > initializing > > > > 22 Sep 2017 13:59:44 INFO CleartkAnalysisEngine - Starting > > > > initializing for Assigning Attributes > > > > 22 Sep 2017 13:59:46 INFO CleartkAnalysisEngine - Finished > > > > initializing > > > > 22 Sep 2017 13:59:46 INFO ModifierExtractorAnnotator - Starting > > > > initializing > > > > 22 Sep 2017 13:59:46 INFO ModifierExtractorAnnotator - Finished > > > > initializing > > > > 22 Sep 2017 13:59:46 INFO DegreeOfRelationExtractorAnnotator - > > > > Starting initializing > > > > 22 Sep 2017 13:59:46 INFO DegreeOfRelationExtractorAnnotator - > > > > Finished initializing > > > > 22 Sep 2017 13:59:46 INFO LocationOfRelationExtractorAnnotator - > > > > Starting initializing > > > > 22 Sep 2017 13:59:46 INFO LocationOfRelationExtractorAnnotator - > > > > Finished initializing > > > > 22 Sep 2017 13:59:46 INFO BackwardsTimeAnnotator - Starting > > > > initializing > > > > 22 Sep 2017 13:59:46 INFO 
BackwardsTimeAnnotator - Finished > > > > initializing > > > > 22 Sep 2017 13:59:46 INFO DocTimeRelAnnotator - Starting > > > > initializing > > > > 22 Sep 2017 13:59:48 INFO DocTimeRelAnnotator - Finished > > > > initializing > > > > 22 Sep 2017 13:59:48 INFO EventTimeRelationAnnotator - Starting > > > > initializing > > > > 22 Sep 2017 13:59:49 INFO EventTimeRelationAnnotator - Finished > > > > initializing > > > > 22 Sep 2017 13:59:49 INFO EventEventRelationAnnotator - Starting > > > > initializing > > > > 22 Sep 2017 13:59:51 INFO EventEventRelationAnnotator - Finished > > > > initializing > > > > 22 Sep 2017 13:59:51 INFO ConstituencyParser - Initializing > > > > parser... > > > > 22 Sep 2017 13:59:54 INFO RegexSectionizer - Annotating Sections > > ... > > > > 22 Sep 2017 13:59:55 INFO RegexSectionizer - Finished processing > > > > 22 Sep 2017 13:59:55 INFO SentenceDetectorAnnotatorBIO - Starting > > > > processing ... > > > > 22 Sep 2017 13:59:55 INFO SentenceDetectorAnnotatorBIO - Finished > > > > processing > > > > 22 Sep 2017 13:59:55 INFO ParagraphAnnotator - Annotating > > Paragraphs > > > > ... > > > > 22 Sep 2017 13:59:55 INFO ParagraphAnnotator - Finished processing > > > > 22 Sep 2017 13:59:55 INFO ParagraphSentenceFixer - Adjusting > > > > Sentences overlapping Paragraphs ... > > > > 22 Sep 2017 13:59:55 INFO ParagraphSentenceFixer - Finished > > > > Processing > > > > 22 Sep 2017 13:59:55 INFO ListAnnotator - Annotating Lists ... > > > > 22 Sep 2017 13:59:55 INFO ListAnnotator - Finished processing > > > > 22 Sep 2017 13:59:55 INFO ListSentenceFixer - Adjusting Sentences > > > > overlapping Lists ... 
> > > > 22 Sep 2017 13:59:55 INFO ListSentenceFixer - Finished Processing > > > > 22 Sep 2017 13:59:55 INFO TokenizerAnnotatorPTB - process(JCas) in > > > > org.apache.ctakes.core.ae.TokenizerAnnotatorPTB > > > > 22 Sep 2017 13:59:55 INFO ContextDependentTokenizerAnnotator - > > > > process(JCas) > > > > 22 Sep 2017 13:59:55 INFO POSTagger - process(JCas) > > > > 22 Sep 2017 13:59:55 INFO Chunker - process(JCas) > > > > 22 Sep 2017 13:59:55 INFO ChunkAdjuster - process(JCas) > > > > 22 Sep 2017 13:59:55 INFO ChunkAdjuster - process(JCas) > > > > 22 Sep 2017 13:59:55 INFO AbstractJCasTermAnnotator - Finding > > Named > > > > Entities ... > > > > 22 Sep 2017 13:59:55 INFO AbstractJCasTermAnnotator - Finished > > > > processing > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - process dev > > (JCas) > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:55 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:56 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:56 INFO DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:56 INFO 
DrugMentionAnnotator - -1 > > > > 22 Sep 2017 13:59:56 INFO ClearNLPDependencyParserAE - Dependency > > > > parser starting with thread:pool-2-thread-1 > > > > 22 Sep 2017 13:59:56 INFO ClearNLPDependencyParserAE - Dependency > > > > parser ending with thread:pool-2-thread-1 > > > > 22 Sep 2017 13:59:56 INFO ClearNLPSemanticRoleLabelerAE - Starting > > > > processing ... > > > > 22 Sep 2017 13:59:56 INFO ClearNLPSemanticRoleLabelerAE - Finished > > > > processing > > > > 22 Sep 2017 13:59:56 INFO CleartkAnalysisEngine - Assigning > > > > Attributes ... > > > > 22 Sep 2017 13:59:56 INFO CleartkAnalysisEngine - Finished > > Assigning > > > > Attributes > > > > 22 Sep 2017 13:59:56 INFO ModifierExtractorAnnotator - Starting > > > > processing ... > > > > 22 Sep 2017 13:59:56 INFO ModifierExtractorAnnotator - Finished > > > > processing > > > > 22 Sep 2017 13:59:56 INFO DegreeOfRelationExtractorAnnotator - > > > > Starting processing ... > > > > 22 Sep 2017 13:59:56 INFO DegreeOfRelationExtractorAnnotator - > > > > Finished processing > > > > 22 Sep 2017 13:59:56 INFO LocationOfRelationExtractorAnnotator - > > > > Starting processing ... > > > > 22 Sep 2017 13:59:57 INFO LocationOfRelationExtractorAnnotator - > > > > Finished processing > > > > 22 Sep 2017 13:59:57 INFO BackwardsTimeAnnotator - Starting > > > > processing ... > > > > 22 Sep 2017 13:59:57 INFO BackwardsTimeAnnotator - Finished > > > > processing > > > > 22 Sep 2017 13:59:57 INFO DocTimeRelAnnotator - Starting > > processing > > > > ... > > > > 22 Sep 2017 13:59:58 INFO DocTimeRelAnnotator - Finished > > processing > > > > 22 Sep 2017 13:59:58 INFO EventTimeRelationAnnotator - Starting > > > > processing ... > > > > 22 Sep 2017 13:59:59 INFO EventTimeRelationAnnotator - Finished > > > > processing > > > > 22 Sep 2017 13:59:59 INFO EventEventRelationAnnotator - Starting > > > > processing ... 
> > > > 22 Sep 2017 13:59:59 INFO EventEventRelationAnnotator - Finished > > > > processing > > > > 22 Sep 2017 13:59:59 INFO MaxentParserWrapper - Started > > processing: > > > > test > > > > 22 Sep 2017 14:00:02 INFO MaxentParserWrapper - Done parsing: test > > > > 22 Sep 2017 14:00:03 INFO MentionClusterCoreferenceAnnotator - > > > > Finding Coreferences ... > > > > 22 Sep 2017 14:00:03 INFO MentionClusterCoreferenceAnnotator - > > > > Finished. > > > > 22 Sep 2017 14:00:03 INFO HtmlTextWriter - Writing HTML to > > > > D:\Gandhi\ArisG\cTAKES\apache-ctakes- > > > > 4.0.0\bin_old\test_output\test.txt.pretty.html ... > > > > 22 Sep 2017 14:00:03 INFO HtmlTextWriter - Finished Writing > > > > 22 Sep 2017 14:00:03 INFO FileTreeXmiWriter - Writing XMI to > > > > D:\Gandhi\ArisG\cTAKES\apache-ctakes- > > > > 4.0.0\bin_old\test_output\test.txt.xmi ... > > > > Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport > > > > decreasingWithTrace(51) > > > > WARNING: Message count: 1; Feature > > > > org.apache.ctakes.typesystem.type.textsem.Predicate:relations is > > > > marked multipleReferencesAllowed=false, but it has multiple > > > > references. These will be serialized in duplicate. Message count > > > > indicates messages skipped to avoid potential flooding. Set FINE > > > > logging level for stacktrace. > > > > Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport > > > > decreasingWithTrace(51) > > > > WARNING: Message count: 2; Feature > > > > org.apache.ctakes.typesystem.type.textsem.Predicate:relations is > > > > marked multipleReferencesAllowed=false, but it has multiple > > > > references. These will be serialized in duplicate. Message count > > > > indicates messages skipped to avoid potential flooding. Set FINE > > > > logging level for stacktrace. 
> > > > Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport > > > > decreasingWithTrace(51) > > > > WARNING: Message count: 4; Feature > > > > org.apache.ctakes.typesystem.type.textsem.Predicate:relations is > > > > marked multipleReferencesAllowed=false, but it has multiple > > > > references. These will be serialized in duplicate. Message count > > > > indicates messages skipped to avoid potential flooding. Set FINE > > > > logging level for stacktrace. > > > > Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport > > > > decreasingWithTrace(51) > > > > WARNING: Message count: 8; Feature > > > > org.apache.ctakes.typesystem.type.textsem.Predicate:relations is > > > > marked multipleReferencesAllowed=false, but it has multiple > > > > references. These will be serialized in duplicate. Message count > > > > indicates messages skipped to avoid potential flooding. Set FINE > > > > logging level for stacktrace. > > > > Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport > > > > decreasingWithTrace(51) > > > > WARNING: Message count: 16; Feature > > > > org.apache.ctakes.typesystem.type.textsem.Predicate:relations is > > > > marked multipleReferencesAllowed=false, but it has multiple > > > > references. These will be serialized in duplicate. Message count > > > > indicates messages skipped to avoid potential flooding. Set FINE > > > > logging level for stacktrace. > > > > Sep 22, 2017 2:00:03 PM org.apache.uima.util.MessageReport > > > > decreasingWithTrace(51) > > > > WARNING: Message count: 32; Feature > > > > org.apache.ctakes.typesystem.type.textsem.Predicate:relations is > > > > marked multipleReferencesAllowed=false, but it has multiple > > > > references. These will be serialized in duplicate. Message count > > > > indicates messages skipped to avoid potential flooding. Set FINE > > > > logging level for stacktrace. 
> > > > 22 Sep 2017 14:00:03 INFO FileTreeXmiWriter - Finished Writing
> > > >
> > > > Sean, we have also made some code changes in MeasurementFSM.java to identify certain measurements like '20 mg/m2', which were not identified out of the box. Should we send the code changes to you so that you can consider them for inclusion in the product? Please advise.
> > > >
> > > > Regards,
> > > > Gandhi
> > > >
> > > > -----Original Message-----
> > > > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> > > > Sent: Friday, September 22, 2017 6:54 PM
> > > > To: dev@ctakes.apache.org
> > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]
> > > >
> > > > Hi Gandhi,
> > > >
> > > > You don't need to add BackwardsTimeAnnotator to your piper. It is added by the TemporalSubPipe.piper. The error that you are seeing regarding training is very strange, but you can try adding this line to the top of the file:
> > > > set isTraining=false
> > > >
> > > > Can you run a sample file with your piper and send me the log statements? It might help me figure out what is going on.
> > > >
> > > > > > > > is there any doc or guide on how to start writing our own annotator.
> > > > There are two example annotators in the ctakes-examples project under the ae/ directory.
> > > > You can look at those, but I recommend that you look at some information on Uimafit, which can be used to create new annotators:
> > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__uima.apache.org_d_uimafit-2D2.1.0_tools.uimafit.book.pdf&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=OlZ5SUTgU94HjHE8vZDkXv8hjaaa9qEpAlfZjU52Ymk&s=0rIPMY5osSxL4J9gMymmv0bHsBXimd0yb1FmUp4uT-A&e=
> > > > An introduction to creating Analysis Engines (Annotators) is on page 5.
> > > >
> > > > Coding style is individualistic, but below is a rubberstamp that I use to get started:
> > > >
> > > > import org.apache.ctakes.core.pipeline.PipeBitInfo;
> > > > import org.apache.log4j.Logger;
> > > > import org.apache.uima.UimaContext;
> > > > import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
> > > > import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
> > > > import org.apache.uima.jcas.JCas;
> > > > import org.apache.uima.resource.ResourceInitializationException;
> > > >
> > > > /**
> > > >  * @author SPF , chip-nlp
> > > >  * @version %I%
> > > >  * @since 9/22/2017
> > > >  */
> > > > @PipeBitInfo(
> > > >       name = "Template",
> > > >       description = "For Example.",
> > > >       role = PipeBitInfo.Role.ANNOTATOR
> > > > )
> > > > final public class Template extends JCasAnnotator_ImplBase {
> > > >
> > > >    static private final Logger LOGGER = Logger.getLogger( "Template" );
> > > >
> > > >    /**
> > > >     * {@inheritDoc}
> > > >     */
> > > >    @Override
> > > >    public void initialize( final UimaContext context ) throws ResourceInitializationException {
> > > >       // Always call the super first
> > > >       super.initialize( context );
> > > >       // place AE initialization code here
> > > >    }
> > > >
> > > >    /**
> > > >     * {@inheritDoc}
> > > >     */
> > > >    @Override
> > > >    public void process( final JCas jCas ) throws AnalysisEngineProcessException {
> > > >       LOGGER.info( "Processing ..." );
> > > >       // Place AE processing code here
> > > >       LOGGER.info( "Finished." );
> > > >    }
> > > > }
> > > >
> > > > If you use IntelliJ as your ide you can create a file template with these parameters:
> > > >
> > > > #if (${PACKAGE_NAME} && ${PACKAGE_NAME} != "")package ${PACKAGE_NAME};#end
> > > >
> > > > import org.apache.ctakes.core.pipeline.PipeBitInfo;
> > > > import org.apache.log4j.Logger;
> > > > import org.apache.uima.UimaContext;
> > > > import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
> > > > import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
> > > > import org.apache.uima.jcas.JCas;
> > > > import org.apache.uima.resource.ResourceInitializationException;
> > > >
> > > > #parse("File Header.java")
> > > > @PipeBitInfo(
> > > >       name = "${NAME}",
> > > >       #if ( ${PROJECT_NAME} != "")description = "For ${PROJECT_NAME}.",#end
> > > >       role = PipeBitInfo.Role.ANNOTATOR
> > > > )
> > > > final public class ${NAME} extends JCasAnnotator_ImplBase {
> > > >
> > > >    static private final Logger LOGGER = Logger.getLogger( "${NAME}" );
> > > >
> > > >    /**
> > > >     * {@inheritDoc}
> > > >     */
> > > >    @Override
> > > >    public void initialize( final UimaContext context ) throws ResourceInitializationException {
> > > >       // Always call the super first
> > > >       super.initialize( context );
> > > >       // place AE initialization code here
> > > >    }
> > > >
> > > >    /**
> > > >     * {@inheritDoc}
> > > >     */
> > > >    @Override
> > > >    public void process( final JCas jCas ) throws AnalysisEngineProcessException {
> > > >       LOGGER.info( "Processing ..." );
> > > >       // Place AE processing code here
> > > >       LOGGER.info( "Finished." );
> > > >    }
> > > > }
> > > >
> > > > -----Original Message-----
> > > > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.com]
> > > > Sent: Friday, September 22, 2017 2:23 AM
> > > > To: dev@ctakes.apache.org
> > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]
> > > >
> > > > Hi Sean,
> > > >
> > > > Thanks again for the detailed response.
> > > >
> > > > I still couldn't manage to get superscript-1 co-reference in piper GUI. Also I'm not able to use "BackwardsTimeAnnotator" in piper GUI as it gives me the below error:
> > > >
> > > > org.apache.uima.resource.ResourceInitializationException: Initialization of annotator class "org.apache.ctakes.temporal.ae.BackwardsTimeAnnotator" failed. (Descriptor: <unknown>)
> > > >    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:271)
> > > >    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:170)
> > > > Caused by: java.lang.IllegalArgumentException: Please specify PARAM_IS_TRAINING - unable to infer it from context
> > > >    at org.cleartk.ml.CleartkAnnotator.initialize(CleartkAnnotator.java:109)
> > > >
> > > > Somewhere in old mails it's mentioned that it's because of missing dependencies so I tried adding ClearTkAnnotator with no luck yet.
> > > > My piper file is as follows:
> > > >
> > > > load AdvancedTokenizerPipeline.piper
> > > > add ContextDependentTokenizerAnnotator
> > > > add POSTagger
> > > > load ChunkerSubPipe.piper
> > > > load DictionarySubPipe.piper
> > > > add org.apache.ctakes.drugner.ae.DrugMentionAnnotator
> > > > load AttributeCleartkSubPipe.piper
> > > > load RelationSubPipe.piper
> > > > load TemporalSubPipe.piper
> > > > load CorefSubPipe.piper
> > > > add org.apache.ctakes.temporal.ae.BackwardsTimeAnnotator
> > > > add pretty.html.HtmlTextWriter
> > > > add FileTreeXmiWriter
> > > >
> > > > Any suggestion on this? Also, I'm using all the latest 4.0.1 cTAKES jars. Regarding the identification of names, I will dig deeper into what you have mentioned.
> > > >
> > > > Sorry to ask this, as you already mentioned that there are no detailed docs for cTAKES. But is there any doc or guide on how to start writing our own annotator if required? If not, is there any simple annotator that you would suggest we look into to get a better understanding of annotators so that we can proceed further? Thanks in advance.
> > > >
> > > > Regards,
> > > > Gandhi
> > > >
> > > > -----Original Message-----
> > > > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> > > > Sent: Thursday, September 21, 2017 7:59 AM
> > > > To: dev@ctakes.apache.org
> > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]
> > > >
> > > > Hi Gandhi,
> > > >
> > > > > > > > We guess we are missing out on something as we could not find co-references for "200mg". Should we add anymore piper for this?
> > > > The piper commands that I sent have everything needed to obtain coreferences. I use it regularly - it is what I used on your example sentence to get the coreferences that I mentioned.
> > > >
> > > > > > > > Also the change mentioned in the thread ...
> > > > That is a very old thread and I don't think that it applies to what you are trying to do.
> > > >
> > > > > > > > We also have a requirement to identify the patient names and sex
> > > > As James said, ctakes isn't really meant to do this. Ctakes is geared toward extracting clinical data, and to this point names have not fallen into that category. It is more a task for general nlp. There is an opennlp model that can identify names and a few others (I used to see names using GATE). ctakes has wrapped opennlp for other tasks and you should be able to do the same to adapt an engine for names into ctakes.
> > > >
> > > > > > > > cTAKES is unable to identify the dates like 20Aug02 or 20/Aug/02 or 06 / 07 / 02 or 27Aug2002
> > > > As Chen mentioned, the BackwardsTimeAnnotator module uses an ML model trained on gold data. It isn't perfect. You can add another time annotator on top of this to get some of the more simply formatted date mentions - there are a lot of them out there. Personally I have used jchronic as it can be easily tweaked to recognize medically-relevant temporal expressions relating to surgery, pharmacology, etc.
> > > >
> > > > Sean
> > > >
> > > > -----Original Message-----
> > > > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> > > > Sent: Wednesday, September 20, 2017 8:50 AM
> > > > To: dev@ctakes.apache.org
> > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]
> > > >
> > > > Hi Gandhi,
> > > >
> > > > I don't have time to go through all of this right now, but I will try to get to it soon.
> > > >
> > > > Make sure that you are running the latest version in trunk.
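
On the suggestion in this thread to layer a simple time annotator on top of BackwardsTimeAnnotator for plainly formatted dates: the following is only a minimal sketch in plain Java, assuming a regular expression is good enough for the four formats asked about (20Aug02, 20/Aug/02, 06 / 07 / 02, 27Aug2002). The class name SimpleDateFinder and its methods are hypothetical; this is neither jchronic nor any cTAKES class. It only finds character spans and does not validate or normalize the dates. The offsets it returns could, presumably, be used inside the process( JCas ) method of an annotator built from the template in this thread to create time annotations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of cTAKES: finds simply formatted dates with a regex. */
final class SimpleDateFinder {

    // Month name or three-letter abbreviation (matched case-insensitively, see flag below).
    private static final String MONTH =
            "(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*";
    // Optional '/' or '-' separator with optional surrounding whitespace.
    private static final String SEP = "\\s*[/-]?\\s*";
    // Four-digit or two-digit year.
    private static final String YEAR = "(?:\\d{4}|\\d{2})";

    private static final Pattern DATE = Pattern.compile(
            // day, month name, year: 20Aug02, 20/Aug/02, 27Aug2002
            "\\b\\d{1,2}" + SEP + MONTH + SEP + YEAR + "\\b"
            // all numeric with slashes: 06 / 07 / 02
            + "|\\b\\d{1,2}\\s*/\\s*\\d{1,2}\\s*/\\s*" + YEAR + "\\b",
            Pattern.CASE_INSENSITIVE );

    /** Returns {begin, end} character offsets of each date-like span in the text. */
    static List<int[]> findDateSpans( final String text ) {
        final List<int[]> spans = new ArrayList<>();
        final Matcher matcher = DATE.matcher( text );
        while ( matcher.find() ) {
            spans.add( new int[]{ matcher.start(), matcher.end() } );
        }
        return spans;
    }

    /** Returns the covered text of each date-like span. */
    static List<String> findDates( final String text ) {
        final List<String> dates = new ArrayList<>();
        for ( int[] span : findDateSpans( text ) ) {
            dates.add( text.substring( span[ 0 ], span[ 1 ] ) );
        }
        return dates;
    }

    public static void main( final String[] args ) {
        // prints [20Aug02]
        System.out.println( findDates( "On 20Aug02, the investigator noted that this patient was suffering" ) );
    }
}
```

A regex was chosen here only because it is self-contained. It cannot tell day-first from month-first order in "06 / 07 / 02", and it will accept impossible values such as day 99, so a real annotator would need a validation and normalization step.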
> > > >
> > > > Sean
> > > >
> > > > -----Original Message-----
> > > > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.com]
> > > > Sent: Wednesday, September 20, 2017 7:03 AM
> > > > To: dev@ctakes.apache.org
> > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL]
> > > >
> > > > Hi, Could someone help me out on the below queries please?
> > > >
> > > > Regards,
> > > > Gandhi
> > > >
> > > > -----Original Message-----
> > > > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.com]
> > > > Sent: Tuesday, September 19, 2017 8:51 PM
> > > > To: dev@ctakes.apache.org
> > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL]
> > > >
> > > > Hi Sean,
> > > >
> > > > Thanks again for the detailed and prompt response. We were able to run the piper GUI as per your advice. But in the output (The patient started study treatment of Thalomid 200mg ( days 1 - 21 ) , and Epirubicin ,20 mg / m2 ( days 1 , 8 , and 15 ) on 06 / 07 / 02 for the treatment of hepatocellular carcinoma.), we were not able to find superscript-1 as you mentioned earlier but could find superscript-2, 3 etc. We guess we are missing out on something as we could not find co-references for "200mg". Should we add anymore piper for this?
> > > >
> > > > Also the change mentioned in the thread - https://urldefense.proofpoint.com/v2/url?u=http-3A__mail-2Darchives.apache.org_mod-5Fmbox_ctakes-2Duser_201403.mbox_-253CCAL6WimrJ-5Fmm1-2BXyggBZv62diYuWP0ScA9VEV8mNHGWe4hSNHQg-40mail.gmail.com-253E&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=JoUDRZHu91gGMslwknPzTQC_UG2LEBLyOfXR3ikwOL0&s=GzhvIkBu4cgyzYN9n6VLe2rz4sJhJzMxDcWyB0BkqAc&e= is required for the drug-ner module to identify drug-ner annotations.
> > > >
> > > > 1) We also have a requirement to identify the patient names and sex available in narrative texts. Please let us know how to achieve this, as it is not identifying the proper nouns and their relationship with the patient.
> > > > Eg. "This male patient named Tom Hardy aged 35 years is participating in a Non-IND study"
> > > >
> > > > 2) cTAKES is unable to identify the dates like 20Aug02 or 20/Aug/02 or 06 / 07 / 02 or 27Aug2002 as in the below example. Please let us know how to enhance the system to identify such date patterns.
> > > > E.g. "On 20Aug02, the investigator noted that this patient was suffering worsening fatigue and got tired getting out of his chair"
> > > >
> > > > Regards,
> > > > Gandhi
> > > >
> > > > -----Original Message-----
> > > > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> > > > Sent: Monday, September 18, 2017 10:02 PM
> > > > To: dev@ctakes.apache.org
> > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL]
> > > >
> > > > Hi Gandhi,
> > > >
> > > > > > > > So in this case will be able to see drug attributes in the output XML?
> > > > As long as you have the DrugMentionAnnotator in your pipeline you should be able to find drug attributes in the xml output file.
> > > >
> > > > > > > > we also saw some code changes needs to be done to use drug-ner module. Is it still valid?
> > > > As far as I know there aren't any necessary code changes to get drug ner running. However, I do not normally use drugner so I can't say for certain.
> > > >
> > > > > > > > Also you mentioned that the drug-ner module is out of date
> > > > It can still be used and will produce annotations. All that I meant was that there may not be many people out there using it. It is not part of the default pipeline.
> > > >
> > > > > You also mentioned that when you run the sentence, the date was identified. Where and how exactly did you run it so that we can check the same?
> > > > I run the following in a piper file because I am interested in a lot of modules (I added drugner just for you):
> > > >
> > > > // Advanced Tokenization: Regex sectionization, BIO Sentence Detector (lumper), Paragraphs, Lists
> > > > load AdvancedTokenizerPipeline.piper
> > > > add ContextDependentTokenizerAnnotator
> > > > add POSTagger
> > > > // Chunkers
> > > > load ChunkerSubPipe.piper
> > > > // Default fast dictionary lookup
> > > > load DictionarySubPipe.piper
> > > > add org.apache.ctakes.drugner.ae.DrugMentionAnnotator
> > > > // Cleartk Entity Attributes
> > > > load AttributeCleartkSubPipe.piper
> > > > // Relations
> > > > load RelationSubPipe.piper
> > > > // Temporal
> > > > load TemporalSubPipe.piper
> > > > // Coreferences
> > > > load CorefSubPipe.piper
> > > > // Html output
> > > > add pretty.html.HtmlTextWriter
> > > >
> > > > For information on piper files, see https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Piper-2BFiles&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=JoUDRZHu91gGMslwknPzTQC_UG2LEBLyOfXR3ikwOL0&s=9ueuHYwEywok8byBXEkVjmTWiChmaIY3ryB4Pi6ajRo&e=
> > > > I run it in my IDE with:
> > > > org.apache.ctakes.core.pipeline.PiperFileRunner -Xmx3G -p <FileAsAbove>.piper -i org/apache/ctakes/examples/notes -o <OutputDir> --user <MyUmlsUser> --pass <MyUmlsPass>
> > > > You can run it by command line by substituting "org.apache.ctakes.core.pipeline.PiperFileRunner -Xmx3G" with "bin/runPiperFile".
> > > > You can also run it through a ctakes 4.01 (trunk) gui.
> > > > See https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Piper-2BFile-2BSubmitter-2BGUI&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=JoUDRZHu91gGMslwknPzTQC_UG2LEBLyOfXR3ikwOL0&s=VWIrXrfA2dZ8KHOdoizJo-nTx7nPSy4GDOZ7IxQteIQ&e=
> > > >
> > > > > > > > I'm not able to see any clickable option in HTML output
> > > > You must have the HtmlTextWriter at the end of your pipeline to produce html files. To keep the xml file output, place "add FileTreeXmiWriter" at the end of the piper.
> > > >
> > > > > > > > Apologies for too many
> > > > No worries, we are happy to have your interest!
> > > >
> > > > Sean
> > > >
> > > > -----Original Message-----
> > > > From: Gandhi Rajan Natarajan [mailto:gandhi.natara...@arisglobal.com]
> > > > Sent: Saturday, September 16, 2017 7:01 AM
> > > > To: dev@ctakes.apache.org
> > > > Subject: RE: Enabling drugner pipeline and identifying dates [EXTERNAL]
> > > >
> > > > Hi Sean,
> > > >
> > > > Thanks again for the prompt response. Appreciate your input on adding DrugMentionAnnotator. Actually, we are relying on pretty printer output just to understand the analysis.
> > > > Our logic to extract disorders and findings is based on the XML file generated by https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_healthnlp_examples_blob_master_ctakes-2Dtemporal-2Ddemo_src_main_java_org_apache_ctakes_web_client_servlet_DemoServlet.java&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=_MJKBj93YJdd5aa84dBvqtg6o-BKBn7UcbfF660CEBI&s=g8UzBHRoOyn1hoRABKSC6EtPMvwOSSggviRmWCHKti4&e=
> > > > So in this case will we be able to see drug attributes in the output XML?
> > > >
> > > > In one of the old posts (https://urldefense.proofpoint.com/v2/url?u=http-3A__mail-2Darchives.apache.org_mod-5Fmbox_ctakes-2Duser_201403.mbox_-253CCAL6WimrJ-5Fmm1-2BXyggBZv62diYuWP0ScA9VEV8mNHGWe4hSNHQg-40mail.gmail.com-253E&d=DwIFAg&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=_MJKBj93YJdd5aa84dBvqtg6o-BKBn7UcbfF660CEBI&s=iT_1UGR98APO80UaZsaCBHseMqF4M4PfItgokD27r5c&e=) we also saw that some code changes need to be done to use the drug-ner module. Is that still valid? Also, you mentioned that the drug-ner module is out of date. Does that mean it cannot be used, or that it may not provide accurate analysis? And what changes need to be made to bring it up to date, so that we can try that if you can assist?
> > > >
> > > > You also mentioned that when you run the sentence, the date was identified. Where and how exactly did you run it so that we can check the same?
> > > > Also, regarding your explanation of coreference, I'm not able to see any clickable option in the HTML output. So I wanted to understand how we can run and check that too.
> > > >
> > > > Apologies for too many questions, as we are just a week old in NLP and cTAKES. Thanks in advance.
> > > >
> > > > Regards,
> > > > Gandhi
> > > >
> > > > This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender or system manager by email immediately if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited and against the law.