RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]

2017-10-03 Thread Gandhi Rajan Natarajan
Hi James, Thanks for the response. As you said, it's definitely not a showstopper. We encountered this measurement in the narratives we were testing and thought of fixing this. That’s the whole idea. Also as per the code, 'fslashCondition' added before 2nd token should avoid false positives is wh

RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]

2017-10-03 Thread Gandhi Rajan Natarajan
Hi Sean, Completely agree with you on this. Thanks for your support. Regards, Gandhi -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Tuesday, October 03, 2017 9:56 PM To: dev@ctakes.apache.org Subject: RE: Enabling drugner pipeline and identifying dat

RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]

2017-10-03 Thread Gandhi Rajan Natarajan
Thanks for the update Sean. Please keep us posted so that we can test the same once your fix is ready. Regards, Gandhi -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Tuesday, October 03, 2017 10:04 PM To: dev@ctakes.apache.org Subject: RE: Enabling

building cTAKES (discussion transferred from CTAKES-445)

2017-10-03 Thread James Masanz
A question was asked within JIRA issue CTAKES-445 about building cTAKES that is more general than the topic of CTAKES-445, so I'm transferring it to this mailing list. It started with the following question: how someone is able to provide complet

Re: Missing resources for script that extracts markables from a corpus for analysis [EXTERNAL]

2017-10-03 Thread Alexandru Zbarcea
Hi Tim, That's great news. If you think there are sample notes that can be used, I can start working on the Lucene index and slowly build the UTest for them. I have created CTAKES-462[1] where we can track this work. Looking into the ctakes-examples-res, what I can find is: $ find . -type f | gr

RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]

2017-10-03 Thread Finan, Sean
Excellent, thanks -Original Message- From: James Masanz [mailto:masanz.ja...@gmail.com] Sent: Tuesday, October 03, 2017 12:35 PM To: dev@ctakes.apache.org Subject: Re: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS] FWIW, I started taking a look at the patch. (It

Re: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]

2017-10-03 Thread James Masanz
FWIW, I started taking a look at the patch. (It's in code that I'm not that familiar with, so a quick glance isn't sufficient for me.) I did a search in UMLS for m2 in the terminologies commonly used by cTAKES to see if adding m2 could result in marking something as a measurement when it's not - an

RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]

2017-10-03 Thread Finan, Sean
Hi Gandhi, I have one discovery pertaining to the coref items so far. Your first coreference (#1) is not appearing in the html because it consists only of a "generic" item: "this patient". Coreference: This patient , This patient , This patient , this patient , this patient , this patient , this

RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]

2017-10-03 Thread Finan, Sean
Hi Gandhi, Ctakes is a purely volunteer effort, so there are never any guarantees ... If nobody looks at the value and unit jira and patch this week then I will try to get to it asap. Thanks for letting us use your example note! Sean -Original Message- From: Gandhi Rajan Natarajan [mai

RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]

2017-10-03 Thread Gandhi Rajan Natarajan
Hi Sean, Will this JIRA issue - https://issues.apache.org/jira/browse/CTAKES-459 be looked up by someone as Tim mentioned? The paragraph we sent earlier can be in the example notes provided the protocol number is masked/modified. Regards, Gandhi -Original Message- From: Finan, Sean

Re: NPE after upgrade in DefaultJCASTermAnnotator [EXTERNAL]

2017-10-03 Thread Jeff Headley
That's great Sean. Thanks for all the help. On Tue, Oct 3, 2017 at 9:37 AM, Finan, Sean < sean.fi...@childrens.harvard.edu> wrote: > You can find all kinds of background information on the web with a search > like "nlp tokenization". You can look at > org.apache.ctakes.gui.dictionary.util.TextT

RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS] [SUSPICIOUS] [SUSPICIOUS]

2017-10-03 Thread Finan, Sean
Thanks Tim! I was looking for that one but couldn't find it. -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] Sent: Tuesday, October 03, 2017 10:03 AM To: dev@ctakes.apache.org Subject: Re: Enabling drugner pipeline and identifying dates [EXTERNAL]

Re: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS] [SUSPICIOUS]

2017-10-03 Thread Alexandru Zbarcea
This is very informative. Thank you Tim Alex On Oct 3, 2017 10:06, "Miller, Timothy" < timothy.mil...@childrens.harvard.edu> wrote: > Here's the most recent publication, which describes the system in > ctakes 4.0 and later: > http://www.sciencedirect.com/science/article/pii/S1532046417300850 > T

Re: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS] [SUSPICIOUS]

2017-10-03 Thread Miller, Timothy
Here's the most recent publication, which describes the system in ctakes 4.0 and later: http://www.sciencedirect.com/science/article/pii/S1532046417300850 Tim On Tue, 2017-10-03 at 13:52 +, Finan, Sean wrote: > > > > With the changes in Input, the co-reference between all the > > entities sho

RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]

2017-10-03 Thread Finan, Sean
Hi Gandhi, Thank you for asking. There is no action item for you concerning the coreference output that you see. However, if you would like to help the community understand how the module works (input and output), maybe you could do something like run the pipeline on your original sentence,

RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS]

2017-10-03 Thread Finan, Sean
> With the changes in Input, the co-reference between all the entities should > still be preserved right? No. One of the experts can better explain this, but the coreference module works with "best match" chains. With one sentence of text, term (Markable) A may have a best match with term B.
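As a toy illustration of the "best match" idea described above (not the cTAKES coreference model, whose scoring is learned), the sketch below links each markable to its highest-scoring earlier markable and reads chains off those links. The scores, the threshold, the markable strings, and the function name are all invented for the example. It only shows why adding text can reroute a link and change which chain a term ends up in.

```python
from typing import Dict, List, Tuple


def best_match_chains(
    markables: List[str],
    scores: Dict[Tuple[str, str], float],
    threshold: float = 0.5,
) -> List[List[str]]:
    """Link each markable to its single best earlier match, then group links into chains.

    Hypothetical helper: scores maps (antecedent, markable) pairs to made-up similarity values.
    """
    parent: Dict[str, str] = {}
    for j, markable in enumerate(markables):
        best, best_score = None, threshold
        # Only earlier markables are candidate antecedents.
        for antecedent in markables[:j]:
            score = scores.get((antecedent, markable), 0.0)
            if score > best_score:
                best, best_score = antecedent, score
        if best is not None:
            parent[markable] = best

    # Follow links back to a root and group markables by that root.
    chains: Dict[str, List[str]] = {}
    for markable in markables:
        root = markable
        while root in parent:
            root = parent[root]
        chains.setdefault(root, []).append(markable)
    # A chain needs at least two members to be a coreference.
    return [chain for chain in chains.values() if len(chain) > 1]


# Short input: B's best match is A, so they share a chain.
short_chains = best_match_chains(["A", "B"], {("A", "B"): 0.8})

# Longer input: a new markable C scores higher against B, so B's link moves to C.
long_chains = best_match_chains(
    ["A", "C", "B"], {("A", "B"): 0.8, ("C", "B"): 0.9}
)
print(short_chains)
print(long_chains)
```

In this toy setup, A and B are chained in the short input, but once C is present B attaches to C instead and A is left without a chain.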

RE: NPE after upgrade in DefaultJCASTermAnnotator [EXTERNAL]

2017-10-03 Thread Finan, Sean
You can find all kinds of background information on the web with a search like "nlp tokenization". You can look at org.apache.ctakes.gui.dictionary.util.TextTokenizer in the ctakes-gui module to see how the dictionary creator does it. You can run .getTokenizedText( text ) to get a tokenized s

Re: NPE after upgrade in DefaultJCASTermAnnotator [EXTERNAL]

2017-10-03 Thread Jeff Headley
Thanks Sean. Not quite, sorry for the confusion. We keep the default dictionary hsqldb. We just empty the CUI_TERMS, RXNORM, PREFTERM, and TUI tables and move over data from a sql server db. I don't seem to recall doing anything with a tcount column. I'll have to check our code tonight. That could

RE: NPE after upgrade in DefaultJCASTermAnnotator [EXTERNAL]

2017-10-03 Thread Finan, Sean
Ok, let me see if I understand your current setup: Ctakes 4.0 fast lookup, Dictionary configuration file points to an sql server, Sql server uses cui_terms (cui, rword, rindex, tcount, text) and perhaps other secondary tables ... Now that I write out the column names, I have a thought. Is it p
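For readers following the column list above, here is a rough sketch of how one row with those five columns might be derived from a dictionary term. It assumes that `tcount` is the number of tokens in the term, that `rword` is a single token used to index the term, and that `rindex` is that token's position. Picking the longest token is a stand-in for illustration only, not how cTAKES chooses it. The function name, the whitespace tokenization, and the placeholder CUI are hypothetical.

```python
from typing import NamedTuple


class CuiTermRow(NamedTuple):
    """One row shaped like the cui_terms columns named in the thread."""
    cui: str      # kept as a string here for simplicity
    rword: str    # assumed: the token used to index the term
    rindex: int   # assumed: position of rword within the tokens
    tcount: int   # assumed: total number of tokens in the term
    text: str     # the tokenized term text


def make_row(cui: str, term: str) -> CuiTermRow:
    """Build a row from a term; hypothetical helper, not a cTAKES API."""
    tokens = term.lower().split()
    if not tokens:
        raise ValueError("term has no tokens")
    # Stand-in choice of index word: the longest token (first one on ties).
    rindex = max(range(len(tokens)), key=lambda i: len(tokens[i]))
    return CuiTermRow(cui, tokens[rindex], rindex, len(tokens), " ".join(tokens))


# "C0000000" is a placeholder identifier, not a real CUI.
row = make_row("C0000000", "Hepatocellular carcinoma")
print(row)
```

Under these assumptions, a loader that copies rows in from another database would need to fill every column, including `tcount`, rather than leave any of them empty.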

Re: NPE after upgrade in DefaultJCASTermAnnotator [EXTERNAL]

2017-10-03 Thread Jeff Headley
I updated our pom to use the same hsqldb version as what I saw in the ctakes lib folder. The data coming in is from a SQL Server database. On Tue, Oct 3, 2017 at 8:45 AM, Finan, Sean < sean.fi...@childrens.harvard.edu> wrote: > Hi Jeff, > > I don't think that a custom dictionary should cause a nu

RE: NPE after upgrade in DefaultJCASTermAnnotator [EXTERNAL]

2017-10-03 Thread Finan, Sean
Hi Jeff, I don't think that a custom dictionary should cause a null pointer exception on that line unless you have an odd null character in text or something of that ilk. One thing that changed in ctakes 4.0 is the version of hsqldb that is being used for the dictionary database. I don’t know

RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS]

2017-10-03 Thread Gandhi Rajan Natarajan
Hi Tim/Sean, Is this an action item for us? If yes, could someone give us some valid inputs to test the same? Is someone else going to review this again? Regards, Gandhi -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] Sent: Monday, October 02, 2017

RE: Enabling drugner pipeline and identifying dates [EXTERNAL] [SUSPICIOUS] [SUSPICIOUS]

2017-10-03 Thread Gandhi Rajan Natarajan
Hi Sean, I still have some doubts on this. If I run the piper file with the complete text I sent earlier, I could see only superscript - 4 for Thalomid and the co-reference of this to "treatment of hepatocellular carcinoma" is still lost. Also I don’t see any superscript with number-1 too. With