RE: How to add a new dictionary database to cTAKES

2014-02-28 Thread Finan, Sean
Hi Abhishek, You have some interesting timing ... I can give you the xml specifications that you require if you send me the format of your dictionary. Since you are new to the current dictionary module setup, I might also have a simpler solution for you ... A couple of days ago I checked a new

RE: getSeverity etc. for relation extractor

2014-03-20 Thread Finan, Sean
> 1) Should we populate IdentifiedAnnotation.severity() and bodylocationof() > Directly in RelationExtractorAnnotator instead of the template filler? One minor issue might be the fact that multiple relations of the same type can (and most likely will be) created for a single Identified Anno

RE: getSeverity etc. for relation extractor

2014-03-21 Thread Finan, Sean
> until we have a definite, well-defined need (from a user). "Rash on arm and leg" > I don't follow what you mean by your item B) below [Rash].getLocationRelation() > [Rash : Arm] [Rash].getLocation() > [Arm] -Original Message- From: Masanz, James J. [mailto:masanz.ja...@mayo.edu] S

RE: getSeverity etc. for relation extractor

2014-03-21 Thread Finan, Sean
what it was in cTAKES 3.1 and find out if this is a bug in TemplateFillerAnnotator or something else. -- James -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Friday, March 21, 2014 12:30 PM To: dev@ctakes.apache.org Subject: RE: getSeverity etc. for relation ex

RE: getSeverity etc. for relation extractor

2014-03-24 Thread Finan, Sean
leg" and I get just one location_of relation. And again no location_of relations for "rash on arm and leg" Sean, what was the exact phrase you used with the incubator version? (or was that a while ago and lost) -----Original Message- From: Finan, Sean [mailto:sean.fi...@

RE: "Temporal Information Extraction" package has compile time error

2014-03-27 Thread Finan, Sean
Hi Manu, Speaking for the developers of that module, we are excited that you and others in the community are starting to show so much interest in temporal information extraction - enough to attempt builds and trial runs. The Temporal module is still in an "academic" experimental phase and there

RE: suggestion for default pipelines

2014-04-15 Thread Finan, Sean
+1 I think that a factory is a great idea. I (personally) dislike the descriptor schema, but I think that deprecation is the way to go until a replacement comes along. -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] Sent: Tuesday, April 15, 201

RE: errors when run BagOfCUIsGenerator.java

2014-04-16 Thread Finan, Sean
Try to open https://uts-ws.nlm.nih.gov If that works then try https://uts-ws.nlm.nih.gov/restful/isValidctakes.umlsuser and see if you get a message like "This XML file does not appear to have any style information associated with it. The document tree is shown below." If that works and you

RE: lvg entries

2014-04-17 Thread Finan, Sean
Those variants are not used by the dictionary lookup. I did look at them to see if it was worthwhile for the new dictionary, but they are all over the place so I passed. From: Miller, Timothy [timothy.mil...@childrens.harvard.edu] Sent: Thursday, April

RE: lvg entries

2014-04-18 Thread Finan, Sean
+1 false -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] Sent: Friday, April 18, 2014 2:54 PM To: dev@ctakes.apache.org Subject: Re: lvg entries Thanks for tracking that down Andy. I am making a pass at UimaFit-izing the configuration parameters fo

RE: new dictionary lookup {was RE: lvg entries]

2014-04-22 Thread Finan, Sean
rser) > > -----Original Message- > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] > Sent: Thursday, April 17, 2014 12:52 PM > To: dev@ctakes.apache.org > Subject: RE: lvg entries > > Those variants are not used by the dictionary lookup. I did look at > th

RE: ytex merged into trunk

2014-04-28 Thread Finan, Sean
Hi Vijay, I did a checkout this morning and I'm getting compile errors from Maven. If I just run mvn compile then I get an error while building ytex claiming that the package has not been created. Is there a reversed dependency? If I run mvn compile package then ytex seems to run through, but

RE: ytex merged into trunk

2014-04-28 Thread Finan, Sean
estigate. > > -vj > > > On Mon, Apr 28, 2014 at 11:00 AM, Finan, Sean < > sean.fi...@childrens.harvard.edu> wrote: > >> Hi Vijay, >> >> I did a checkout this morning and I'm getting compile errors from Maven. >> >> If I just run mvn compile

RE: Explict version numbers instead of ranges in pom.xml

2014-05-02 Thread Finan, Sean
+1 > so I was planning to update -Original Message- From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu] Sent: Friday, May 02, 2014 12:27 PM To: dev@ctakes.apache.org Subject: Explict version numbers instead of ranges in pom.xml Hi, Are there any opposition to using explicitly depend

RE: Preparing for an Apache cTAKES 3.2 Release?

2014-06-11 Thread Finan, Sean
>. The newer NER should have in its name the Behavior... I agree, but the *2 module is a complete replacement for the current lookup. It does not (really) have any different behavior, just a different implementation and performance. We plan to swap out the old with the new in the next release

RE: Preparing for an Apache cTAKES 3.2 Release?

2014-06-11 Thread Finan, Sean
wsgroup. On Wed, Jun 11, 2014 at 9:21 AM, Finan, Sean < sean.fi...@childrens.harvard.edu> wrote: > >. The newer NER should have in its name the Behavior... > > I agree, but the *2 module is a complete replacement for the current > lookup. It does not (really) have any

RE: Preparing for an Apache cTAKES 3.2 Release?

2014-06-16 Thread Finan, Sean
pful to >>>> have thorough documentation on the dictionary lookup, how to >>>> configure it, and how to create new dictionaries. I would venture >>>> to say that this is the most important component in cTAKES, and >>>> probably the one that has generated the most q

RE: DeepPheno: guidance on CTakes

2014-06-27 Thread Finan, Sean
Hi Pei, Nice examples. The pipeline builder could be simpler (divvied), but they shouldn't leave anybody confused. +1 for the uimafit annotations! -Original Message- From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu] Sent: Friday, June 27, 2014 11:11 AM To: Hochheiser, Harry Stew

RE: Bacterium Dictionary

2014-06-30 Thread Finan, Sean
Hi Nick, There are ~26,000 T007 Bacterium (falls under Living Being) entries in UMLS 2013aa. They aren't in the cTakes dictionary, but you can build a separate bacteria dictionary using the dictionary creator tool in cTakes sandbox. It can create dictionaries formatted for use with both availa

RE: Bacterium Dictionary

2014-06-30 Thread Finan, Sean
as wondering how to utilize Ctakes to use that library. It will be > great if > there were some documents on building a separate dictionary using the > dictionary creator. > > > Thanks again, > Nick > > -Original Message- > From: Finan, Sean [mailto:sean.fi

RE: [VOTE] Release Apache cTAKES 3.2.0

2014-07-02 Thread Finan, Sean
+1 Pulled fresh candidate, built, and ran Clinical using CPE without problem. Other than that, no testing. SVN gave me a problem initially (checked out as anonymous) asking for a password then flunking the checkout, but an update completed it. I blame the heat. __

RE: [VOTE] Release Apache cTAKES 3.2.0 (rc2)

2014-07-10 Thread Finan, Sean
+1 for the ytex method of handling a umls login before download of the umls resources. While this also doesn't truly prevent people from sharing files (data) without a umls account, it is a little bit of a nicer mechanism. Aside ... Does anybody out there have experience with izpack? (izpack.

RE: Lucene for UMLS2014

2014-07-21 Thread Finan, Sean
Hi Harpreet, If you are willing to use cTakes 3.2, try the dictionary-lookup-fast module as a replacement of the default dictionary-lookup. That module has a new dictionary resource (hsql, not lucene) and slightly different methods for lookup and matching. In time trials it has been faster th

RE: question about sentence segmentation

2014-08-02 Thread Finan, Sean
Hi Tim, > It would be preferable to me to put sentence breaks in between the sections, > so > the first two sentences would be: > > 1) PE: Lymphonodes... > 2) Lungs: normal... The punctuation is (always) after the logical break, being "Term: " for a Term:Definition list. I think that the firs

RE: code value for vocabulary in dic-lookup-fast

2014-08-06 Thread Finan, Sean
Hi Harpreet, I don't know if this has yet been answered (I'm still finding vacation-time emails), but the Snomed-ct, Rx-norm, etc. codes were removed from the -fast dictionary for speed. Basically, any single UMLS Cui can have multiple different snomed-ct codes (for instance), and adding extra

RE: v_snomed_fword_lookup view

2014-08-08 Thread Finan, Sean
Hi Clayton, I don't know how the ytex dictionary lookup works, so I'm afraid that I can't help you with an answer. Maybe Vijay is the best person to do this. If you aren't tied to ytex you could try the new cTakes dictionary-lookup-fast. I tested "Patient came in with a malar rash" and it fo

RE: v_snomed_fword_lookup view

2014-08-11 Thread Finan, Sean
> > > > How exactly do you switch to using the cTakes dictionary-lookup-fast. > > Do I need to go in and alter xml files or is it as simple as adding a > > certain item to the list of analysis engines? > > > > > > On Fri, Aug 8, 2014 at 3:48 PM, Finan, Sean <

RE: v_snomed_fword_lookup view

2014-08-11 Thread Finan, Sean
.apache.uima.resource.ResourceInitializationException: > > Could > > >> not > > >> access the resource data at > > >> > > >> > > file:org\apache\ctakes\dictionary\lookup2\Snomed2011ab_ctakesTui\cTake > > sSnomed.xml > > >> > > >

Youtube Channel "Apache cTakes"

2014-08-12 Thread Finan, Sean
cTakes now has a youtube channel named "Apache cTakes". It is empty, but if you have ever made a training video, presentation on a component (descriptors, type system, etc.), or demo of integration with another system (UimaFit, Uima-AS, etc.) then please feel free to post on that channel. When

RE: v_snomed_fword_lookup view

2014-08-13 Thread Finan, Sean
the dbconsumer analysis engine from ytex (for storing into the > database with regard to analysis batch). > > Any tips for exporting or some simple issue I'm missing? > > Thanks, > Clayton > > > On Mon, Aug 11, 2014 at 2:09 PM, Harpreet Khanduja >

RE: v_snomed_fword_lookup view

2014-08-13 Thread Finan, Sean
> If so, then I think I need to look into making/using one of those. > > > On Wed, Aug 13, 2014 at 1:41 PM, Finan, Sean < > sean.fi...@childrens.harvard.edu> wrote: > > > Hi Clayton, > > > > I'm glad that you got it working. Though I stated that I would, I

RE: v_snomed_fword_lookup view

2014-08-13 Thread Finan, Sean
iar with that process. > > Tim > > > On 08/13/2014 02:11 PM, Clayton Turner wrote: > > Oh okay, so is the purpose of a CasConsumer to essentially save your > > data in a representation that you can do some kind of data mining or > > classification on it? If so, the

RE: Web server

2014-08-21 Thread Finan, Sean
Hi John, Have you (or another) thought about modifying the Uima Simple Server to run a cTakes pipeline? http://uima.apache.org/sandbox.html#simple-server > -Original Message- > From: John Green [mailto:john.travis.gr...@gmail.com] > Sent: Thursday, August 21, 2014 3:06 PM > To: dev@ctake

RE: Web server

2014-08-21 Thread Finan, Sean
le get requests with the xml ae for output built into > the > existing sandbox code, so I just wanted to hash that first before starting on > a > new thread. > > > > > Do you have experience with uima simple server? > > > > > JG > — >

RE: Permutations

2014-09-05 Thread Finan, Sean
Hi Kim, Pei, I don't think that I changed anything to which Kim is referring, just a couple of other things that happen to be in the same segment. From the attached it looks like Kim's change is to copy a list and sort the copy, while mine were moving the sort from an inner to outer loop. At
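
The two kinds of change mentioned in this message (sorting a copy of a list rather than the list itself, and moving a sort from an inner loop to an outer one) can be illustrated with a small sketch in plain Java. This is not the cTAKES lookup code or either patch; the class and method names are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not taken from the cTAKES source or from either patch.
final class SortPlacementSketch {

    // Sort a copy so that the caller's list keeps its original order.
    static List<Integer> sortedCopy(final List<Integer> original) {
        final List<Integer> copy = new ArrayList<>(original);
        Collections.sort(copy);
        return copy;
    }

    // Sort once, before the loop, instead of re-sorting for every query.
    static int countFound(final List<Integer> values, final List<Integer> queries) {
        final List<Integer> sorted = sortedCopy(values); // hoisted out of the loop
        int found = 0;
        for (final Integer query : queries) {
            // binarySearch requires a sorted list; a non-negative result means the value is present
            if (Collections.binarySearch(sorted, query) >= 0) {
                found++;
            }
        }
        return found;
    }
}
```

Sorting a copy costs one extra list allocation but leaves the input untouched, which matters when other code relies on the original order.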

RE: Permutations

2014-09-05 Thread Finan, Sean
M To: dev@ctakes.apache.org; Finan, Sean Subject: Re: Permutations Hi Pei and Sean, Sean, any thoughts about this would be helpful. We also had issues in cTAKES 2.5. Here is the patch for 2.5. Before I got the patch to 3.0 Sean made his changes. === modified file 'src/edu/mayo/bmi/loo

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Finan, Sean
Hi Nick, I think that the bottleneck is probably the lookup module itself. So, I just sent you a secure email/ftp link. It contains a build of the new dictionary-lookup-fast module. Should you choose to try it, let me know how things turn out. Sean F

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Finan, Sean
recoreds Hi Sean, Many thanks, I will try it tomorrow. Do you have any special instruction to run that script or do I have to use it with cTakes? Thanks, Nick -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Tuesday, September 09, 2014 4:24 PM To: dev

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Finan, Sean
, Nick Nikandish < snika...@emerginghealthit.com> wrote: > Great. I will do that. Thanks again. > > Nick > > -Original Message- > From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] > Sent: Tuesday, September 09, 2014 4:39 PM > To: dev@ctakes.apache.org >

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Finan, Sean
ndbox area? Sent from my iPhone > On Sep 9, 2014, at 5:24 PM, "Finan, Sean" > wrote: > > There is a tool to generate a dictionary in the new format using the UMLS > MR*** files. > > The module can also read directly from a file with bar-separated values: >

RE: Ctakes to process 5000K recoreds

2014-09-09 Thread Finan, Sean
reds (Trying to avoid passing individual jars via email) Sent from my iPhone > On Sep 9, 2014, at 5:26 PM, "Chen, Pei" > wrote: > > Sean- > Aren't the scripts to generate the DB already available in the sandbox area? > > Sent from my iPhone > >

RE: Ctakes to process 5000K records

2014-09-10 Thread Finan, Sean
--- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Tuesday, September 09, 2014 4:39 PM To: dev@ctakes.apache.org Subject: RE: Ctakes to process 5000K recoreds Just use it with cTakes. Instead of removing other modules from the pipeline, replace the dictionary-lookup with dictionar

RE: Ctakes to process 5000K records

2014-09-10 Thread Finan, Sean
t file:org/apache/ctakes/dictionary/fast/cTakesHsql.xml. Where should I add it to the classpath? Thanks, Nick -Original Message----- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Tuesday, September 09, 2014 4:39 PM To: dev@ctakes.apache.org Subject: RE: Ctakes to process 5000

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
Steve Bethard wrote: > I spent some time writing a script for diff-ing CASes I urge anyone interested in comparing cTakes CASes / output to use this type of approach. Comparison of program output is a post-process task, and unless absolutely necessary code to juggle data and metadata belongs th
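
A post-process comparison of the kind recommended here could look roughly like the following. This is a minimal sketch and not Steve Bethard's script: it assumes each annotation has already been exported as a string key, and the `begin-end|CUI` key format and the CUI values in the usage note are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; the "begin-end|CUI" key format is an assumption.
final class OutputDiffSketch {

    // Returns the keys present in 'first' but missing from 'second', in sorted order.
    // Using sets makes the comparison independent of the order in which a run emitted its output.
    static List<String> onlyInFirst(final Collection<String> first, final Collection<String> second) {
        final TreeSet<String> difference = new TreeSet<>(first);
        difference.removeAll(second);
        return new ArrayList<>(difference);
    }
}
```

For two runs `a = ["0-4|C0000001", "10-13|C0000002"]` and `b = ["10-13|C0000002", "0-4|C0000001", "20-25|C0000003"]`, `onlyInFirst(a, b)` is empty and `onlyInFirst(b, a)` contains only `20-25|C0000003`, regardless of the order in which either run listed its annotations.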

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
ch Corp http://www.perfectsearchcorp.com/ On 10/07/2014 07:30 AM, Finan, Sean wrote: > Steve Bethard wrote: >> I spent some time writing a script for diff-ing CASes > I urge anyone interested in comparing cTakes CASes / output to use this type > of approach. Comparison of program

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
ing with the same configuration outputs different data under different moons. Having consistent results helps us know if we've made improvements to our quality or not. c Kim Ebert 1.801.669.7342 Perfect Search Corp http://www.perfectsearchcorp.com/ On 10/07/2014 08:50 AM, Finan, Sean wr

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
>> outputs different data under different moons. Having consistent >> results helps us know if we've made improvements to our quality or >> not. Having output that is in a predictable order makes checking to >> see if there are differences much cheaper when you a

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
> really interested in cTakes behaving well, so we are usually pretty > careful in testing our changes before committing anything. > > Thanks, > > Kim Ebert > 1.801.669.7342 > Perfect Search Corp > http://www.perfectsearchcorp.com/ > > On 10/07/2014 10:46 AM, Finan, Sea

RE: cTakes output predictability

2014-10-07 Thread Finan, Sean
design decision, but it would be nice if we are consistently right, or consistently wrong. Many other instances of this result in similar issues. Kim Ebert 1.801.669.7342 Perfect Search Corp http://www.perfectsearchcorp.com/ On 10/07/2014 12:43 PM, Finan, Sean wrote: > I'm just about sapped on

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-08 Thread Finan, Sean
Hi Bruce, I would venture to say that this is neither expected nor desired. Before you fix it (or in addition to a fix), try to run with the new dictionary lookup. It will have a different behavior, and it will be the default dictionary lookup in future releases of cTakes – making fixes to the

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-08 Thread Finan, Sean
the necessary dictionary(ies) or how do I build them? [image: IMAT Solutions] <http://imatsolutions.com> Bruce Tietjen Senior Software Engineer [image: Mobile:] 801.634.1547 bruce.tiet...@imatsolutions.com On Wed, Oct 8, 2014 at 9:46 AM, Finan, Sean < sean.fi...@childrens.harvard.ed

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-08 Thread Finan, Sean
image: IMAT Solutions] <http://imatsolutions.com> Bruce Tietjen Senior Software Engineer [image: Mobile:] 801.634.1547 bruce.tiet...@imatsolutions.com On Wed, Oct 8, 2014 at 9:46 AM, Finan, Sean < sean.fi...@childrens.harvard.edu> wrote: > Hi Bruce, > > I would venture t

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-09 Thread Finan, Sean
<mailto:bruce.tiet...@imatsolutions.com> On Wed, Oct 8, 2014 at 10:02 AM, Finan, Sean mailto:sean.fi...@childrens.harvard.edu>> wrote: Hi Bruce, With Pei's help I just updated the sourceforge repo with the cTakes dictionaries. Checkout artifact ctakes-resources-snomed-rword-hsqldb-2

RE: Differences in MedicationMention annotations on subsequent processing runs

2014-10-09 Thread Finan, Sean
01.634.1547 bruce.tiet...@imatsolutions.com<mailto:bruce.tiet...@imatsolutions.com> On Wed, Oct 8, 2014 at 10:02 AM, Finan, Sean mailto:sean.fi...@childrens.harvard.edu>> wrote: Hi Bruce, With Pei's help I just updated the sourceforge repo with the cTakes dictionaries. Checko

RE: Need information regarding cTakes changes

2014-10-20 Thread Finan, Sean
Hi Chandu, For your note #2: > 2)Any new features that can be added to current version of cTakes > project to make it more useful. You can always check (or add to) the Jira "future enhancement" page at: https://issues.apache.org/jira/browse/CTAKES/fixforversion/12323040/?selectedTab=com.atlassian.

RE: ctakes-dictionary-lookup-fast

2014-11-07 Thread Finan, Sean
By Pei: > As much as I hate maintaining more desc xml's, but I think it's prudent to > create a separate one for a patch release temporarily for > ctakes-dictionary-lookup-fast so users do not get blindsided by the change in > output. By Sean: Excellent idea -Original Message- From: M

RE: Announcement: UMLS MedGen-MySQL dataset now available as open access download

2014-11-14 Thread Finan, Sean
Hi Andy, Great stuff! I think that I understand the method, but I have a question about the statement: >the content is publicly available per the NCBI policy and license for MedGen >sources Does this mean that I, Joe Anybody, could download the content, place some of the content in a databas

RE: Asking help for always unsuccessful AE load

2014-12-04 Thread Finan, Sean
Hi Jun, Do AE pipelines that do not use the Smoking Status module work? I think that Smoking Status configuration (via binary install) might be broken in the last several versions. I thought that I had submitted a Jira long, long ago, but right now I can't find it so maybe my memory is playing

RE: Scaling cTakes

2014-12-05 Thread Finan, Sean
Hi Brandon, It sounds like you've got a decent pipeline set up. To increase the speed you could try swapping out use of ctakes-dictionary-lookup with ctakes-dictionary-lookup-fast in the AE. Check ctakes-clinical-pipeline/desc/[ae]/AggregatePlaintextFastUMLSProcessor.xml for an example. As

RE: Scaling cTakes

2014-12-09 Thread Finan, Sean
Any other suggestions on performance tuning would be great! Thanks, Brandon -Original Message- From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu] Sent: Friday, December 05, 2014 1:14 PM To: dev@ctakes.apache.org Subject: RE: Scaling cTakes Hi Brandon, It sounds like you'

RE: Links Not Working

2014-12-12 Thread Finan, Sean
Hi Kasie, cTakes is a community effort, so you've contacted the right people. Assuming that the "Bug Tracker" link in the navigation bar on the left works, please submit a report and list all of the orphan links. A kindly volunteer will fix them as soon as possible. Thanks, Sean -Origin

RE: intro video and ctakes youtube : Youtube Apache cTakes Channel Direct Link

2014-12-15 Thread Finan, Sean
Hmmm, I can't find it in a search. However, here is a direct link: https://www.youtube.com/channel/UC8hQoOKz3v4PNEf6cqSkjbQ Maybe it needs a few videos to register in the search engine ? Sean -Original Message- From: Pei Chen [mailto:chen...@apache.org] Sent: Monday, December 15, 2014

RE: revamping the Apache cTAKES website

2014-12-15 Thread Finan, Sean
Wow, I've just spent the last 2 hours doing the exact same thing. That is what I get for missing a meeting. Mine is extremely similar, though slightly different language (and without the "improved performance" bar chart - which may not belong). I also put the "Examples" in a big green button

RE: revamping the Apache cTAKES website

2014-12-15 Thread Finan, Sean
Anyway, a pretty amazing fresh start, thanks Pei -Original Message- From: Chen, Pei [mailto:pei.c...@childrens.harvard.edu] Sent: Monday, December 15, 2014 4:33 PM To: dev@ctakes.apache.org Subject: RE: revamping the Apache cTAKES website Check out a mockup of a new website proposal: htt

RE: Problem running cTakes-clinical pipeline --> AggregatePlaintextFastUMLSProcessor.xml

2014-12-15 Thread Finan, Sean
Hi Yu, > Also do you know if there is any command line I can run to annotate, say, a thousand files automatically rather than copy and paste. You could try the CPE GUI: bin/runctakesCPE.sh Sean From: Liang, Yu [mailto:yu.li...@nyumc.org] Sent: Monday, December 15, 2014 4:51 PM To: dev@ctakes.a

RE: UMLS Integration

2014-12-15 Thread Finan, Sean
Hi Praveen, I think that this question might be better aimed at the nlm umls community. The standard cTakes installation does not follow this workflow. Sean -Original Message- From: Jay_Ram [mailto:pandupraveen...@gmail.com] Sent: Tuesday, December 16, 2014 12:10 AM To: dev@ctakes.apa

RE: intro video and ctakes youtube : Youtube Apache cTakes Channel Direct Link

2014-12-16 Thread Finan, Sean
er. > > JG > > On Mon, Dec 15, 2014 at 11:43 AM, Finan, Sean < > sean.fi...@childrens.harvard.edu> wrote: >> >> Hmmm, I can't find it in a search. However, here is a direct link: >> >> https://www.youtube.com/channel/UC8hQoOKz3v4PNEf6cqSkjbQ

RE: intro video and ctakes youtube : Youtube Apache cTakes Channel Direct Link

2014-12-17 Thread Finan, Sean
video and ctakes youtube : Youtube Apache cTakes Channel Direct Link Isn't this to upload for my account? What about to the channel? On Tue, Dec 16, 2014 at 12:16 PM, Finan, Sean < sean.fi...@childrens.harvard.edu> wrote: > > Hi John, > > Look for an "Upload" button i

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
Well, I guess that it is time for me to speak up … I must say that I’m happy that people are showing interest in the fast lookup. I am also happy (sort of) that some concerns are being raised – and that there is now community participation in my little toy. I have some concerns about what pe

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
One quick mention: The cTakes dictionaries are built with UMLS 2011AB. If the Human annotations were not done using the same UMLS version then there WILL be differences in CUI and Semantic group. I don't have time to go into it with details, examples, etc. just be aware that every 6 months cu

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
com> Kim Ebert Software Engineer [Office:]801.669.7342 kim.eb...@imatsolutions.com<mailto:greg.hub...@imatsolutions.com> On 12/19/2014 11:31 AM, Finan, Sean wrote: One quick mention: The cTakes dictionaries are built with UMLS 2011AB. If the Human annotations were not done using the same

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
eer [Office:]801.669.7342 kim.eb...@imatsolutions.com<mailto:greg.hub...@imatsolutions.com> On 12/19/2014 11:31 AM, Finan, Sean wrote: One quick mention: The cTakes dictionaries are built with UMLS 2011AB. If the Human annotations were not done using the same UMLS version then there WILL

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
Hi Bruce, I'm not sure how there would be fewer matches with the overlap processor. There should be all of the matches from the non-overlap processor plus those from the overlap. Decreasing from 215 to 211 is strange. Have you done any manual spot checks on this? It is really bizarre that y

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
Hi Bruce, > Correction -- So far, I did steps 1 and 2 of Sean's email. No problem. Aside from recreating the database, those two steps have the greatest impact. But before you change anything else, please do some manual spot checks. I have never seen a case where the lookup would be so horrib

RE: cTakes Annotation Comparison

2014-12-19 Thread Finan, Sean
uce Tietjen Senior Software Engineer [Image removed by sender. Mobile:]801.634.1547 bruce.tiet...@imatsolutions.com<mailto:bruce.tiet...@imatsolutions.com> On Fri, Dec 19, 2014 at 1:27 PM, Finan, Sean mailto:sean.fi...@childrens.harvard.edu>> wrote: Hi Bruce, I'm not sure how there

RE: cTakes Annotation Comparison --- (^:

2014-12-19 Thread Finan, Sean
Bruce Tietjen > Senior Software Engineer > [image: Mobile:] 801.634.1547 > bruce.tiet...@imatsolutions.com > > On Fri, Dec 19, 2014 at 1:39 PM, Finan, Sean < > sean.fi...@childrens.harvard.edu> wrote: >> >> Sorry, I meant “Do some spot checks on the validity”.

RE: Using cTakes programmatically

2014-12-29 Thread Finan, Sean
Hi Maite Meseure, Check the cTakes User guide on UMLS setup: https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+User+Install+Guide#cTAKES3.2UserInstallGuide-(Recommended)AddUMLSaccessrights which (in part) points you towards obtaining a license to use the NIH UMLS dictionary: https:

RE: Question about CPE/ descriptor and xml file.

2015-01-05 Thread Finan, Sean
Go through the error that you got, and look for a message like: Failed to initilize. Invalid UMLS License and Error: Invalid UMLS License. A UMLS License is required to use the UMLS dictionary lookup. Error: You may request one at: https://uts.nlm.nih.gov/license.html Please verify your UML

RE: Negex

2015-01-05 Thread Finan, Sean
I don't know. I'm comparing what I think is the 2009 negex trigger set https://code.google.com/p/negex/source/browse/trunk/GeneralNegEx.Java.v.1.2.05092009/negex_triggers.txt with the cTakes trigger set in org.apache.ctakes.core.fsm.machine.NegationFSM.java and it looks like the cTakes set is

RE: Negex

2015-01-05 Thread Finan, Sean
c\main\resources\org\apache\ctakes\ytex\negex\negex_triggers.txt) Adding triggers requires modifying a text file - much simpler than changing code and compiling. -vj On Mon, Jan 5, 2015 at 8:30 PM, Finan, Sean < sean.fi...@childrens.harvard.edu> wrote: > I don't know. I'm comparing

RE: dictionary lookup config for best F1 measure [was RE: cTakes Annotation Comparison

2015-01-09 Thread Finan, Sean
the shared clef 2013 task? https://sites.google.com/site/shareclefehealth/ I'm looking for something that doesn't have to be the best speed-wise, but that is the recommended for optimizing F1 measure. Regards, James -Original Message- From: Finan, Sean [mailto:sean.f

RE: dictionary lookup config for best F1 measure [was RE: cTakes Annotation Comparison : Span Overlap addendum

2015-01-09 Thread Finan, Sean
best F1 measure for something like the shared clef 2013 task? https://sites.google.com/site/shareclefehealth/ I'm looking for something that doesn't have to be the best speed-wise, but that is the recommended for optimizing F1 measure. Regards, James -Original Message- From: Fina

RE: Question about fast pipeline

2015-01-12 Thread Finan, Sean
Hi Michelle, Did your error have only "Could not find . as absolute" or did it also have "or in ... or in ..."? If you see " ... or in ... " then this is a new issue. If you don't, then you should update your source. If you need to run the release binary then let me know and I can work o

RE: Question about the pipeline

2015-02-02 Thread Finan, Sean
Hi Tol (and Maite), I'm not entirely certain that I understand the question, but here is an attempt to help. If I'm oversimplifying then I apologize. I think that ExampleAggregatePipeline is intended to represent a very simple single-note pipeline and that custom code could be produced by usin

RE: Question about the pipeline

2015-02-03 Thread Finan, Sean
job than the CPE-GUI -at least in Eclipse, I haven't managed to run it via the command line yet. On Mon, Feb 2, 2015 at 7:12 PM, Finan, Sean < sean.fi...@childrens.harvard.edu> wrote: > Hi Tol (and Maite), > > I'm not entirely certain that I understand the questio

RE: git mirrors out of sync?

2015-02-03 Thread Finan, Sean
Hi Steve, You are right (confirming your finding) - it looks like the first is a no-show and the second is somebody's personal upload to github (not git.apache.org) from 3 years ago. The jira claims that the item was closed (fixed), but if you go to https://urldefense.proofpoint.com/v2/url?u=

RE: Question about the pipeline

2015-02-03 Thread Finan, Sean
teJCas(); jCas.setDocumentText("some text"); AnalysisEngine tokenizer = createEngine(MyTokenizer.class); AnalysisEngine tagger = createEngine(MyTagger.class); runPipeline(jCas, tokenizer, tagger); for(Token token : iterate(jCas, Token.class)){ System.out.println(token.getTag()); } Tol O.

RE: Question about the pipeline

2015-02-05 Thread Finan, Sean
ctory. So in our HPC, it spawns a > > new job for each subfolder (which may have between 5 and 2500 notes). > > > > Todd Lingren > > Biomedical Informatics > > Cincinnati Children’s Hospital > > todd.ling...@cchmc.org > > 513-803-9032 > > > > > > ---

RE: Question about the pipeline

2015-02-05 Thread Finan, Sean
@ctakes.apache.org Subject: Re: Question about the pipeline Yes, it does but only in Eclipse, not in command line even though I am in the good directory. I have to look at the classpath more in details probably. Thanks for your replies. On Thu, Feb 5, 2015 at 8:08 AM, Finan, Sean < sean

RE: IntelliJ experience or instructions

2015-02-12 Thread Finan, Sean
Hi Taposh, Try the process outlined below. I have screenshots for each step if you want them. If this works (you are the first tester) then we can put it in the web documentation. Sean Fresh checkout from SVN === 1. Start IntelliJ IDEA. 2. In the "Quick Start" menu, selec

RE: BagOfCuisGenerator.java, same idea for getConceptText()

2015-02-12 Thread Finan, Sean
Try something like the following for output: private int extractFeatures( final IdentifiedAnnotation annotation ) { // Extract the IdentifiedAnnotation itself final Collection umlsInfos = getUmlsInfos( annotation, _printSnomed ); if ( umlsInfos == null ) { return 0

RE: BagOfCuisGenerator.java, same idea for getConceptText()

2015-02-12 Thread Finan, Sean
Oh yeah - use the -fast dictionary to get preferred text. The fastest way to get cuis only is with CuisOnlyPlaintextUMLSProcessor. If you want polarity make sure you uncomment the section with PolarityCleartkAnalysisEngine. Sean -Original Message- From: Maite Meseure Hugues [mailto:me

RE: BagOfCuisGenerator.java, same idea for getConceptText()

2015-02-17 Thread Finan, Sean
PM, Maite Meseure Hugues < meseure.ma...@gmail.com> wrote: > Thank you for your replies, It's helpful. I was working on 3.2.0 > version, so it looks like 3.2.1 allows to get the UMLS preferred text. > > Maite > > On Thu, Feb 12, 2015 at 2:25 PM, Finan, Sean < > sean.fi..

RE: CTAKES mirroring on github.

2015-02-17 Thread Finan, Sean
Our request is for a read-only mirror. However, if it ever becomes i/o, I don't know if this will have what you want, but http://git.apache.org/ Links to documentation (mostly server setup) http://www.apache.org/dev/git.html and a wiki (check toward middle and bottom for committer info) https:/

RE: Hello cTAKES Mailing List

2015-02-22 Thread Finan, Sean
Hi Raymond, If you use the dictionary-fast module there exists an entry "feeling bad" with cui 557911 and cui 231218. There is also "feel bad" and "feeling bad emotionally" You will find "horrible present pain" but no other entry with "horrible". You will not find any terms with "awful" and

RE: Hello cTAKES Mailing List

2015-02-23 Thread Finan, Sean
http://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/CHV/ On Sun, Feb 22, 2015 a

URGENT! RE: New Website

2015-02-25 Thread Finan, Sean
Hi all, It looks like a few people (myself included) are interested in having information on people, projects, papers, and applications that use cTAKES on the web page. I have created a form on google that might help us collect this and other information. Please visit https://docs.google.com

RE: Running cTakes in parallel

2015-02-25 Thread Finan, Sean
Hi Michelle, When it comes to > multiple instances of cTakes in parallel You can certainly start as many pipelines as you want as separate JVM processes, just make sure that you divide your notes among separate batches, one batch per process. Also keep in mind that you don't want to clobber yo
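
The advice to divide notes among separate batches, one batch per process, can be sketched as below. This is only an illustration under stated assumptions: the class name, the method name, and the round-robin strategy are hypothetical, and launching the separate JVM processes is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of cTAKES.
final class BatchSplitSketch {

    // Distributes note names round-robin over 'batchCount' batches so that
    // each pipeline process can be given its own, non-overlapping batch.
    static List<List<String>> split(final List<String> notes, final int batchCount) {
        if (batchCount < 1) {
            throw new IllegalArgumentException("batchCount must be at least 1");
        }
        final List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < batchCount; i++) {
            batches.add(new ArrayList<String>());
        }
        for (int i = 0; i < notes.size(); i++) {
            batches.get(i % batchCount).add(notes.get(i));
        }
        return batches;
    }
}
```

Round-robin keeps batch sizes within one note of each other; if note lengths vary widely, balancing by file size instead may be a better choice.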

RE: Is it necessary to put UMLS login into files when passing them with -D to the JVM?

2015-03-06 Thread Finan, Sean
Hi Tom, > I am passing my UMLS login and password on startup as arguments ... > "-Dctakes.umlsuser=myusername -Dctakes.umlspw=mypassword" That is fine. If I understand correctly you are already running this way without problem. The comments in the .xml files should probably be extended to inc
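
For readers unfamiliar with `-D` arguments: the JVM exposes each `-Dname=value` argument as a system property. The sketch below shows only that general mechanism, using the two property names quoted in the message; it makes no claim about how cTAKES itself reads or validates them, and the class and method names are hypothetical.

```java
// Hypothetical illustration only; shows the generic JVM mechanism, not cTAKES internals.
final class UmlsPropertySketch {

    // Property names as quoted in the message above.
    static final String USER_KEY = "ctakes.umlsuser";
    static final String PASS_KEY = "ctakes.umlspw";

    // True when both properties were supplied, e.g. via
    // -Dctakes.umlsuser=myusername -Dctakes.umlspw=mypassword
    static boolean hasCredentials() {
        final String user = System.getProperty(USER_KEY);
        final String pass = System.getProperty(PASS_KEY);
        return user != null && !user.isEmpty() && pass != null && !pass.isEmpty();
    }
}
```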

RE: Questions about dictionary-lookup and dictionary-lookup-fast

2015-03-10 Thread Finan, Sean
Hi Maite, > Does anyone know why is it [UmlsDictionaryLookupAnnotator ]so slow? The top 5 reasons (1-3 are 90% of the problem): 1. The dictionary database is bloated with unwanted entries 2. The dictionary database indexing is sub-optimal 3. The second drug lookup with orangebook filtering take
