Re: YTEX cTAKES 3.1.1 ready
Completely non-contributory, but it is odd/humorous to see the headaches that the notes we write quickly in the 5 minutes post-encounter cause for free-text analysis.

JG

On Thu, Feb 6, 2014 at 1:27 PM, Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:

> Right, got it. I just wanted to let you know that some EMR notes -do- require sentence splitting at newline characters.
>
> -----Original Message-----
> From: vijay garla [mailto:vnga...@gmail.com]
> Sent: Thursday, February 06, 2014 1:06 PM
> To: dev@ctakes.apache.org
> Cc: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; vlad.valtchi...@gmail.com
> Subject: Re: YTEX cTAKES 3.1.1 ready
>
> The cTAKES sentence detector is not changed in the YTEX branch. The YTEX branch has an *additional* sentence detector that does not automatically split sentences on newlines - users can use this if they like.
>
> -vj
>
> On Thu, Feb 6, 2014 at 1:01 PM, Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:
>
> > Hi Vijay,
> >
> > > I have yet to run across clinical text from a real EMR where newlines represent the end of a sentence
> >
> > Since James pointed out this possibility a couple weeks ago, I have kept my eyes open. The problem is pretty ubiquitous in a corpus that I'm working with right now. I just opened the first note and gave it a count ... 95 lines total, 9 are sentence/phrase (lacking punctuation) endings.
> > This is not including lists, which comprise about half of the note.
> > One possible conjoinment was "Will consider [...] biopsy\nGiven [...]".
> > Depending upon how cTakes deals with it, the meaning could change drastically.
> >
> > > I believe cTAKES absolutely has to support sentences with newlines within them
> >
> > Yes, cTakes should do so, but I hope that you aren't suggesting that it only support such a structure.
> >
> > Where is that easy button?
RE: YTEX cTAKES 3.1.1 ready
Right, got it. I just wanted to let you know that some EMR notes -do- require sentence splitting at newline characters.

-----Original Message-----
From: vijay garla [mailto:vnga...@gmail.com]
Sent: Thursday, February 06, 2014 1:06 PM
To: dev@ctakes.apache.org
Cc: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; vlad.valtchi...@gmail.com
Subject: Re: YTEX cTAKES 3.1.1 ready

The cTAKES sentence detector is not changed in the YTEX branch. The YTEX branch has an *additional* sentence detector that does not automatically split sentences on newlines - users can use this if they like.

-vj

On Thu, Feb 6, 2014 at 1:01 PM, Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:

> Hi Vijay,
>
> > I have yet to run across clinical text from a real EMR where newlines represent the end of a sentence
>
> Since James pointed out this possibility a couple weeks ago, I have kept my eyes open. The problem is pretty ubiquitous in a corpus that I'm working with right now. I just opened the first note and gave it a count ... 95 lines total, 9 are sentence/phrase (lacking punctuation) endings.
> This is not including lists, which comprise about half of the note.
> One possible conjoinment was "Will consider [...] biopsy\nGiven [...]".
> Depending upon how cTakes deals with it, the meaning could change drastically.
>
> > I believe cTAKES absolutely has to support sentences with newlines within them
>
> Yes, cTakes should do so, but I hope that you aren't suggesting that it only support such a structure.
>
> Where is that easy button?
>
> -----Original Message-----
> From: vijay garla [mailto:vnga...@gmail.com]
> Sent: Thursday, February 06, 2014 10:31 AM
> To: dev@ctakes.apache.org
> Cc: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; vlad.valtchi...@gmail.com
> Subject: Re: YTEX cTAKES 3.1.1 ready
>
> I believe it is worth migrating to trunk.
Re: YTEX cTAKES 3.1.1 ready
The cTAKES sentence detector is not changed in the YTEX branch. The YTEX branch has an *additional* sentence detector that does not automatically split sentences on newlines - users can use this if they like.

-vj

On Thu, Feb 6, 2014 at 1:01 PM, Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:

> Hi Vijay,
>
> > I have yet to run across clinical text from a real EMR where newlines represent the end of a sentence
>
> Since James pointed out this possibility a couple weeks ago, I have kept my eyes open. The problem is pretty ubiquitous in a corpus that I'm working with right now. I just opened the first note and gave it a count ... 95 lines total, 9 are sentence/phrase (lacking punctuation) endings.
> This is not including lists, which comprise about half of the note.
> One possible conjoinment was "Will consider [...] biopsy\nGiven [...]".
> Depending upon how cTakes deals with it, the meaning could change drastically.
>
> > I believe cTAKES absolutely has to support sentences with newlines within them
>
> Yes, cTakes should do so, but I hope that you aren't suggesting that it only support such a structure.
>
> Where is that easy button?
>
> -----Original Message-----
> From: vijay garla [mailto:vnga...@gmail.com]
> Sent: Thursday, February 06, 2014 10:31 AM
> To: dev@ctakes.apache.org
> Cc: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; vlad.valtchi...@gmail.com
> Subject: Re: YTEX cTAKES 3.1.1 ready
>
> I believe it is worth migrating to trunk.
>
> Note that the sentence detector is also complementary - the existing ctakes sentence detector is unchanged - users can choose which sentence detector to use. There are changes to assertion & dependency parsing to support sentences without newlines, and that works with both sentence detectors.
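
To make the two behaviours discussed above concrete, here is a minimal Python sketch. It is not the cTAKES or YTEX implementation (those are Java annotators and are not reproduced here); the function names `split_on_newlines` and `split_on_punctuation` and the sample note are invented for illustration only.

```python
import re

# Hypothetical note text, invented for this example (not from any real EMR).
NOTE = (
    "Will consider biopsy\n"
    "Given family history, refer to oncology. Follow up in\n"
    "two weeks."
)


def split_on_punctuation(text):
    """Treat newlines as ordinary whitespace; split only after . ! or ?"""
    flat = re.sub(r"\s+", " ", text).strip()
    parts = re.split(r"(?<=[.!?])\s+", flat)
    return [p for p in parts if p]


def split_on_newlines(text):
    """Treat every newline as a hard boundary, then also split on punctuation."""
    sentences = []
    for line in text.split("\n"):
        sentences.extend(split_on_punctuation(line))
    return sentences


if __name__ == "__main__":
    for name, fn in (("newline-splitting", split_on_newlines),
                     ("punctuation-only", split_on_punctuation)):
        print(name)
        for sentence in fn(NOTE):
            print("  ", repr(sentence))
```

On this sample, the newline-splitting variant cuts "Follow up in" away from "two weeks.", while the punctuation-only variant fuses "Will consider biopsy" with the "Given ..." sentence. Neither rule alone handles both kinds of line ending, which is why having both detectors available is useful.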
RE: YTEX cTAKES 3.1.1 ready
Hi Vijay,

> I have yet to run across clinical text from a real EMR where newlines represent the end of a sentence

Since James pointed out this possibility a couple weeks ago, I have kept my eyes open. The problem is pretty ubiquitous in a corpus that I'm working with right now. I just opened the first note and gave it a count ... 95 lines total, 9 are sentence/phrase (lacking punctuation) endings. This is not including lists, which comprise about half of the note. One possible conjoinment was "Will consider [...] biopsy\nGiven [...]". Depending upon how cTakes deals with it, the meaning could change drastically.

> I believe cTAKES absolutely has to support sentences with newlines within them

Yes, cTakes should do so, but I hope that you aren't suggesting that it only support such a structure.

Where is that easy button?

-----Original Message-----
From: vijay garla [mailto:vnga...@gmail.com]
Sent: Thursday, February 06, 2014 10:31 AM
To: dev@ctakes.apache.org
Cc: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; vlad.valtchi...@gmail.com
Subject: Re: YTEX cTAKES 3.1.1 ready

I believe it is worth migrating to trunk.

Note that the sentence detector is also complementary - the existing ctakes sentence detector is unchanged - users can choose which sentence detector to use. There are changes to assertion & dependency parsing to support sentences without newlines, and that works with both sentence detectors.

I believe cTAKES absolutely has to support sentences with newlines within them - I have yet to run across clinical text from a real EMR where newlines represent the end of a sentence - the changes to assertion & dependency parsing will have to be done at some point.

-vj

On Thu, Feb 6, 2014 at 10:19 AM, Chen, Pei wrote:

> VJ,
> Aside from the changes to the existing cTAKES code (sentence detector, etc.) [which we could leave out if it's still being debated], Do you think it's worth migrating the ytex code to trunk at this point?
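
A count like the one described above (95 lines, 9 unpunctuated endings, lists excluded) can be approximated with a few lines of Python. This is only a rough sketch under stated assumptions: the list-item pattern, the set of end punctuation, the function name `count_unpunctuated_line_endings`, and the sample text are all invented for illustration. It also over-counts, because a line that merely wraps mid-sentence lacks end punctuation too, so a human still has to separate wrapped lines from real sentence or phrase endings.

```python
import re

# Crude heuristic for list items: "-" or "*" bullets, or "1." / "1)" numbering.
LIST_ITEM = re.compile(r"^\s*(?:[-*]|\d+[.)])\s+")
# Characters treated as "the line ends with punctuation" (an assumption).
END_PUNCT = (".", "!", "?", ":", ";")

# Hypothetical note text, invented for this example.
SAMPLE = (
    "Assessment:\n"
    "- cough\n"
    "- fever\n"
    "Will consider biopsy\n"
    "Given family history, refer to oncology.\n"
    "\n"
    "Follow up in two weeks"
)


def count_unpunctuated_line_endings(note):
    """Return (non-blank lines, non-list lines that end without punctuation)."""
    total = 0
    unpunctuated = 0
    for line in note.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines entirely
        total += 1
        if LIST_ITEM.match(line):
            continue  # list items are counted as lines but not as candidates
        if not stripped.endswith(END_PUNCT):
            unpunctuated += 1
    return total, unpunctuated


if __name__ == "__main__":
    print(count_unpunctuated_line_endings(SAMPLE))
```

Running this over a corpus gives candidate lines to review by hand before deciding which sentence detector suits a given note type.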
Re: YTEX cTAKES 3.1.1 ready
I believe it is worth migrating to trunk.

Note that the sentence detector is also complementary - the existing ctakes sentence detector is unchanged - users can choose which sentence detector to use. There are changes to assertion & dependency parsing to support sentences without newlines, and that works with both sentence detectors.

I believe cTAKES absolutely has to support sentences with newlines within them - I have yet to run across clinical text from a real EMR where newlines represent the end of a sentence - the changes to assertion & dependency parsing will have to be done at some point.

-vj

On Thu, Feb 6, 2014 at 10:19 AM, Chen, Pei wrote:

> VJ,
> Aside from the changes to the existing cTAKES code (sentence detector, etc.) [which we could leave out if it's still being debated],
> Do you think it's worth migrating the ytex code to trunk at this point?
> As you mentioned earlier, it's largely complementary.
> [I was just thinking of saving effort to maintain the separate branch and for simplicity for dev...]
>
> --Pei
>
> > -----Original Message-----
> > From: vijay garla [mailto:vnga...@gmail.com]
> > Sent: Wednesday, February 05, 2014 9:30 PM
> > To: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; vlad.valtchi...@gmail.com
> > Subject: Re: YTEX cTAKES 3.1.1 ready
> >
> > Hi Vlad,
> >
> > I Updated the umls install guide; see
> > https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1
> >
> > I would prefer to add the docs in the ctakes confluence, but as far as I can tell, I don't have write access there - can somebody give me write privileges on the ctakes confluence site?
> >
> > There was a bug in the umls install; copy
> > https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml
> > over the corresponding file in your ctakes-3.1.2 install (CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set.
RE: YTEX cTAKES 3.1.1 ready
VJ,
Aside from the changes to the existing cTAKES code (sentence detector, etc.) [which we could leave out if it's still being debated], do you think it's worth migrating the ytex code to trunk at this point?
As you mentioned earlier, it's largely complementary.
[I was just thinking of saving the effort of maintaining the separate branch, and of simplicity for dev...]

--Pei

> -----Original Message-----
> From: vijay garla [mailto:vnga...@gmail.com]
> Sent: Wednesday, February 05, 2014 9:30 PM
> To: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; vlad.valtchi...@gmail.com
> Subject: Re: YTEX cTAKES 3.1.1 ready
>
> Hi Vlad,
>
> I Updated the umls install guide; see
> https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1
>
> I would prefer to add the docs in the ctakes confluence, but as far as I can tell, I don't have write access there - can somebody give me write privileges on the ctakes confluence site?
>
> There was a bug in the umls install; copy
> https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml
> over the corresponding file in your ctakes-3.1.2 install (CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set. The import is currently running on the UMLS 2013AA (I assume this will complete without issues as long as the umls schema hasn't changed from 2012).
>
> what trial and error did you have to go through to build the distro?
>
> -vj
>
> On Wed, Feb 5, 2014 at 5:33 PM, vijay garla wrote:
>
> > Hi Vlad,
> >
> > sorry that the instructions aren't clear.
> >
> > re 1) What I am trying to say is install apache-ctakes-3.2.0-snapshot as usual (this is unchanged from 3.1.1). After that you still have to apply the lib and resources (these are things that cannot be distributed via apache).
> >
> > re 2) Yes, I need to update those docs. Hopefully will get to that at some point. However, I assume you already have a UMLS DB (also assume SQL Server).
RE: YTEX cTAKES 3.1.1 ready
Hi Vijay,

I gave you write access to the cTAKES space on the confluence site

-- James

-----Original Message-----
From: vijay garla [mailto:vnga...@gmail.com]
Sent: Wednesday, February 05, 2014 8:29 PM
To: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; vlad.valtchi...@gmail.com
Subject: Re: YTEX cTAKES 3.1.1 ready

Hi Vlad,

I Updated the umls install guide; see
https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1

I would prefer to add the docs in the ctakes confluence, but as far as I can tell, I don't have write access there - can somebody give me write privileges on the ctakes confluence site?

There was a bug in the umls install; copy
https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml
over the corresponding file in your ctakes-3.1.2 install (CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set. The import is currently running on the UMLS 2013AA (I assume this will complete without issues as long as the umls schema hasn't changed from 2012).

what trial and error did you have to go through to build the distro?

-vj
Re: YTEX cTAKES 3.1.1 ready
Hi Vlad,

I updated the UMLS install guide; see
https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1

I would prefer to add the docs in the cTAKES Confluence, but as far as I can tell, I don't have write access there - can somebody give me write privileges on the cTAKES Confluence site?

There was a bug in the UMLS install; copy
https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml
over the corresponding file in your ctakes-3.1.2 install (CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set. The import is currently running on the UMLS 2013AA (I assume this will complete without issues as long as the UMLS schema hasn't changed from 2012).

What trial and error did you have to go through to build the distro?

-vj

On Wed, Feb 5, 2014 at 5:33 PM, vijay garla wrote:

> Hi Vlad,
>
> sorry that the instructions aren't clear.
>
> re 1) What I am trying to say is install apache-ctakes-3.2.0-snapshot as usual (this is unchanged from 3.1.1). After that you still have to apply the lib and resources (these are things that cannot be distributed via apache).
>
> re 2) Yes, I need to update those docs. Hopefully will get to that at some point. However, I assume you already have a UMLS DB (also assume SQL Server). If you can't/don't want to use your existing umls DB, please tell me. Then I'll prioritize upgrading the doc on importing the umls tables (the scripts are there).
>
> best,
>
> VJ
>
> On Wed, Feb 5, 2014 at 4:44 PM, wrote:
>
>> Hi VJ-
>>
>> so, with trial and error we were able to make the distribution and now have the apache-ctakes-3.1.2-SNAPSHOT-bin.zip archive.
>>
>> Here's what's unclear.
>>
>> 1. Is now this the only (combined) thing that you need for ctakes 3.1.1 + Ytex?
>> the current documentation (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
>> which most probably is outdated, talks about installing cTakes 3.1.1 first and then applying 2 SNAPSHOT archives (downloadable), lib and resources.
>> This is a confusion point.
>>
>> 2. The directions to import UMLS subset are then outdated as well. Maybe one should use the old version (ctakes 2.5 and ytex 0.8) to import the RRF files for the UMLS subset and then just use the resulting db. Thoughts?
>>
>> Thanks,
>> Vlad Valtchinov
>> Brigham Rad
>>
>> On Thursday, January 30, 2014 5:17:43 PM UTC-5, vijay garla wrote:
>>
>>> Hi Vlad,
>>>
>>> All of ytex has been moved into ctakes, it is currently in a branch (https://svn.apache.org/repos/asf/ctakes/branches/ytex). You don't have to install ytex-0.8 - instead you will have to build and install from the ytex branch to create your own distribution. Steps 2 & 3 are correct.
>>>
>>> Although it is a pain, if you have the jdk, maven, and svn, you can easily build your own distro:
>>> * open a command prompt
>>> * make sure jdk, maven, and svn are in your path
>>> * cd to some directory where you want to check stuff out (I like c:\temp)
>>> * run the following commands
>>> rmdir /s /q ctakes
>>> svn co https://svn.apache.org/repos/asf/ctakes/branches/ytex ctakes
>>> cd ctakes
>>> mvn clean install -DskipTests
>>>
>>> And you will have the ctakes (with ytex) distro in
>>> ctakes\ctakes-distribution\target\apache-ctakes-3.1.2-SNAPSHOT-bin.zip
>>>
>>> What is the process for getting the ytex branch merged into trunk? As I mentioned, there are very few changes to other ctakes classes/types - this should be completely complementary and not affect any existing ctakes functionality.
>>>
>>> -vj
>>>
>>> On Thu, Jan 30, 2014 at 4:56 PM, wrote:
>>>
>>>> Hi VJ--
>>>>
>>>> this is great!! Thanks for all the hard work on it!
>>>>
>>>> We're starting to look into the new install. For now we're trying the binaries out.
>>>>
>>>> There were these questions about the proper install steps:
>>>>
>>>> 1. Do we first install ytex-0.8
>>>> 2. Then install the new cTakes 3.1.1 instance and also apply the SNAPSHOT lib and resources zips
>>>> 3. Work our way to install the UMLS ontologies in the db
>>>>
>>>> It is not entirely clear from the new document (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1) if there's still need to install ytex-0.8, or YTEX has been entirely merged into cTakes? If the last statement is correct, there are missing parts in i.e. the UMLS install steps that are linked from the new ctakes 3.1.1 document.
>>>>
>>>> Thanks,
>>>> vlad
>>>>
>>>> On Friday, January 3, 2014 10:21:52 PM UTC-5, vijay garla wrote:
>>>>>
>>>>> Hello All,
>>>>>
>>>>> I have finished an initial cut at the port of YTEX to cTAKES 3.1.1.
>>>>> Most of the YTEX functionality has been ported and integrated with cTAKES, and I've tested with MySQL and MS SQL Server (oracle tests pending).
>>>>>
>>>>> Most of the chang
Re: YTEX cTAKES 3.1.1 ready
Hi Vlad,

sorry that the instructions aren't clear.

re 1) What I am trying to say is install apache-ctakes-3.2.0-snapshot as usual (this is unchanged from 3.1.1). After that you still have to apply the lib and resources (these are things that cannot be distributed via apache).

re 2) Yes, I need to update those docs. Hopefully will get to that at some point. However, I assume you already have a UMLS DB (also assume SQL Server). If you can't/don't want to use your existing umls DB, please tell me. Then I'll prioritize upgrading the doc on importing the umls tables (the scripts are there).

best,

VJ

On Wed, Feb 5, 2014 at 4:44 PM, wrote:

> Hi VJ-
>
> so, with trial and error we were able to make the distribution and now have the apache-ctakes-3.1.2-SNAPSHOT-bin.zip archive.
>
> Here's what's unclear.
>
> 1. Is this now the only (combined) thing that you need for ctakes 3.1.1 + Ytex? The current documentation (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1), which most probably is outdated, talks about installing cTakes 3.1.1 first and then applying 2 SNAPSHOT archives (downloadable), lib and resources. This is a confusion point.
>
> 2. The directions to import the UMLS subset are then outdated as well. Maybe one should use the old version (ctakes 2.5 and ytex 0.8) to import the RRF files for the UMLS subset and then just use the resulting db. Thoughts?
>
> Thanks,
> Vlad Valtchinov
> Brigham Rad
>
> On Thursday, January 30, 2014 5:17:43 PM UTC-5, vijay garla wrote:
>> [...]
Re: YTEX cTAKES 3.1.1 ready
Hi Vlad,

All of ytex has been moved into ctakes; it is currently in a branch (https://svn.apache.org/repos/asf/ctakes/branches/ytex). You don't have to install ytex-0.8 - instead you will have to build and install from the ytex branch to create your own distribution. Steps 2 & 3 are correct.

Although it is a pain, if you have the jdk, maven, and svn, you can easily build your own distro:
* open a command prompt
* make sure jdk, maven, and svn are in your path
* cd to some directory where you want to check stuff out (I like c:\temp)
* run the following commands

    rmdir /s /q ctakes
    svn co https://svn.apache.org/repos/asf/ctakes/branches/ytex ctakes
    cd ctakes
    mvn clean install -DskipTests

And you will have the ctakes (with ytex) distro in ctakes\ctakes-distribution\target\apache-ctakes-3.1.2-SNAPSHOT-bin.zip

What is the process for getting the ytex branch merged into trunk? As I mentioned, there are very few changes to other ctakes classes/types - this should be completely complementary and not affect any existing ctakes functionality.

-vj

On Thu, Jan 30, 2014 at 4:56 PM, wrote:

> Hi VJ--
>
> this is great!! Thanks for all the hard work on it!
>
> We're starting to look into the new install. For now we're trying the binaries out.
>
> There were these questions about the proper install steps:
>
> 1. Do we first install ytex-0.8
> 2. Then install the new cTakes 3.1.1 instance and also apply the SNAPSHOT lib and resources zips
> 3. Work our way to install the UMLS ontologies in the db
>
> It is not entirely clear from the new document (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1) whether there's still a need to install ytex-0.8, or whether YTEX has been entirely merged into cTakes.
>
> If the last statement is correct, there are missing parts, i.e. in the UMLS install steps that are linked from the new ctakes 3.1.1 document.
>
> Thanks,
> vlad
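For anyone who would rather script the build than type the commands, the steps in the message above can be sketched in Python. This is only a sketch under assumptions: Python 3 is installed, `javac` is used as a stand-in for "the jdk is on the path", and the helper names (`missing_tools`, `build_commands`, `build`) are made up for this example. The branch URL and the Maven arguments are the ones given in the message.

```python
import shutil
import subprocess

# Branch URL taken from the message above.
YTEX_BRANCH = "https://svn.apache.org/repos/asf/ctakes/branches/ytex"


def missing_tools(tools=("javac", "mvn", "svn")):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def build_commands(checkout_dir="ctakes"):
    """Return the checkout and build steps as (argv, working_dir) pairs."""
    return [
        (["svn", "co", YTEX_BRANCH, checkout_dir], None),
        (["mvn", "clean", "install", "-DskipTests"], checkout_dir),
    ]


def build(checkout_dir="ctakes"):
    """Run the checkout and the build, stopping at the first failure."""
    missing = missing_tools()
    if missing:
        raise RuntimeError("not on PATH: " + ", ".join(missing))
    # Remove any previous checkout, like 'rmdir /s /q ctakes'.
    shutil.rmtree(checkout_dir, ignore_errors=True)
    for argv, cwd in build_commands(checkout_dir):
        # Resolve the executable first so a wrapper such as mvn.cmd is found.
        executable = shutil.which(argv[0])
        subprocess.run([executable] + argv[1:], cwd=cwd, check=True)


if __name__ == "__main__":
    build()
```

Run it from the directory where the checkout should live (c:\temp in the message). If the build succeeds, the distribution zip should be under ctakes\ctakes-distribution\target, as described above.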
Re: YTEX cTAKES 3.1.1 ready
Hi VJ--

this is great!! Thanks for all the hard work on it!

We're starting to look into the new install. For now we're trying the binaries out.

There were these questions about the proper install steps:

1. Do we first install ytex-0.8
2. Then install the new cTakes 3.1.1 instance and also apply the SNAPSHOT lib and resources zips
3. Work our way to install the UMLS ontologies in the db

It is not entirely clear from the new document (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1) whether there's still a need to install ytex-0.8, or whether YTEX has been entirely merged into cTakes.

If the last statement is correct, there are missing parts, i.e. in the UMLS install steps that are linked from the new ctakes 3.1.1 document.

Thanks,
vlad

On Friday, January 3, 2014 10:21:52 PM UTC-5, vijay garla wrote:

> Hello All,
>
> I have finished an initial cut at the port of YTEX to cTAKES 3.1.1. Most of the YTEX functionality has been ported and integrated with cTAKES, and I've tested with MySQL and MS SQL Server (oracle tests pending).
>
> Most of the changes were made in new projects - very little existing cTAKES code has been modified. The only non-trivial changes are in /ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api - here I modified CharacterOffsetToLineTokenConverterCtakesImpl & SingleDocumentProcessorCtakes to deal with newlines within sentences correctly. Can somebody take a look at the changes in the ytex branch?
>
> I believe that the branch https://svn.apache.org/repos/asf/ctakes/branches/ytex is ready to be merged into ctakes trunk, but would like other users to test it as well. Questions:
>
> * How can I distribute the ctakes binary distribution to ytex users before the merge? Can we make the branch build available somewhere? The binary distribution is too large to host on the ytex google code site (max 200 MB)
> * Non-ASF libraries - I have segregated these out into their own zip file that can be distributed via sourceforge. As a stopgap, I can upload this to the ytex google code site, but would prefer to upload to sourceforge.
> * UMLS Derivatives - Ditto for these - would like to move to sourceforge.
> * Documentation - How can I update the confluence docs? I would migrate the documentation from the google code website.
>
> Here are the installation instructions (putting the wagon in front of the horse ...)
>
> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1
>
> Best,
>
> VJ
Re: YTEX cTAKES 3.1.1 ready
Hi Bhaskar,

Thanks for working on this! I am not sure what is going wrong, but can you try doing a clean checkout and a "mvn clean install" from the command line? When I do this, all tests in all projects pass, no changes necessary.

I have never had the eclipse maven m2e plugin work flawlessly (congratulations to those of you who do); if the command line mvn clean install works, then I need to figure out why the build doesn't work from eclipse. I am using the 64-bit eclipse kepler, jdk 1.7, and maven 3.1.0; for me none of the projects with jcasgen plugins compile from eclipse.

Regarding the class not found exceptions: these classes are in CTAKES_HOME/lib/ctakes-core-3.1.2-SNAPSHOT.jar - can you make sure the classes are there? If not, something went wrong in the build of ctakes-core (again please verify that this works when you run maven from the command line).

I run this batch script which does a checkout, install, ytex setup, and runs the ytex CPE in a single go:

    @REM c:\java\setenv.bat - puts java, maven, and svn in the PATH
    @REM c:\temp\ctakes-build - where ctakes gets checked out
    @REM c:\temp - where I downloaded the ctakes resources, ytex resources & lib files
    @REM c:\java\apache-ctakes-3.1.2-SNAPSHOT - where ctakes gets installed
    call c:\java\setenv.bat
    cd C:\temp\ctakes-build
    rmdir /s /q ctakes
    svn co https://svn.apache.org/repos/asf/ctakes/branches/ytex ctakes
    cd ctakes
    @rem need to unset ctakes home
    set CTAKES_HOME=
    call mvn clean install
    cd c:\java
    rmdir /s /q apache-ctakes-3.1.2-SNAPSHOT
    jar xf C:\temp\ctakes-build\ctakes\ctakes-distribution\target\apache-ctakes-3.1.2-SNAPSHOT-bin.zip
    cd apache-ctakes-3.1.2-SNAPSHOT
    jar xf c:\temp\ctakes-resources-3.1.0.zip
    jar xf c:\temp\ctakes-ytex-resources-3.1.2-SNAPSHOT.zip
    jar xf c:\temp\ctakes-ytex-lib-3.1.2-SNAPSHOT.zip
    @rem stop here if you don't need to do a ytex setup
    @rem adjust this to match your environment - to use a different DB, copy a different ytex.properties file
    copy resources\org\apache\ctakes\ytex\ytex.properties.mssql.example resources\org\apache\ctakes\ytex\ytex.properties
    cd bin\ctakes-ytex\scripts
    call ..\..\ant.bat -f build-setup.xml all > setup.out 2>&1
    cd ..\..\..
    call bin\setenv.bat
    java -cp "%CLASSPATH%" -Dlog4j.configuration=file:/%CTAKES_HOME%/config/log4j.xml -Xms512M -Xmx2g org.apache.ctakes.ytex.tools.RunCPE desc\ctakes-ytex-uima\desc\cpe\fracture_demo.xml

Best,

Vijay

On Wed, Jan 8, 2014 at 7:41 AM, Bhaskar B wrote:

> Hi Vijay,
>
> Thank you for this update. In order to evaluate the YTEX port into cTAKES, I wanted to do the following (goals):
>
> (a) Compile the ctakes/branches/ytex to create apache-ctakes-3.1.2-SNAPSHOT.
> (b) Validate that the just compiled binary works by running either the AggregatePlaintextProcessor.xml or AggregatePlaintextUMLSProcessor.xml pipelines (basically following instructions at https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1+User+Install+Guide).
> (c) Follow instructions at https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1 to explore YTEX specific features, e.g. pipelines, writing annotations to database, etc.
>
> So I took the following steps:
>
> 1) Followed the instructions at https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1+Developer+Install+Guide and https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide to successfully (i) configure Eclipse (Juno) environment for building cTAKES, and (ii) pull the source code from the SVN repository branch: https://svn.apache.org/repos/asf/ctakes/branches/ytex.
>
> 2) In Eclipse, right clicked on the top-most (or root level) pom.xml, selected Run As -> Maven build, typed in "compile" as the goal, and hit Run. This successfully compiled all the projects.
>
> 3) In Eclipse, repeated step (2) but selected Run As -> Maven install to create a distribution. However this is where I started to encounter a few problems. I was eventually able to get Maven install to complete and create the binaries (i.e. in ctakes-distribution/target/) but by manually doing the following:
>
> 3.1) in pom.xml of ctakes-ytex-uima: excluded all tests
> 3.2) in pom.xml of ctakes-core: excluded 2 tests
> 3.3) in ctakes-ytex: modified scripts/build-classpath.xml and scripts/build-setup.xml to hardcode path to ANT library
> 3.4) in ctakes-dependency-parser: excluded 1 unit test
>
> 4) After this step, I extracted apache-ctakes-3.1.2-SNAPSHOT-bin.zip and attempted to verify (i.e. step (b) above). However when I attempted to load AggregatePlaintextProcessor, I am getting exception (below).
>
> While I continue to look to resolve this, any tips/hints that you could provide to get this build functional would be highly appreciated. I think I may be missing one or more key steps/configuration. My workstation is Windows 7 and I use Eclipse Juno.
>
> java.lang.Error: Unresolved compilation problems:
> The
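Vijay's reply asks whether the missing classes are really inside ctakes-core-3.1.2-SNAPSHOT.jar. One way to check without unpacking anything is sketched below. A jar is a zip archive, so Python's standard zipfile module can list its entries. The function name `classes_under` is made up for this example, and the default path in the usage part is only an assumption based on the install location in the batch script; adjust it to your own CTAKES_HOME.

```python
import sys
import zipfile


def classes_under(jar_path, package):
    """List the .class entries of a jar that live under the given package."""
    prefix = package.replace(".", "/") + "/"
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name
            for name in jar.namelist()
            if name.startswith(prefix) and name.endswith(".class")
        )


if __name__ == "__main__":
    # Example path only; pass your own jar as the first argument.
    default_jar = r"C:\java\apache-ctakes-3.1.2-SNAPSHOT\lib\ctakes-core-3.1.2-SNAPSHOT.jar"
    jar_file = sys.argv[1] if len(sys.argv) > 1 else default_jar
    found = classes_under(jar_file, "org.apache.ctakes.core.fsm")
    print(len(found), "classes found under org.apache.ctakes.core.fsm")
    for entry in found:
        print(entry)
```

An empty result would suggest that the fsm classes never made it into the jar, which points back at the ctakes-core build rather than at the classpath. Running "jar tf" on the same file lists the same entries if Python is not at hand.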
Re: YTEX cTAKES 3.1.1 ready
Hi Vijay,

Thank you for this update. In order to evaluate the YTEX port into cTAKES, I wanted to do the following (goals):

(a) Compile the ctakes/branches/ytex to create apache-ctakes-3.1.2-SNAPSHOT.
(b) Validate that the just compiled binary works by running either the AggregatePlaintextProcessor.xml or AggregatePlaintextUMLSProcessor.xml pipelines (basically following instructions at https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1+User+Install+Guide).
(c) Follow instructions at https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1 to explore YTEX specific features, e.g. pipelines, writing annotations to database, etc.

So I took the following steps:

1) Followed the instructions at https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1+Developer+Install+Guide and https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide to successfully (i) configure Eclipse (Juno) environment for building cTAKES, and (ii) pull the source code from the SVN repository branch: https://svn.apache.org/repos/asf/ctakes/branches/ytex.

2) In Eclipse, right clicked on the top-most (or root level) pom.xml, selected Run As -> Maven build, typed in "compile" as the goal, and hit Run. This successfully compiled all the projects.

3) In Eclipse, repeated step (2) but selected Run As -> Maven install to create a distribution. However this is where I started to encounter a few problems. I was eventually able to get Maven install to complete and create the binaries (i.e. in ctakes-distribution/target/) but by manually doing the following:

3.1) in pom.xml of ctakes-ytex-uima: excluded all tests
3.2) in pom.xml of ctakes-core: excluded 2 tests
3.3) in ctakes-ytex: modified scripts/build-classpath.xml and scripts/build-setup.xml to hardcode path to ANT library
3.4) in ctakes-dependency-parser: excluded 1 unit test

4) After this step, I extracted apache-ctakes-3.1.2-SNAPSHOT-bin.zip and attempted to verify (i.e. step (b) above). However when I attempted to load AggregatePlaintextProcessor, I am getting exception (below).

While I continue to look to resolve this, any tips/hints that you could provide to get this build functional would be highly appreciated. I think I may be missing one or more key steps/configuration. My workstation is Windows 7 and I use Eclipse Juno.

java.lang.Error: Unresolved compilation problems:
    The import org.apache.ctakes.core.fsm.machine cannot be resolved
    The import org.apache.ctakes.core.fsm.machine cannot be resolved
    The import org.apache.ctakes.core.fsm.machine cannot be resolved
    The import org.apache.ctakes.core.fsm.machine cannot be resolved
    The import org.apache.ctakes.core.fsm.machine cannot be resolved
    The import org.apache.ctakes.core.fsm.machine cannot be resolved
    The import org.apache.ctakes.core.fsm.machine cannot be resolved
    The import org.apache.ctakes.core.fsm.output cannot be resolved
    The import org.apache.ctakes.core.fsm.output cannot be resolved
    The import org.apache.ctakes.core.fsm.output cannot be resolved
    The import org.apache.ctakes.core.fsm.output cannot be resolved
    The import org.apache.ctakes.core.fsm.output cannot be resolved
    The import org.apache.ctakes.core.fsm.output cannot be resolved
    The import org.apache.ctakes.core.fsm.output cannot be resolved
    The import org.apache.ctakes.core.fsm.token.BaseToken cannot be resolved
    The import org.apache.ctakes.core.fsm.token.EolToken cannot be resolved
    DateFSM cannot be resolved to a type
    TimeFSM cannot be resolved to a type
    FractionFSM cannot be resolved to a type
    RomanNumeralFSM cannot be resolved to a type
    RangeFSM cannot be resolved to a type
    MeasurementFSM cannot be resolved to a type
    PersonTitleFSM cannot be resolved to a type
    DateFSM cannot be resolved to a type
    DateFSM cannot be resolved to a type
    TimeFSM cannot be resolved to a type
    TimeFSM cannot be resolved to a type
    FractionFSM cannot be resolved to a type
    FractionFSM cannot be resolved to a type
    RomanNumeralFSM cannot be resolved to a type
    RomanNumeralFSM cannot be resolved to a type
    RangeFSM cannot be resolved to a type
    RangeFSM cannot be resolved to a type
    MeasurementFSM cannot be resolved to a type
    MeasurementFSM cannot be resolved to a type
    PersonTitleFSM cannot be resolved to a type
    PersonTitleFSM cannot be resolved to a type
    BaseToken cannot be resolved to a type
    BaseToken cannot be resolved to a type
    BaseToken cannot be resolved to a type
    The method adaptToBaseToken(BaseToken) from the type ContextDependentTokenizerAnnotator refers to the missing type BaseToken
    EolToken cannot be resolved to a type
    BaseToken cannot be re
Re: YTEX cTAKES 3.1.1 ready
see answers inline

On Tue, Jan 7, 2014 at 10:35 AM, Chen, Pei wrote:

> > * How can I distribute the ctakes binary distribution to ytex users before the merge? Can we make the branch build available somewhere? The binary distribution is too large to host on the ytex google code site (max 200 MB)
> Is this for testing purposes? Or official release? If it's just for testing, there will be more options...
> Are you referring to the convenience binary/zip file? Or maven artifacts that could be deployed to the SNAPSHOTS repo [1]?
> If it's for testing, you can always have users build from source via mvn package (assuming you added the ytex* to the ctakes-distribution module)?
> Again if it's for testing, you can always try the svn or home dir. But it's not the recommended channel for actual distribution to users because that normally has to go through the normal release process (Voting, etc.).

This is for testing. Ytex has been added to the ctakes distro.

> > * Non-ASF libraries - I have segregated these out into their own zip file that can be distributed via sourceforge. As a stopgap, I can upload this to the ytex google code site, but would prefer to upload to sourceforge.
> Are these optional 3rd party libs available via maven central?

Most of them are. The only exception is the MS SQL Driver, which is freely redistributable (see http://msdn.microsoft.com/en-us/sqlserver/aa937725). I did not find anything similar for the oracle jdbc driver so I left that out (users will have to download that separately). The zip is here: https://ytex.googlecode.com/files/ctakes-ytex-lib-3.1.2-SNAPSHOT.zip

> > * UMLS Derivatives - Ditto for these - would like to move to sourceforge.
> Are you planning to distribute them via maven central? I think it would be nice to make these available as maven artifacts.
> If so, what is your sourceforge id? We can grant you access to the existing ctakes resources project [2]:
> The pom.xml is already setup to upload to OSS Sonatype (request a login for oss sonatype to perform a mvn deploy for the actual upload later on)...

I have placed the umls resources behind a server that requires UTS authentication (note that this obviates the need for supplying umls username and password in ctakes config files/scripts). The umls resources are here: http://www.ytex-nlp.org/umls.download/secure/3.1/ctakes-ytex-resources-3.1.2-SNAPSHOT.zip

This is a plain old apache http server with the module for CAS (the other CAS) authentication. If ctakes has an apache server somewhere, we could do the same.

> > * Documentation - How can I update the confluence docs? I would migrate the documentation from the google code website.
> This would be great; You've been added to the cTAKES confluence space [3].
>
> Downloading the code now... To be continued...
>
> [1] https://repository.apache.org/content/groups/snapshots/org/apache/ctakes/
> [2] http://sourceforge.net/p/ctakesresources/code/HEAD/tree/trunk/
> [3] https://cwiki.apache.org/confluence/display/CTAKES/cTAKES
RE: YTEX cTAKES 3.1.1 ready
> * How can I distribute the ctakes binary distribution to ytex users before the merge? Can we make the branch build available somewhere? The binary distribution is too large to host on the ytex google code site (max 200 MB)

Is this for testing purposes? Or official release? If it's just for testing, there will be more options...
Are you referring to the convenience binary/zip file? Or maven artifacts that could be deployed to the SNAPSHOTS repo [1]?
If it's for testing, you can always have users build from source via mvn package (assuming you added the ytex* to the ctakes-distribution module)?
Again if it's for testing, you can always try the svn or home dir. But it's not the recommended channel for actual distribution to users because that normally has to go through the normal release process (Voting, etc.).

> * Non-ASF libraries - I have segregated these out into their own zip file that can be distributed via sourceforge. As a stopgap, I can upload this to the ytex google code site, but would prefer to upload to sourceforge.

Are these optional 3rd party libs available via maven central?

> * UMLS Derivatives - Ditto for these - would like to move to sourceforge.

Are you planning to distribute them via maven central? I think it would be nice to make these available as maven artifacts.
If so, what is your sourceforge id? We can grant you access to the existing ctakes resources project [2]:
The pom.xml is already setup to upload to OSS Sonatype (request a login for oss sonatype to perform a mvn deploy for the actual upload later on)...

> * Documentation - How can I update the confluence docs? I would migrate the documentation from the google code website.

This would be great; You've been added to the cTAKES confluence space [3].

Downloading the code now... To be continued...

[1] https://repository.apache.org/content/groups/snapshots/org/apache/ctakes/
[2] http://sourceforge.net/p/ctakesresources/code/HEAD/tree/trunk/
[3] https://cwiki.apache.org/confluence/display/CTAKES/cTAKES

> -----Original Message-----
> From: vijay garla [mailto:vnga...@gmail.com]
> Sent: Friday, January 03, 2014 10:23 PM
> To: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org
> Subject: YTEX cTAKES 3.1.1 ready
>
> Hello All,
>
> I have finished an initial cut at the port of YTEX to cTAKES 3.1.1. Most of the YTEX functionality has been ported and integrated with cTAKES, and I've tested with MySQL and MS SQL Server (oracle tests pending).
>
> Most of the changes were made in new projects - very little existing cTAKES code has been modified. The only non-trivial changes are in /ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api - here I modified CharacterOffsetToLineTokenConverterCtakesImpl & SingleDocumentProcessorCtakes to deal with newlines within sentences correctly. Can somebody take a look at the changes in the ytex branch?
>
> I believe that the branch https://svn.apache.org/repos/asf/ctakes/branches/ytex is ready to be merged into ctakes trunk, but would like other users to test it as well. Questions:
>
> * How can I distribute the ctakes binary distribution to ytex users before the merge? Can we make the branch build available somewhere? The binary distribution is too large to host on the ytex google code site (max 200 MB)
> * Non-ASF libraries - I have segregated these out into their own zip file that can be distributed via sourceforge. As a stopgap, I can upload this to the ytex google code site, but would prefer to upload to sourceforge.
> * UMLS Derivatives - Ditto for these - would like to move to sourceforge.
> * Documentation - How can I update the confluence docs? I would migrate the documentation from the google code website.
>
> Here are the installation instructions (putting the wagon in front of the horse ...)
>
> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1
>
> Best,
>
> VJ
Re: YTEX cTAKES 3.1.1 ready
This is awesome VJ. I'll take a look at it this week unless someone beats me to it

> On Jan 3, 2014, at 10:22 PM, "vijay garla" wrote:
>
> Hello All,
>
> I have finished an initial cut at the port of YTEX to cTAKES 3.1.1. Most of the YTEX functionality has been ported and integrated with cTAKES, and I've tested with MySQL and MS SQL Server (oracle tests pending).
>
> Most of the changes were made in new projects - very little existing cTAKES code has been modified. The only non-trivial changes are in /ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api - here I modified CharacterOffsetToLineTokenConverterCtakesImpl & SingleDocumentProcessorCtakes to deal with newlines within sentences correctly. Can somebody take a look at the changes in the ytex branch?
>
> I believe that the branch https://svn.apache.org/repos/asf/ctakes/branches/ytex is ready to be merged into ctakes trunk, but would like other users to test it as well. Questions:
>
> * How can I distribute the ctakes binary distribution to ytex users before the merge? Can we make the branch build available somewhere? The binary distribution is too large to host on the ytex google code site (max 200 MB)
> * Non-ASF libraries - I have segregated these out into their own zip file that can be distributed via sourceforge. As a stopgap, I can upload this to the ytex google code site, but would prefer to upload to sourceforge.
> * UMLS Derivatives - Ditto for these - would like to move to sourceforge.
> * Documentation - How can I update the confluence docs? I would migrate the documentation from the google code website.
>
> Here are the installation instructions (putting the wagon in front of the horse ...)
>
> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1
>
> Best,
>
> VJ
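The announcement quoted above mentions changes so that the assertion code deals with newlines within sentences correctly. As a rough illustration of the kind of bookkeeping involved, and explicitly not the code in the ytex branch, the Python sketch below maps a character offset to a (sentence, token) position when sentence boundaries are given as character spans instead of being derived from line breaks. The function name `locate`, the span format, and the whitespace tokenization are all assumptions made for this example.

```python
import re


def locate(text, sentence_spans, offset):
    """Map a character offset to (sentence index, token index).

    sentence_spans is a list of (begin, end) character offsets with the
    end exclusive. Tokens are whitespace-delimited, so a newline inside
    a sentence is treated like any other whitespace and does not start
    a new "line". Returns None when the offset falls outside every
    sentence or on whitespace.
    """
    for sent_idx, (begin, end) in enumerate(sentence_spans):
        if begin <= offset < end:
            tokens = re.finditer(r"\S+", text[begin:end])
            for tok_idx, match in enumerate(tokens):
                if begin + match.start() <= offset < begin + match.end():
                    return sent_idx, tok_idx
            return None
    return None


if __name__ == "__main__":
    note = "No evidence of\nfracture. Follow up."
    spans = [(0, 24), (25, 35)]  # the first sentence contains a newline
    print(locate(note, spans, note.index("fracture")))
```

The point of the sketch is only that "fracture" stays the fourth token of the first sentence even though it sits on the second physical line of the note. A converter that equates lines with sentences would place it at the start of a new line instead.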