Re: YTEX cTAKES 3.1.1 ready

2014-02-07 Thread John Green
Completely non-contributory, but it is odd/humorous to see the headaches
that the notes we write quickly in the 5 minutes post-encounter cause for
free-text analysis.

JG


On Thu, Feb 6, 2014 at 1:27 PM, Finan, Sean <
sean.fi...@childrens.harvard.edu> wrote:

> Right, got it.  I just wanted to let you know that some EMR notes -do-
> require sentence splitting at newline characters.
>
> -Original Message-
> From: vijay garla [mailto:vnga...@gmail.com]
> Sent: Thursday, February 06, 2014 1:06 PM
> To: dev@ctakes.apache.org
> Cc: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org;
> vlad.valtchi...@gmail.com
> Subject: Re: YTEX cTAKES 3.1.1 ready
>
> The cTAKES sentence detector is not changed in the YTEX branch.  The YTEX
> branch has an *additional* sentence detector that does not automatically
> split sentences on newlines - users can use this if they like.
>
> -vj
>
>
> On Thu, Feb 6, 2014 at 1:01 PM, Finan, Sean <
> sean.fi...@childrens.harvard.edu> wrote:
>
> > Hi Vijay,
> >
> > >  I have yet to run across clinical text from a real EMR where
> > > newlines
> > represent the end of a sentence
> >
> > Since James pointed out this possibility a couple weeks ago, I have
> > kept my eyes open.  The problem is pretty ubiquitous in a corpus that
> > I'm working with right now.  I just opened the first note and gave it
> > a count ... 95 lines total, 9 are sentence/phrase (lacking punctuation)
> endings.
> >  This is not including lists, which comprise about half of the note.
> > One possible conjoinment was "Will consider [...] biopsy\nGiven [...]".
> >  Depending upon how cTakes deals with it, the meaning could change
> > drastically.
> >
> > > I believe cTAKES absolutely has to support sentences with newlines
> > within them
> >
> > Yes, cTakes should do so, but I hope that you aren't suggesting that
> > it only support such a structure.
> >
> > Where is that easy button?
> >
> > -Original Message-
> > From: vijay garla [mailto:vnga...@gmail.com]
> > Sent: Thursday, February 06, 2014 10:31 AM
> > To: dev@ctakes.apache.org
> > Cc: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org;
> > vlad.valtchi...@gmail.com
> > Subject: Re: YTEX cTAKES 3.1.1 ready
> >
> > I believe it is worth migrating to trunk.
> >
> > Note that the sentence detector is also complementary - the existing
> > ctakes sentence detector is unchanged - users can choose which
> > sentence detector to use.  There are changes to assertion & dependency
> > parsing to support sentences without newlines, and that works with
> > both sentence detectors.
> >
> > I believe cTAKES absolutely has to support sentences with newlines
> > within them - I have yet to run across clinical text from a real EMR
> > where newlines represent the end of a sentence - the changes to
> > assertion & dependency parsing will have to be done at some point.
> >
> > -vj
> >
> >
> > On Thu, Feb 6, 2014 at 10:19 AM, Chen, Pei
> > wrote:
> >
> > > VJ,
> > > Aside from the changes to the existing cTAKES code (sentence
> > > detector,
> > > etc.) [which we could leave out if it's still being debated], Do you
> > > think it's worth migrating the ytex code to trunk at this point?
> > >  As you mentioned earlier, it's largely complementary.
> > > [I was just thinking of saving effort to maintain the separate
> > > branch and for simplicity for dev...]
> > >
> > > --Pei
> > >
> > > > -Original Message-
> > > > From: vijay garla [mailto:vnga...@gmail.com]
> > > > Sent: Wednesday, February 05, 2014 9:30 PM
> > > > To: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org;
> > > > vlad.valtchi...@gmail.com
> > > > Subject: Re: YTEX cTAKES 3.1.1 ready
> > > >
> > > > Hi Vlad,
> > > >
> > > > I Updated the umls install guide; see
> > > > https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1
> > > >
> > > > I would prefer to add the docs in the ctakes confluence, but as
> > > > far as I
> > > can
> > > > tell, I don't have write access there - can somebody give me write
> > > privileges
> > > > on the ctakes confluence site?
> > > >
> > > > There was a bug in the umls install; copy
> > > > https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-

RE: YTEX cTAKES 3.1.1 ready

2014-02-06 Thread Finan, Sean
Right, got it.  I just wanted to let you know that some EMR notes -do- require 
sentence splitting at newline characters.

-Original Message-
From: vijay garla [mailto:vnga...@gmail.com] 
Sent: Thursday, February 06, 2014 1:06 PM
To: dev@ctakes.apache.org
Cc: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; 
vlad.valtchi...@gmail.com
Subject: Re: YTEX cTAKES 3.1.1 ready

The cTAKES sentence detector is not changed in the YTEX branch.  The YTEX 
branch has an *additional* sentence detector that does not automatically split 
sentences on newlines - users can use this if they like.

-vj


On Thu, Feb 6, 2014 at 1:01 PM, Finan, Sean < sean.fi...@childrens.harvard.edu> 
wrote:

> Hi Vijay,
>
> >  I have yet to run across clinical text from a real EMR where 
> > newlines
> represent the end of a sentence
>
> Since James pointed out this possibility a couple weeks ago, I have 
> kept my eyes open.  The problem is pretty ubiquitous in a corpus that 
> I'm working with right now.  I just opened the first note and gave it 
> a count ... 95 lines total, 9 are sentence/phrase (lacking punctuation) 
> endings.
>  This is not including lists, which comprise about half of the note.
> One possible conjoinment was "Will consider [...] biopsy\nGiven [...]".
>  Depending upon how cTakes deals with it, the meaning could change 
> drastically.
>
> > I believe cTAKES absolutely has to support sentences with newlines
> within them
>
> Yes, cTakes should do so, but I hope that you aren't suggesting that 
> it only support such a structure.
>
> Where is that easy button?
>
> -Original Message-
> From: vijay garla [mailto:vnga...@gmail.com]
> Sent: Thursday, February 06, 2014 10:31 AM
> To: dev@ctakes.apache.org
> Cc: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; 
> vlad.valtchi...@gmail.com
> Subject: Re: YTEX cTAKES 3.1.1 ready
>
> I believe it is worth migrating to trunk.
>
> Note that the sentence detector is also complementary - the existing 
> ctakes sentence detector is unchanged - users can choose which 
> sentence detector to use.  There are changes to assertion & dependency 
> parsing to support sentences without newlines, and that works with 
> both sentence detectors.
>
> I believe cTAKES absolutely has to support sentences with newlines 
> within them - I have yet to run across clinical text from a real EMR 
> where newlines represent the end of a sentence - the changes to 
> assertion & dependency parsing will have to be done at some point.
>
> -vj
>
>
> On Thu, Feb 6, 2014 at 10:19 AM, Chen, Pei
> wrote:
>
> > VJ,
> > Aside from the changes to the existing cTAKES code (sentence 
> > detector,
> > etc.) [which we could leave out if it's still being debated], Do you 
> > think it's worth migrating the ytex code to trunk at this point?
> >  As you mentioned earlier, it's largely complementary.
> > [I was just thinking of saving effort to maintain the separate 
> > branch and for simplicity for dev...]
> >
> > --Pei
> >
> > > -Original Message-
> > > From: vijay garla [mailto:vnga...@gmail.com]
> > > Sent: Wednesday, February 05, 2014 9:30 PM
> > > To: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; 
> > > vlad.valtchi...@gmail.com
> > > Subject: Re: YTEX cTAKES 3.1.1 ready
> > >
> > > Hi Vlad,
> > >
> > > I Updated the umls install guide; see
> > > https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1
> > >
> > > I would prefer to add the docs in the ctakes confluence, but as 
> > > far as I
> > can
> > > tell, I don't have write access there - can somebody give me write
> > privileges
> > > on the ctakes confluence site?
> > >
> > > There was a bug in the umls install; copy
> > > > https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml over
> > > the corresponding file in your ctakes-3.1.2 install
> > > (CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set.
> > > The import is currently running on the UMLS 2013AA (I assume this 
> > > will
> > complete
> > > without issues as long as the umls schema hasn't changed from 2012).
> > >
> > > what trial and error did you have to go through to build the distro?
> > >
> > > -vj
> > >
> > >
> > > On Wed, Feb 5, 2014 at 5:33 PM, vijay garla  wrote:
> > >
> > > > Hi Vlad,
> > > >
> > > > sorry that the instructions aren't c

Re: YTEX cTAKES 3.1.1 ready

2014-02-06 Thread vijay garla
The cTAKES sentence detector is not changed in the YTEX branch.  The YTEX
branch has an *additional* sentence detector that does not automatically
split sentences on newlines - users can use this if they like.
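
To illustrate the difference between the two behaviours, here is a minimal
sketch in Python. It is not the cTAKES or YTEX code: the function name
`split_sentences`, the `split_on_newline` flag, and the sample note are all
assumptions made up for illustration, and the punctuation rule is deliberately
crude.

```python
import re

# Hypothetical illustration only; not the cTAKES or YTEX sentence detector.
# A sentence boundary here is whitespace that follows '.', '!' or '?'.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text, split_on_newline=True):
    """Split text into sentences, optionally treating newlines as hard breaks."""
    if split_on_newline:
        # Every newline ends a sentence, whatever the punctuation.
        chunks = text.split("\n")
    else:
        # Newlines are ordinary whitespace, so a sentence may span lines.
        chunks = [" ".join(text.split())]
    sentences = []
    for chunk in chunks:
        for piece in _BOUNDARY.split(chunk.strip()):
            if piece:
                sentences.append(piece)
    return sentences


# Invented sample: the first line is a hard-wrapped sentence.
note = "Patient denies chest\npain. Will follow up."
with_breaks = split_sentences(note, split_on_newline=True)
without_breaks = split_sentences(note, split_on_newline=False)
```

With the flag on, the wrapped sentence is cut in two ("Patient denies chest" and
"pain."); with it off, the note yields "Patient denies chest pain." and
"Will follow up." Neither setting is right for every note, which is why having
both detectors available is useful.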

-vj


On Thu, Feb 6, 2014 at 1:01 PM, Finan, Sean <
sean.fi...@childrens.harvard.edu> wrote:

> Hi Vijay,
>
> >  I have yet to run across clinical text from a real EMR where newlines
> represent the end of a sentence
>
> Since James pointed out this possibility a couple weeks ago, I have kept
> my eyes open.  The problem is pretty ubiquitous in a corpus that I'm
> working with right now.  I just opened the first note and gave it a count
> ... 95 lines total, 9 are sentence/phrase (lacking punctuation) endings.
>  This is not including lists, which comprise about half of the note.
> One possible conjoinment was "Will consider [...] biopsy\nGiven [...]".
>  Depending upon how cTakes deals with it, the meaning could change
> drastically.
>
> > I believe cTAKES absolutely has to support sentences with newlines
> within them
>
> Yes, cTakes should do so, but I hope that you aren't suggesting that it
> only support such a structure.
>
> Where is that easy button?
>
> -Original Message-
> From: vijay garla [mailto:vnga...@gmail.com]
> Sent: Thursday, February 06, 2014 10:31 AM
> To: dev@ctakes.apache.org
> Cc: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org;
> vlad.valtchi...@gmail.com
> Subject: Re: YTEX cTAKES 3.1.1 ready
>
> I believe it is worth migrating to trunk.
>
> Note that the sentence detector is also complementary - the existing
> ctakes sentence detector is unchanged - users can choose which sentence
> detector to use.  There are changes to assertion & dependency parsing to
> support sentences without newlines, and that works with both sentence
> detectors.
>
> I believe cTAKES absolutely has to support sentences with newlines within
> them - I have yet to run across clinical text from a real EMR where
> newlines represent the end of a sentence - the changes to assertion &
> dependency parsing will have to be done at some point.
>
> -vj
>
>
> On Thu, Feb 6, 2014 at 10:19 AM, Chen, Pei
> wrote:
>
> > VJ,
> > Aside from the changes to the existing cTAKES code (sentence detector,
> > etc.) [which we could leave out if it's still being debated], Do you
> > think it's worth migrating the ytex code to trunk at this point?
> >  As you mentioned earlier, it's largely complementary.
> > [I was just thinking of saving effort to maintain the separate branch
> > and for simplicity for dev...]
> >
> > --Pei
> >
> > > -Original Message-
> > > From: vijay garla [mailto:vnga...@gmail.com]
> > > Sent: Wednesday, February 05, 2014 9:30 PM
> > > To: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org;
> > > vlad.valtchi...@gmail.com
> > > Subject: Re: YTEX cTAKES 3.1.1 ready
> > >
> > > Hi Vlad,
> > >
> > > I Updated the umls install guide; see
> > > https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1
> > >
> > > I would prefer to add the docs in the ctakes confluence, but as far
> > > as I
> > can
> > > tell, I don't have write access there - can somebody give me write
> > privileges
> > > on the ctakes confluence site?
> > >
> > > There was a bug in the umls install; copy
> > > > https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml over
> > > the corresponding file in your ctakes-3.1.2 install
> > > (CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set.
> > > The import is currently running on the UMLS 2013AA (I assume this
> > > will
> > complete
> > > without issues as long as the umls schema hasn't changed from 2012).
> > >
> > > what trial and error did you have to go through to build the distro?
> > >
> > > -vj
> > >
> > >
> > > On Wed, Feb 5, 2014 at 5:33 PM, vijay garla  wrote:
> > >
> > > > Hi Vlad,
> > > >
> > > > sorry that the instructions aren't clear.
> > > >
> > > > re 1) What I am trying to say is install
> > > > apache-ctakes-3.2.0-snapshot as usual (this is unchanged from
> > > > 3.1.1).  After that you still have to apply the lib and resources
> > > > (these are things that cannot be distributed via apache).
> > > >
> > > > re 2) Yes, I need to update those docs.  Hopefully will get to
> > >

RE: YTEX cTAKES 3.1.1 ready

2014-02-06 Thread Finan, Sean
Hi Vijay, 

>  I have yet to run across clinical text from a real EMR where newlines 
> represent the end of a sentence

Since James pointed out this possibility a couple weeks ago, I have kept my 
eyes open.  The problem is pretty ubiquitous in a corpus that I'm working with 
right now.  I just opened the first note and gave it a count ... 95 lines 
total, 9 are sentence/phrase (lacking punctuation) endings.  This is not 
including lists, which comprise about half of the note.
One possible conjoinment was "Will consider [...] biopsy\nGiven [...]".  
Depending upon how cTakes deals with it, the meaning could change drastically.
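
A count like the one above can be roughed out automatically. The sketch below
is an assumption-laden heuristic in Python, not anything from cTAKES: the
function name `count_bare_line_endings` and the sample note are invented, and
"sentence/phrase ending" is approximated as a line with no closing punctuation
that is followed by a line starting with a capital letter (or by nothing).

```python
# Hypothetical heuristic for illustration; real notes need manual review.
TERMINATORS = ".!?:;,"


def count_bare_line_endings(note):
    """Return (non-empty line count, lines that end a phrase without punctuation)."""
    lines = [ln.strip() for ln in note.splitlines() if ln.strip()]
    bare = 0
    for i, line in enumerate(lines):
        if line[-1] in TERMINATORS:
            continue  # the line already ends with punctuation
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        # No following line, or one that starts with a capital letter,
        # suggests the newline is acting as the sentence end.
        if not nxt or nxt[0].isupper():
            bare += 1
    return len(lines), bare


# Invented sample note, not taken from any real corpus.
sample = "Will consider biopsy\nGiven history, will follow up\nin two weeks.\nPt stable."
total, bare = count_bare_line_endings(sample)
```

On the invented sample this reports 4 lines, 1 of which ends a phrase without
punctuation; the second line is skipped because the next line starts in lower
case and so looks like a plain wrap. List items and headings would need
separate handling.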

> I believe cTAKES absolutely has to support sentences with newlines within them

Yes, cTakes should do so, but I hope that you aren't suggesting that it only 
support such a structure.

Where is that easy button?

-Original Message-
From: vijay garla [mailto:vnga...@gmail.com] 
Sent: Thursday, February 06, 2014 10:31 AM
To: dev@ctakes.apache.org
Cc: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; 
vlad.valtchi...@gmail.com
Subject: Re: YTEX cTAKES 3.1.1 ready

I believe it is worth migrating to trunk.

Note that the sentence detector is also complementary - the existing ctakes 
sentence detector is unchanged - users can choose which sentence detector to 
use.  There are changes to assertion & dependency parsing to support sentences 
without newlines, and that works with both sentence detectors.

I believe cTAKES absolutely has to support sentences with newlines within them 
- I have yet to run across clinical text from a real EMR where newlines 
represent the end of a sentence - the changes to assertion & dependency parsing 
will have to be done at some point.

-vj


On Thu, Feb 6, 2014 at 10:19 AM, Chen, Pei
wrote:

> VJ,
> Aside from the changes to the existing cTAKES code (sentence detector,
> etc.) [which we could leave out if it's still being debated], Do you 
> think it's worth migrating the ytex code to trunk at this point?
>  As you mentioned earlier, it's largely complementary.
> [I was just thinking of saving effort to maintain the separate branch 
> and for simplicity for dev...]
>
> --Pei
>
> > -Original Message-
> > From: vijay garla [mailto:vnga...@gmail.com]
> > Sent: Wednesday, February 05, 2014 9:30 PM
> > To: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; 
> > vlad.valtchi...@gmail.com
> > Subject: Re: YTEX cTAKES 3.1.1 ready
> >
> > Hi Vlad,
> >
> > I Updated the umls install guide; see
> > https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1
> >
> > I would prefer to add the docs in the ctakes confluence, but as far 
> > as I
> can
> > tell, I don't have write access there - can somebody give me write
> privileges
> > on the ctakes confluence site?
> >
> > There was a bug in the umls install; copy
> > https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml over
> > the corresponding file in your ctakes-3.1.2 install
> > (CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set.  
> > The import is currently running on the UMLS 2013AA (I assume this 
> > will
> complete
> > without issues as long as the umls schema hasn't changed from 2012).
> >
> > what trial and error did you have to go through to build the distro?
> >
> > -vj
> >
> >
> > On Wed, Feb 5, 2014 at 5:33 PM, vijay garla  wrote:
> >
> > > Hi Vlad,
> > >
> > > sorry that the instructions aren't clear.
> > >
> > > re 1) What I am trying to say is install 
> > > apache-ctakes-3.2.0-snapshot as usual (this is unchanged from 
> > > 3.1.1).  After that you still have to apply the lib and resources 
> > > (these are things that cannot be distributed via apache).
> > >
> > > re 2) Yes, I need to update those docs.  Hopefully will get to 
> > > that at some point.  However, I assume you already have a UMLS DB 
> > > (also assume SQL Server).  If you can't/don't want to use your 
> > > > existing umls DB, please tell me.  Then I'll prioritize upgrading
> > > the doc on importing the umls tables (the scripts are there).
> > >
> > > best,
> > >
> > > VJ
> > >
> > >
> > > On Wed, Feb 5, 2014 at 4:44 PM,  wrote:
> > >
> > >> Hi VJ-
> > >>
> > > >> so, with trial and error we were able to make the distribution and
> > >> now have the apache-ctakes-3.1.2-SNAPSHOT-bin.zip archive.
> > >>
> > >> Here's what's unclear.
> > >>

Re: YTEX cTAKES 3.1.1 ready

2014-02-06 Thread vijay garla
I believe it is worth migrating to trunk.

Note that the sentence detector is also complementary - the existing ctakes
sentence detector is unchanged - users can choose which sentence detector
to use.  There are changes to assertion & dependency parsing to support
sentences without newlines, and that works with both sentence detectors.

I believe cTAKES absolutely has to support sentences with newlines within
them - I have yet to run across clinical text from a real EMR where
newlines represent the end of a sentence - the changes to assertion &
dependency parsing will have to be done at some point.

-vj


On Thu, Feb 6, 2014 at 10:19 AM, Chen, Pei
wrote:

> VJ,
> Aside from the changes to the existing cTAKES code (sentence detector,
> etc.) [which we could leave out if it's still being debated],
> Do you think it's worth migrating the ytex code to trunk at this point?
>  As you mentioned earlier, it's largely complementary.
> [I was just thinking of saving effort to maintain the separate branch and
> for simplicity for dev...]
>
> --Pei
>
> > -Original Message-
> > From: vijay garla [mailto:vnga...@gmail.com]
> > Sent: Wednesday, February 05, 2014 9:30 PM
> > To: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org;
> > vlad.valtchi...@gmail.com
> > Subject: Re: YTEX cTAKES 3.1.1 ready
> >
> > Hi Vlad,
> >
> > I Updated the umls install guide; see
> > https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1
> >
> > I would prefer to add the docs in the ctakes confluence, but as far as I
> can
> > tell, I don't have write access there - can somebody give me write
> privileges
> > on the ctakes confluence site?
> >
> > There was a bug in the umls install; copy
> > https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml over
> > the corresponding file in your ctakes-3.1.2 install
> > (CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set.  The
> > import is currently running on the UMLS 2013AA (I assume this will
> complete
> > without issues as long as the umls schema hasn't changed from 2012).
> >
> > what trial and error did you have to go through to build the distro?
> >
> > -vj
> >
> >
> > On Wed, Feb 5, 2014 at 5:33 PM, vijay garla  wrote:
> >
> > > Hi Vlad,
> > >
> > > sorry that the instructions aren't clear.
> > >
> > > re 1) What I am trying to say is install apache-ctakes-3.2.0-snapshot
> > > as usual (this is unchanged from 3.1.1).  After that you still have to
> > > apply the lib and resources (these are things that cannot be
> > > distributed via apache).
> > >
> > > re 2) Yes, I need to update those docs.  Hopefully will get to that at
> > > some point.  However, I assume you already have a UMLS DB (also assume
> > > SQL Server).  If you can't/don't want to use your existing umls DB,
> > > > please tell me.  Then I'll prioritize upgrading the doc on importing the
> > > umls tables (the scripts are there).
> > >
> > > best,
> > >
> > > VJ
> > >
> > >
> > > On Wed, Feb 5, 2014 at 4:44 PM,  wrote:
> > >
> > >> Hi VJ-
> > >>
> > > >> so, with trial and error we were able to make the distribution and now
> > >> have the apache-ctakes-3.1.2-SNAPSHOT-bin.zip archive.
> > >>
> > >> Here's what's unclear.
> > >>
> > >> 1. Is now this the only (combined) thing that you need for ctakes
> > >> 3.1.1 + Ytex?
> > > >> the current documentation
> > > >> (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
> > >> which most probably is outdated, talks about installing cTakes 3.1.1
> > >> first and then applying 2 SNAPSHOT archives (downloadable) , lib and
> > >> resources.
> > >> This is a confusion point.
> > >>
> > >> 2. The directions to import UMLS subset are then outdated as well.
> > >> Maybe one should use the old version (ctakes 2.5 and ytex 0.8) to
> > >> import the RRF files for the UMLS subset and then just use the
> > >> resulting db. Thoughts?
> > >>
> > >> Thanks,
> > >> Vlad Valtchinov
> > >> Brigham Rad
> > >>
> > >>
> > >> On Thursday, January 30, 2014 5:17:43 PM UTC-5, vijay garla wrote:
> > >>
> > >>> Hi Vlad,
> > >>>
> >

RE: YTEX cTAKES 3.1.1 ready

2014-02-06 Thread Chen, Pei
VJ,
Aside from the changes to the existing cTAKES code (sentence detector, etc.) 
[which we could leave out if it's still being debated], 
Do you think it's worth migrating the ytex code to trunk at this point?  As you 
mentioned earlier, it's largely complementary.
[I was just thinking of saving effort to maintain the separate branch and for 
simplicity for dev...]

--Pei

> -Original Message-
> From: vijay garla [mailto:vnga...@gmail.com]
> Sent: Wednesday, February 05, 2014 9:30 PM
> To: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org;
> vlad.valtchi...@gmail.com
> Subject: Re: YTEX cTAKES 3.1.1 ready
> 
> Hi Vlad,
> 
> I Updated the umls install guide; see
> https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1
> 
> I would prefer to add the docs in the ctakes confluence, but as far as I can
> tell, I don't have write access there - can somebody give me write privileges
> on the ctakes confluence site?
> 
> There was a bug in the umls install; copy
> https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml over
> the corresponding file in your ctakes-3.1.2 install
> (CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set.  The
> import is currently running on the UMLS 2013AA (I assume this will complete
> without issues as long as the umls schema hasn't changed from 2012).
> 
> what trial and error did you have to go through to build the distro?
> 
> -vj
> 
> 
> On Wed, Feb 5, 2014 at 5:33 PM, vijay garla  wrote:
> 
> > Hi Vlad,
> >
> > sorry that the instructions aren't clear.
> >
> > re 1) What I am trying to say is install apache-ctakes-3.2.0-snapshot
> > as usual (this is unchanged from 3.1.1).  After that you still have to
> > apply the lib and resources (these are things that cannot be
> > distributed via apache).
> >
> > re 2) Yes, I need to update those docs.  Hopefully will get to that at
> > some point.  However, I assume you already have a UMLS DB (also assume
> > SQL Server).  If you can't/don't want to use your existing umls DB,
> > please tell me.  Then I'll prioritize upgrading the doc on importing the
> > umls tables (the scripts are there).
> >
> > best,
> >
> > VJ
> >
> >
> > On Wed, Feb 5, 2014 at 4:44 PM,  wrote:
> >
> >> Hi VJ-
> >>
> > >> so, with trial and error we were able to make the distribution and now
> >> have the apache-ctakes-3.1.2-SNAPSHOT-bin.zip archive.
> >>
> >> Here's what's unclear.
> >>
> >> 1. Is now this the only (combined) thing that you need for ctakes
> >> 3.1.1 + Ytex?
> > >> the current documentation
> > >> (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
> >> which most probably is outdated, talks about installing cTakes 3.1.1
> >> first and then applying 2 SNAPSHOT archives (downloadable) , lib and
> >> resources.
> >> This is a confusion point.
> >>
> >> 2. The directions to import UMLS subset are then outdated as well.
> >> Maybe one should use the old version (ctakes 2.5 and ytex 0.8) to
> >> import the RRF files for the UMLS subset and then just use the
> >> resulting db. Thoughts?
> >>
> >> Thanks,
> >> Vlad Valtchinov
> >> Brigham Rad
> >>
> >>
> >> On Thursday, January 30, 2014 5:17:43 PM UTC-5, vijay garla wrote:
> >>
> >>> Hi Vlad,
> >>>
> >>>
> > >>> All of ytex has been moved into ctakes, it is currently in a branch (
> >>> https://svn.apache.org/repos/asf/ctakes/branches/ytex).  You don't
> >>> have to install ytex-0.8 - instead you will have to build and
> >>> install from the ytex branch to create your own distribution.  Steps 2 & 3
> are correct.
> >>>
> >>> Although it is a pain, if you have the jdk, maven, and svn, you can
> >>> easily build your own distro:
> >>> * open a command prompt
> >>> * make sure jdk, maven, and svn are in your path
> >>> * cd to some directory where you want to check stuff out (I like
> >>> c:\temp)
> >>> * run the following commands
> >>> rmdir /s /q ctakes
> >>> svn co https://svn.apache.org/repos/asf/ctakes/branches/ytex ctakes
> > >>> cd ctakes
> > >>> mvn clean install -DskipTests
> >>>
> >>> And you will have the ctakes (with ytex) distro in
> >>> ctakes\ctakes-distribution\target\apache-ctakes-3.1.2-SN

RE: YTEX cTAKES 3.1.1 ready

2014-02-05 Thread Masanz, James J.
Hi Vijay,
I gave you write access to the cTAKES space on the confluence site

-- James

-Original Message-
From: vijay garla [mailto:vnga...@gmail.com] 
Sent: Wednesday, February 05, 2014 8:29 PM
To: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org; 
vlad.valtchi...@gmail.com
Subject: Re: YTEX cTAKES 3.1.1 ready

Hi Vlad,

I Updated the umls install guide; see
https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1

I would prefer to add the docs in the ctakes confluence, but as far as I
can tell, I don't have write access there - can somebody give me write
privileges on the ctakes confluence site?

There was a bug in the umls install; copy
https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml over
the corresponding file in your ctakes-3.1.2 install
(CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set.  The
import is currently running on the UMLS 2013AA (I assume this will complete
without issues as long as the umls schema hasn't changed from 2012).

what trial and error did you have to go through to build the distro?

-vj


On Wed, Feb 5, 2014 at 5:33 PM, vijay garla  wrote:

> Hi Vlad,
>
> sorry that the instructions aren't clear.
>
> re 1) What I am trying to say is install apache-ctakes-3.2.0-snapshot as
> usual (this is unchanged from 3.1.1).  After that you still have to apply
> the lib and resources (these are things that cannot be distributed via
> apache).
>
> re 2) Yes, I need to update those docs.  Hopefully will get to that at
> some point.  However, I assume you already have a UMLS DB (also assume SQL
> Server).  If you can't/don't want to use your existing umls DB, please tell
> me.  Then I'll prioritize upgrading the doc on importing the umls tables (the
> scripts are there).
>
> best,
>
> VJ
>
>
> On Wed, Feb 5, 2014 at 4:44 PM,  wrote:
>
>> Hi VJ-
>>
>> so, with trial and error we were able to make the distribution and now have
>> the apache-ctakes-3.1.2-SNAPSHOT-bin.zip archive.
>>
>> Here's what's unclear.
>>
>> 1. Is now this the only (combined) thing that you need for ctakes 3.1.1 +
>> Ytex?
>> the current documentation
>> (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
>> which most probably is outdated, talks about installing cTakes 3.1.1
>> first and then applying 2 SNAPSHOT archives (downloadable) , lib and
>> resources.
>> This is a confusion point.
>>
>> 2. The directions to import UMLS subset are then outdated as well. Maybe
>> one should use the old version (ctakes 2.5 and ytex 0.8) to
>> import the RRF files for the UMLS subset and then just use the resulting
>> db. Thoughts?
>>
>> Thanks,
>> Vlad Valtchinov
>> Brigham Rad
>>
>>
>> On Thursday, January 30, 2014 5:17:43 PM UTC-5, vijay garla wrote:
>>
>>> Hi Vlad,
>>>
>>>
>>> All of ytex has been moved into ctakes, it is currently in a branch (
>>> https://svn.apache.org/repos/asf/ctakes/branches/ytex).  You don't have
>>> to install ytex-0.8 - instead you will have to build and install from the
>>> ytex branch to create your own distribution.  Steps 2 & 3 are correct.
>>>
>>> Although it is a pain, if you have the jdk, maven, and svn, you can
>>> easily build your own distro:
>>> * open a command prompt
>>> * make sure jdk, maven, and svn are in your path
>>> * cd to some directory where you want to check stuff out (I like c:\temp)
>>> * run the following commands
>>> rmdir /s /q ctakes
>>> svn co https://svn.apache.org/repos/asf/ctakes/branches/ytex ctakes
>>> cd ctakes
>>> mvn clean install -DskipTests
>>>
>>> And you will have the ctakes (with ytex) distro in
>>> ctakes\ctakes-distribution\target\apache-ctakes-3.1.2-SNAPSHOT-bin.zip
>>>
>>> What is the process for getting the ytex branch merged into trunk?  As I
>>> mentioned, there are very few changes to other ctakes classes/types - this
>>> should be completely complementary and not affect any existing ctakes
>>> functionality.
>>>
>>> -vj
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Jan 30, 2014 at 4:56 PM,  wrote:
>>>
>>>> Hi VJ--
>>>>
>>>> this is great!! Thanks for all the hard work on it!
>>>>
>>>> We're starting to look into the new install. For now we're trying the
>>>> binaries out.
>>>>
>>>> There were these questions about the proper i

Re: YTEX cTAKES 3.1.1 ready

2014-02-05 Thread vijay garla
Hi Vlad,

I updated the umls install guide; see
https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1

I would prefer to add the docs in the ctakes confluence, but as far as I
can tell, I don't have write access there - can somebody give me write
privileges on the ctakes confluence site?

There was a bug in the umls install; copy
https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml over
the corresponding file in your ctakes-3.1.2 install
(CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set.  The
import is currently running on the UMLS 2013AA (I assume this will complete
without issues as long as the umls schema hasn't changed from 2012).

what trial and error did you have to go through to build the distro?

-vj


On Wed, Feb 5, 2014 at 5:33 PM, vijay garla  wrote:

> Hi Vlad,
>
> sorry that the instructions aren't clear.
>
> re 1) What I am trying to say is install apache-ctakes-3.2.0-snapshot as
> usual (this is unchanged from 3.1.1).  After that you still have to apply
> the lib and resources (these are things that cannot be distributed via
> apache).
>
> re 2) Yes, I need to update those docs.  Hopefully will get to that at
> some point.  However, I assume you already have a UMLS DB (also assume SQL
> Server).  If you can't/don't want to use your existing umls DB, please tell
> me.  Then I'll prioritize upgrading the doc on importing the umls tables (the
> scripts are there).
>
> best,
>
> VJ
>
>
> On Wed, Feb 5, 2014 at 4:44 PM,  wrote:
>
>> Hi VJ-
>>
>> so, with trial and error we were able to make the distribution and now have
>> the apache-ctakes-3.1.2-SNAPSHOT-bin.zip archive.
>>
>> Here's what's unclear.
>>
>> 1. Is now this the only (combined) thing that you need for ctakes 3.1.1 +
>> Ytex?
>> the current documentation
>> (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
>> which most probably is outdated, talks about installing cTakes 3.1.1
>> first and then applying 2 SNAPSHOT archives (downloadable) , lib and
>> resources.
>> This is a confusion point.
>>
>> 2. The directions to import UMLS subset are then outdated as well. Maybe
>> one should use the old version (ctakes 2.5 and ytex 0.8) to
>> import the RRF files for the UMLS subset and then just use the resulting
>> db. Thoughts?
>>
>> Thanks,
>> Vlad Valtchinov
>> Brigham Rad
>>
>>
>> On Thursday, January 30, 2014 5:17:43 PM UTC-5, vijay garla wrote:
>>
>>> Hi Vlad,
>>>
>>>
>>> All of ytex has been moved into ctakes, it is currently in a branch (
>>> https://svn.apache.org/repos/asf/ctakes/branches/ytex).  You don't have
>>> to install ytex-0.8 - instead you will have to build and install from the
>>> ytex branch to create your own distribution.  Steps 2 & 3 are correct.
>>>
>>> Although it is a pain, if you have the jdk, maven, and svn, you can
>>> easily build your own distro:
>>> * open a command prompt
>>> * make sure jdk, maven, and svn are in your path
>>> * cd to some directory where you want to check stuff out (I like c:\temp)
>>> * run the following commands
>>> rmdir /s /q ctakes
>>> svn co https://svn.apache.org/repos/asf/ctakes/branches/ytex ctakes
>>> cd ctakes
>>> mvn clean install -DskipTests
>>>
>>> And you will have the ctakes (with ytex) distro in
>>> ctakes\ctakes-distribution\target\apache-ctakes-3.1.2-SNAPSHOT-bin.zip
>>>
>>> What is the process for getting the ytex branch merged into trunk?  As I
>>> mentioned, there are very few changes to other ctakes classes/types - this
>>> should be completely complementary and not affect any existing ctakes
>>> functionality.
>>>
>>> -vj
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Jan 30, 2014 at 4:56 PM,  wrote:
>>>
>>>> Hi VJ--
>>>>
>>>> this is great!! Thanks for all the hard work on it!
>>>>
>>>> We're starting to look into the new install. For now we're trying the
>>>> binaries out.
>>>>
>>>> There were these questions about the proper install steps:
>>>>
>>>> 1. Do we first install ytex-0.8
>>>> 2. Then install the new cTakes 3.1.1 instance and also apply the
>>>> SNAPSHOT lib and resources zips
>>>> 3. Work our way to install the UMLS ontologies in the db
>>>>
>>>> Its is not entirely clear from the new document (
>>>> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
>>>> if there's still need to install ytex-0.8, or YTEX has been entirely
>>>> merged into cTakes?
>>>>
>>>> If the last statement is correct, there are missing parts in i.e the
>>>> UMLS install steps that are linked from the new ctakes 3.1.1 document.
>>>>
>>>> Thanks,
>>>> vlad
>>>>
>>>>
>>>> On Friday, January 3, 2014 10:21:52 PM UTC-5, vijay garla wrote:
>>>>>
>>>>> Hello All,
>>>>>
>>>>> I have finished an initial cut at the port of YTEX to cTAKES 3.1.1.
>>>>> Most of the YTEX functionality has been ported and integrated with
>>>>> cTAKES, and I've tested with MySQL and MS SQL Server (oracle tests
>>>>> pending).
>>>>>
>>>>> Most of the chang

Re: YTEX cTAKES 3.1.1 ready

2014-02-05 Thread vijay garla
Hi Vlad,

Sorry that the instructions aren't clear.

re 1) What I am trying to say is install apache-ctakes-3.2.0-snapshot as
usual (this is unchanged from 3.1.1).  After that you still have to apply
the lib and resources (these are things that cannot be distributed via
apache).

re 2) Yes, I need to update those docs.  Hopefully will get to that at some
point.  However, I assume you already have a UMLS DB (also assume SQL
Server).  If you can't/don't want to use your existing umls DB, please tell
me.  Then I'll prioritize upgrading the doc on importing the umls tables (the
scripts are there).

best,

VJ


On Wed, Feb 5, 2014 at 4:44 PM,  wrote:

> Hi VJ-
>
> so, with trial and error we were able to make the distribution and now have
> the apache-ctakes-3.1.2-SNAPSHOT-bin.zip archive.
>
> Here's what's unclear.
>
> 1. Is now this the only (combined) thing that you need for ctakes 3.1.1 +
> Ytex?
> the current documentation
> (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
> which most probably is outdated, talks about installing cTakes 3.1.1 first
> and then applying 2 SNAPSHOT archives (downloadable) , lib and resources.
> This is a confusion point.
>
> 2. The directions to import UMLS subset are then outdated as well. Maybe
> one should use the old version (ctakes 2.5 and ytex 0.8) to
> import the RRF files for the UMLS subset and then just use the resulting
> db. Thoughts?
>
> Thanks,
> Vlad Valtchinov
> Brigham Rad
>
>
> On Thursday, January 30, 2014 5:17:43 PM UTC-5, vijay garla wrote:
>
>> Hi Vlad,
>>
>>
>> All of ytex has been moved into ctakes, it is currently in a branch (
>> https://svn.apache.org/repos/asf/ctakes/branches/ytex).  You don't have
>> to install ytex-0.8 - instead you will have to build and install from the
>> ytex branch to create your own distribution.  Steps 2 & 3 are correct.
>>
>> Although it is a pain, if you have the jdk, maven, and svn, you can
>> easily build your own distro:
>> * open a command prompt
>> * make sure jdk, maven, and svn are in your path
>> * cd to some directory where you want to check stuff out (I like c:\temp)
>> * run the following commands
>> rmdir /s /q ctakes
>> svn co https://svn.apache.org/repos/asf/ctakes/branches/ytex ctakes
>> cd ctakes
>> mvn clean install -DskipTests
>>
>> And you will have the ctakes (with ytex) distro in
>> ctakes\ctakes-distribution\target\apache-ctakes-3.1.2-SNAPSHOT-bin.zip
>>
>> What is the process for getting the ytex branch merged into trunk?  As I
>> mentioned, there are very few changes to other ctakes classes/types - this
>> should be completely complementary and not affect any existing ctakes
>> functionality.
>>
>> -vj
>>
>>
>>
>>
>>
>>
>> On Thu, Jan 30, 2014 at 4:56 PM,  wrote:
>>
>>> Hi VJ--
>>>
>>> this is great!! Thanks for all the hard work on it!
>>>
>>> We're starting to look into the new install. For now we're trying the
>>> binaries out.
>>>
>>> There were these questions about the proper install steps:
>>>
>>> 1. Do we first install ytex-0.8
>>> 2. Then install the new cTakes 3.1.1 instance and also apply the
>>> SNAPSHOT lib and resources zips
>>> 3. Work our way to install the UMLS ontologies in the db
>>>
>>> Its is not entirely clear from the new document (
>>> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
>>> if there's still need to install ytex-0.8, or YTEX has been entirely
>>> merged into cTakes?
>>>
>>> If the last statement is correct, there are missing parts in i.e the
>>> UMLS install steps that are linked from the new ctakes 3.1.1 document.
>>>
>>> Thanks,
>>> vlad
>>>
>>>
>>> On Friday, January 3, 2014 10:21:52 PM UTC-5, vijay garla wrote:

>>>> Hello All,
>>>>
>>>> I have finished an initial cut at the port of YTEX to cTAKES 3.1.1.
>>>> Most of the YTEX functionality has been ported and integrated with cTAKES,
>>>> and I've tested with MySQL and MS SQL Server (oracle tests pending).
>>>>
>>>> Most of the changes were made in new projects - very little existing
>>>> cTAKES code has been modified.  The only non-trivial changes are in
>>>> /ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api
>>>> - here I modified CharacterOffsetToLineTokenConverterCtakesImpl &
>>>> SingleDocumentProcessorCtakes to deal with newlines within sentences
>>>> correctly.  Can somebody take a look at the changes in the ytex branch?
>>>>
>>>> I believe that the branch
>>>> https://svn.apache.org/repos/asf/ctakes/branches/ytex is ready to be
>>>> merged into ctakes trunk, but would like other users to test it as well.
>>>> Questions:
>>>>
>>>> * How can I distribute the ctakes binary distribution to ytex users
>>>> before the merge? Can we make the branch build available somewhere?  The
>>>> binary distribution is too large to host on the ytex google code site (max
>>>> 200 MB)
>>>> * Non-ASF libraries - I have segregated these out into their own zip
>>>> file t

Re: YTEX cTAKES 3.1.1 ready

2014-01-30 Thread vijay garla
Hi Vlad,

All of ytex has been moved into ctakes, it is currently in a branch (
https://svn.apache.org/repos/asf/ctakes/branches/ytex).  You don't have to
install ytex-0.8 - instead you will have to build and install from the ytex
branch to create your own distribution.  Steps 2 & 3 are correct.

Although it is a pain, if you have the jdk, maven, and svn, you can easily
build your own distro:
* open a command prompt
* make sure jdk, maven, and svn are in your path
* cd to some directory where you want to check stuff out (I like c:\temp)
* run the following commands
rmdir /s /q ctakes
svn co https://svn.apache.org/repos/asf/ctakes/branches/ytex ctakes
cd ctakes
mvn clean install -DskipTests

And you will have the ctakes (with ytex) distro in
ctakes\ctakes-distribution\target\apache-ctakes-3.1.2-SNAPSHOT-bin.zip
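
Before running the commands above, it can help to confirm that the three
tools really are on the PATH. The following is a small Python sketch of such
a pre-flight check, not something shipped with ctakes or ytex. The function
name missing_tools is hypothetical, and the sketch assumes the executables
are named javac, mvn, and svn.

```python
import shutil


def missing_tools(tools=("javac", "mvn", "svn"), which=shutil.which):
    """Return the tools from `tools` that cannot be found on the PATH.

    `which` defaults to shutil.which and can be swapped out for testing.
    """
    return [tool for tool in tools if which(tool) is None]


# Example use before starting the build:
# missing = missing_tools()
# if missing:
#     print("Not on PATH: " + ", ".join(missing))
```

An empty list means all three were found; anything else names what still
needs to be added to the PATH.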

What is the process for getting the ytex branch merged into trunk?  As I
mentioned, there are very few changes to other ctakes classes/types - this
should be completely complementary and not affect any existing ctakes
functionality.

-vj






On Thu, Jan 30, 2014 at 4:56 PM,  wrote:

> Hi VJ--
>
> this is great!! Thanks for all the hard work on it!
>
> We're starting to look into the new install. For now we're trying the
> binaries out.
>
> There were these questions about the proper install steps:
>
> 1. Do we first install ytex-0.8
> 2. Then install the new cTakes 3.1.1 instance and also apply the SNAPSHOT
> lib and resources zips
> 3. Work our way to install the UMLS ontologies in the db
>
> Its is not entirely clear from the new document (
> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
> if there's still need to install ytex-0.8, or YTEX has been entirely
> merged into cTakes?
>
> If the last statement is correct, there are missing parts in i.e the UMLS
> install steps that are linked from the new ctakes 3.1.1 document.
>
> Thanks,
> vlad
>
>
> On Friday, January 3, 2014 10:21:52 PM UTC-5, vijay garla wrote:
>>
>> Hello All,
>>
>> I have finished an initial cut at the port of YTEX to cTAKES 3.1.1.  Most
>> of the YTEX functionality has been ported and integrated with cTAKES, and
>> I've tested with MySQL and MS SQL Server (oracle tests pending).
>>
>> Most of the changes were made in new projects - very little existing
>> cTAKES code has been modified.  The only non-trivial changes are
>> in 
>> /ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api
>> - here I modified CharacterOffsetToLineTokenConverterCtakesImpl &
>> SingleDocumentProcessorCtakes to deal with newlines within sentences
>> correctly.  Can somebody take a look at the changes in the ytex branch?
>>
>> I believe that the branch
>> https://svn.apache.org/repos/asf/ctakes/branches/ytex is ready to be
>> merged into ctakes trunk, but would like other users to test it as well.
>> Questions:
>>
>> * How can I distribute the ctakes binary distribution to ytex users
>> before the merge? Can we make the branch build available somewhere?  The
>> binary distribution is too large to host on the ytex google code site (max
>> 200 MB)
>> * Non-ASF libraries - I have segregated these out into their own zip file
>> that can be distributed via sourceforge.  As a stopgap, I can upload this
>> to the ytex google code site, but would prefer to upload to sourceforge.
>> * UMLS Derivatives - Ditto for these - would like to move to sourceforge.
>> * Documentation - How can I update the confluence docs?  I would migrate
>> the documentation from the google code website.
>>
>> Here the installation instructions (putting the wagon in front of the
>> horse ...)
>>
>> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1
>>
>> Best,
>>
>> VJ
>>
>>


Re: YTEX cTAKES 3.1.1 ready

2014-01-30 Thread vlad . valtchinov
Hi VJ--

this is great!! Thanks for all the hard work on it!

We're starting to look into the new install. For now we're trying the 
binaries out.

There were these questions about the proper install steps:

1. Do we first install ytex-0.8?
2. Then install the new cTakes 3.1.1 instance and also apply the SNAPSHOT
lib and resources zips?
3. Work our way to install the UMLS ontologies in the db?

It is not entirely clear from the new document
(https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
whether there is still a need to install ytex-0.8, or whether YTEX has been
entirely merged into cTakes.

If the latter is correct, there are missing parts, i.e. in the UMLS
install steps that are linked from the new ctakes 3.1.1 document.

Thanks,
vlad

On Friday, January 3, 2014 10:21:52 PM UTC-5, vijay garla wrote:
>
> Hello All,
>
> I have finished an initial cut at the port of YTEX to cTAKES 3.1.1.  Most 
> of the YTEX functionality has been ported and integrated with cTAKES, and 
> I've tested with MySQL and MS SQL Server (oracle tests pending).
>
> Most of the changes were made in new projects - very little existing 
> cTAKES code has been modified.  The only non-trivial changes are 
> in 
> /ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api 
> - here I modified CharacterOffsetToLineTokenConverterCtakesImpl & 
> SingleDocumentProcessorCtakes to deal with newlines within sentences 
> correctly.  Can somebody take a look at the changes in the ytex branch?
>
> I believe that the branch 
> https://svn.apache.org/repos/asf/ctakes/branches/ytex is ready to be 
> merged into ctakes trunk, but would like other users to test it as well. 
>  Questions:
>
> * How can I distribute the ctakes binary distribution to ytex users before 
> the merge? Can we make the branch build available somewhere?  The binary 
> distribution is too large to host on the ytex google code site (max 200 MB)
> * Non-ASF libraries - I have segregated these out into their own zip file 
> that can be distributed via sourceforge.  As a stopgap, I can upload this 
> to the ytex google code site, but would prefer to upload to sourceforge.
> * UMLS Derivatives - Ditto for these - would like to move to sourceforge.
> * Documentation - How can I update the confluence docs?  I would migrate 
> the documentation from the google code website.
>
> Here the installation instructions (putting the wagon in front of the 
> horse ...)
>
>
> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1
>
> Best,
>
> VJ
>
>
>

Re: YTEX cTAKES 3.1.1 ready

2014-01-08 Thread vijay garla
Hi Bhaskar,

Thanks for working on this!

I am not sure what is going wrong, but can you try doing a clean checkout
and a "mvn clean install" from the command line?  When I do this, all tests
in all projects pass, no changes necessary.  I have never had the eclipse
maven m2e plugin work flawlessly (congratulations to those of you who do);
if the command line mvn clean install works, then I need to figure out why
the build doesn't work from eclipse.  I am using the 64-bit eclipse kepler,
jdk 1.7, and maven 3.1.0; for me none of the projects with jcasgen plugins
compile from eclipse.

Regarding the class not found exceptions: these classes are in
CTAKES_HOME/lib/ctakes-core-3.1.2-SNAPSHOT.jar - can you make sure the
classes are there?  If not, something went wrong in the build of
ctakes-core (again please verify that this works when you run maven
from the command line).
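
One way to check the jar contents without extracting it is a short Python
script, sketched below. It is only an illustration: the function name
classes_in_package is hypothetical. A jar is a zip archive, so the standard
zipfile module can list its entries. The package names to look for are the
ones from the error output, e.g. org.apache.ctakes.core.fsm.machine.

```python
import zipfile


def classes_in_package(jar_path, package):
    """List the .class entries in jar_path that sit directly in `package`.

    `package` is a dotted Java package name, for example
    "org.apache.ctakes.core.fsm.machine". Sub-packages are not included.
    """
    prefix = package.replace(".", "/") + "/"
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name
            for name in jar.namelist()
            if name.startswith(prefix)
            and name.endswith(".class")
            and "/" not in name[len(prefix):]
        )


# Example call (the jar path is an assumption; adjust to your CTAKES_HOME):
# classes_in_package(
#     r"c:\java\apache-ctakes-3.1.2-SNAPSHOT\lib\ctakes-core-3.1.2-SNAPSHOT.jar",
#     "org.apache.ctakes.core.fsm.machine",
# )
```

An empty result for a package named in the error output would suggest that
the ctakes-core build did not produce those classes.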

I run this batch script which does a checkout, install, ytex setup, and
runs the ytex CPE in a single go:

@REM c:\java\setenv.bat - puts java, maven, and svn in the PATH
@REM c:\temp\ctakes-build - where ctakes gets checked out
@REM c:\temp - where I downloaded the ctakes resources, ytex resources & lib files
@REM c:\java\apache-ctakes-3.1.2-SNAPSHOT - where ctakes gets installed

call c:\java\setenv.bat
cd C:\temp\ctakes-build
rmdir /s /q ctakes
svn co https://svn.apache.org/repos/asf/ctakes/branches/ytex ctakes
cd ctakes
@rem need to unset ctakes home
set CTAKES_HOME=
call mvn clean install
cd c:\java
rmdir /s /q apache-ctakes-3.1.2-SNAPSHOT
jar xf C:\temp\ctakes-build\ctakes\ctakes-distribution\target\apache-ctakes-3.1.2-SNAPSHOT-bin.zip
cd apache-ctakes-3.1.2-SNAPSHOT
jar xf c:\temp\ctakes-resources-3.1.0.zip
jar xf c:\temp\ctakes-ytex-resources-3.1.2-SNAPSHOT.zip
jar xf c:\temp\ctakes-ytex-lib-3.1.2-SNAPSHOT.zip
@rem stop here if you don't need to do a ytex setup
@rem adjust this to match your environment - to use a different DB, copy a different ytex.properties file
copy resources\org\apache\ctakes\ytex\ytex.properties.mssql.example resources\org\apache\ctakes\ytex\ytex.properties
cd bin\ctakes-ytex\scripts
call ..\..\ant.bat -f build-setup.xml all > setup.out 2>&1
cd ..\..\..
call bin\setenv.bat
java -cp "%CLASSPATH%" -Dlog4j.configuration=file:/%CTAKES_HOME%/config/log4j.xml -Xms512M -Xmx2g org.apache.ctakes.ytex.tools.RunCPE desc\ctakes-ytex-uima\desc\cpe\fracture_demo.xml

Best,

Vijay



On Wed, Jan 8, 2014 at 7:41 AM, Bhaskar B  wrote:

> Hi Vijay,
>
> Thank you for this update.  In order to evaluate the YTEX port into
> cTAKES, I wanted to do the following (goals):
>
> (a) Compile the ctakes/branches/ytex to create
> apache-ctakes-3.1.2-SNAPSHOT.
> (b) Validate that the just compiled binary works by running either the
> AggregatePlaintextProcessor.xml or AggregatePlaintextUMLSProcessor.xml
> pipelines (basically following instructions at
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1+User+Install+Guide
> ).
> (c) Follow instructions at
> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1 to explore
> YTEX specific features, e.g. pipelines, writing annotations to database,
> etc.
>
> So I took the following steps:
>
> 1) Followed the instructions at
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1+Developer+Install+Guide
> and
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
> to successfully (i) configure Eclipse (Juno) environment for building
> cTAKES, and (ii) pull the source code from the SVN repository branch:
> https://svn.apache.org/repos/asf/ctakes/branches/ytex.
>
> 2) In Eclipse, right clicked on the top-most (or root level) pom.xml,
> selected Run As -> Maven build, typed in "compile" as the goal, and hit
> Run.  This successfully compiled all the projects.
>
> 3) In Eclipse, repeated step (2) but selected Run As -> Maven install to
> create a distribution.  However this is where I started to encounter few
> problems.  I was eventually able to get Maven install to complete and
> create the binaries (i.e. in ctakes-distribution/target/) but by manually
> doing the following:
>
> 3.1) in pom.xml of ctakes-ytex-uima: excluded all tests
> 3.2) in pom.xml of ctakes-core: excluded 2 tests
> 3.3) in ctakes-ytex: modified scripts/build-classpath.xml and
> scripts/build-setup.xml to hardcode path to ANT library
> 3.4) in ctakes-dependency-parser: excluded 1 unit test
>
> 4) After this step, I extracted apache-ctakes-3.1.2-SNAPSHOT-bin.zip and
> attempted to verify (i.e. step (b)) above.  However when I attempted to
> load AggregatePlaintextProcessor, I am getting exception (below).
>
> While I continue to look to resolve this, any tips/hints that you could
> provide to get this build functional would be highly appreciated.  I think
> I may be missing one or more key steps/configuration.  My workstation is
> Windows 7 and I use Eclipse Juno.
>
> -
> java.lang.Error: Unresolved compilation problems:
> The 

Re: YTEX cTAKES 3.1.1 ready

2014-01-08 Thread Bhaskar B
Hi Vijay,

Thank you for this update.  In order to evaluate the YTEX port into cTAKES, 
I wanted to do the following (goals):

(a) Compile the ctakes/branches/ytex to create apache-ctakes-3.1.2-SNAPSHOT.
(b) Validate that the just compiled binary works by running either the 
AggregatePlaintextProcessor.xml or AggregatePlaintextUMLSProcessor.xml 
pipelines (basically following instructions at 
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1+User+Install+Guide).
(c) Follow instructions at 
https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1 to explore YTEX 
specific features, e.g. pipelines, writing annotations to database, etc.

So I took the following steps:

1) Followed the instructions at 
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1+Developer+Install+Guide
and
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
to successfully (i) configure Eclipse (Juno) environment for building
cTAKES, and (ii) pull the source code from the SVN repository branch: 
https://svn.apache.org/repos/asf/ctakes/branches/ytex.

2) In Eclipse, right clicked on the top-most (or root level) pom.xml, 
selected Run As -> Maven build, typed in "compile" as the goal, and hit 
Run.  This successfully compiled all the projects.

3) In Eclipse, repeated step (2) but selected Run As -> Maven install to
create a distribution.  However, this is where I started to encounter a few
problems.  I was eventually able to get Maven install to complete and
create the binaries (i.e. in ctakes-distribution/target/), but only by
manually doing the following:

3.1) in pom.xml of ctakes-ytex-uima: excluded all tests
3.2) in pom.xml of ctakes-core: excluded 2 tests
3.3) in ctakes-ytex: modified scripts/build-classpath.xml and 
scripts/build-setup.xml to hardcode path to ANT library
3.4) in ctakes-dependency-parser: excluded 1 unit test

4) After this step, I extracted apache-ctakes-3.1.2-SNAPSHOT-bin.zip and
attempted to verify it (i.e. goal (b) above).  However, when I attempted to
load AggregatePlaintextProcessor, I got the exception below.

While I continue to look to resolve this, any tips/hints that you could 
provide to get this build functional would be highly appreciated.  I think 
I may be missing one or more key steps/configuration.  My workstation is 
Windows 7 and I use Eclipse Juno.

-
java.lang.Error: Unresolved compilation problems:
The import org.apache.ctakes.core.fsm.machine cannot be resolved
The import org.apache.ctakes.core.fsm.machine cannot be resolved
The import org.apache.ctakes.core.fsm.machine cannot be resolved
The import org.apache.ctakes.core.fsm.machine cannot be resolved
The import org.apache.ctakes.core.fsm.machine cannot be resolved
The import org.apache.ctakes.core.fsm.machine cannot be resolved
The import org.apache.ctakes.core.fsm.machine cannot be resolved
The import org.apache.ctakes.core.fsm.output cannot be resolved
The import org.apache.ctakes.core.fsm.output cannot be resolved
The import org.apache.ctakes.core.fsm.output cannot be resolved
The import org.apache.ctakes.core.fsm.output cannot be resolved
The import org.apache.ctakes.core.fsm.output cannot be resolved
The import org.apache.ctakes.core.fsm.output cannot be resolved
The import org.apache.ctakes.core.fsm.output cannot be resolved
The import org.apache.ctakes.core.fsm.token.BaseToken cannot be resolved
The import org.apache.ctakes.core.fsm.token.EolToken cannot be resolved
DateFSM cannot be resolved to a type
TimeFSM cannot be resolved to a type
FractionFSM cannot be resolved to a type
RomanNumeralFSM cannot be resolved to a type
RangeFSM cannot be resolved to a type
MeasurementFSM cannot be resolved to a type
PersonTitleFSM cannot be resolved to a type
DateFSM cannot be resolved to a type
DateFSM cannot be resolved to a type
TimeFSM cannot be resolved to a type
TimeFSM cannot be resolved to a type
FractionFSM cannot be resolved to a type
FractionFSM cannot be resolved to a type
RomanNumeralFSM cannot be resolved to a type
RomanNumeralFSM cannot be resolved to a type
RangeFSM cannot be resolved to a type
RangeFSM cannot be resolved to a type
MeasurementFSM cannot be resolved to a type
MeasurementFSM cannot be resolved to a type
PersonTitleFSM cannot be resolved to a type
PersonTitleFSM cannot be resolved to a type
BaseToken cannot be resolved to a type
BaseToken cannot be resolved to a type
BaseToken cannot be resolved to a type
The method adaptToBaseToken(BaseToken) from the type ContextDependentTokenizerAnnotator refers to the missing type BaseToken
EolToken cannot be resolved to a type
BaseToken cannot be re

Re: YTEX cTAKES 3.1.1 ready

2014-01-07 Thread vijay garla
See answers inline.


On Tue, Jan 7, 2014 at 10:35 AM, Chen, Pei
wrote:

> > * How can I distribute the ctakes binary distribution to ytex users
> > before the merge? Can we make the branch build available somewhere?  The
> > binary distribution is too large to host on the ytex google code site
> > (max 200 MB)
> Is this for testing purposes?  Or official release? If it's just for
> testing, there will be more options...
> Are you referring to the convenience binary/zip file?  Or maven artifacts
> that could be deployed to the SNAPSHOTS repo [1]?
> If it's for testing, you can always have users build from source via mvn
> package (assuming you added the ytex* to the ctakes-distribution module)?
> Again if it's for testing, you can always try the svn or home dir.  But
> it's not the recommended channel for actual distribution to users because
> that normally has to go through the normal release process (Voting, etc.).
>

This is for testing.  Ytex has been added to the ctakes distro.


>
> > * Non-ASF libraries - I have segregated these out into their own zip
> > file that can be distributed via sourceforge.  As a stopgap, I can upload
> > this to the ytex google code site, but would prefer to upload to
> > sourceforge.
> Are these optional 3rd party libs available via maven central?
>

Most of them are.  The only exception is the MS SQL Driver, which is freely
redistributable (see http://msdn.microsoft.com/en-us/sqlserver/aa937725).
 I did not find anything similar for the oracle jdbc driver so I left that
out (users will have to download that separately).

The zip is here:
https://ytex.googlecode.com/files/ctakes-ytex-lib-3.1.2-SNAPSHOT.zip


>
> > * UMLS Derivatives - Ditto for these - would like to move to sourceforge.
> Are you planning to distribute them via maven central?  I think it would
> be nice to make these available as maven artifacts.
> If so, what is your sourceforge id? We can grant you access to the
> existing ctakes resources project [2]:
> The pom.xml is already setup to upload to OSS Sonatype (request a login
> for oss sonatype to perform a mvn deploy for the actual upload later on)...
>

I have placed the umls resources behind a server that requires UTS
authentication (note that this obviates the need for supplying umls
username and password in ctakes config files/scripts).

The umls resources are here:
http://www.ytex-nlp.org/umls.download/secure/3.1/ctakes-ytex-resources-3.1.2-SNAPSHOT.zip

This is a plain old apache http server with the module for CAS (the other
CAS) authentication.  If ctakes has an apache server somewhere, we could do
the same.


>
> > * Documentation - How can I update the confluence docs?  I would migrate
> > the documentation from the google code website.
> This would be great; You've been added to the cTAKES confluence space [3].
>
> Downloading the code now... To be continued...
>
> [1]
> https://repository.apache.org/content/groups/snapshots/org/apache/ctakes/
> [2] http://sourceforge.net/p/ctakesresources/code/HEAD/tree/trunk/
> [3] https://cwiki.apache.org/confluence/display/CTAKES/cTAKES
>
> > -Original Message-
> > From: vijay garla [mailto:vnga...@gmail.com]
> > Sent: Friday, January 03, 2014 10:23 PM
> > To: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org
> > Subject: YTEX cTAKES 3.1.1 ready
> >
> > Hello All,
> >
> > I have finished an initial cut at the port of YTEX to cTAKES 3.1.1.
>  Most of the
> > YTEX functionality has been ported and integrated with cTAKES, and I've
> > tested with MySQL and MS SQL Server (oracle tests pending).
> >
> > Most of the changes were made in new projects - very little existing
> cTAKES
> > code has been modified.  The only non-trivial changes are in /ctakes-
> > assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api
> > - here I modified CharacterOffsetToLineTokenConverterCtakesImpl &
> > SingleDocumentProcessorCtakes to deal with newlines within sentences
> > correctly.  Can somebody take a look at the changes in the ytex branch?
> >
> > I believe that the branch
> > https://svn.apache.org/repos/asf/ctakes/branches/ytex is ready to be
> > merged into ctakes trunk, but would like other users to test it as well.
> >  Questions:
> >
> > * How can I distribute the ctakes binary distribution to ytex users
> before the
> > merge? Can we make the branch build available somewhere?  The binary
> > distribution is too large to host on the ytex google code site (max 200
> MB)
> > * Non-ASF libraries - I have segregated these out into their own zip
> file that
> > can be distributed via sourceforge.  As a stopgap, I can upload this to
> the ytex
> > google code site, but would prefer to upload to sourceforge.
> > * UMLS Derivatives - Ditto for these - would like to move to sourceforge.
> > * Documentation - How can I update the confluence docs?  I would migrate
> > the documentation from the google code website.
> >
> > Here the installation instructions (putting the wagon in front of the
> hor

RE: YTEX cTAKES 3.1.1 ready

2014-01-07 Thread Chen, Pei
> * How can I distribute the ctakes binary distribution to ytex users before the
> merge? Can we make the branch build available somewhere?  The binary
> distribution is too large to host on the ytex google code site (max 200 MB)
Is this for testing purposes?  Or official release? If it's just for testing, 
there will be more options...
Are you referring to the convenience binary/zip file?  Or maven artifacts that 
could be deployed to the SNAPSHOTS repo [1]?
If it's for testing, you can always have users build from source via mvn 
package (assuming you added the ytex* to the ctakes-distribution module)?
Again if it's for testing, you can always try the svn or home dir.  But it's 
not the recommended channel for actual distribution to users because that 
normally has to go through the normal release process (Voting, etc.). 

> * Non-ASF libraries - I have segregated these out into their own zip file that
> can be distributed via sourceforge.  As a stopgap, I can upload this to the
> ytex google code site, but would prefer to upload to sourceforge.
Are these optional 3rd party libs available via maven central?

> * UMLS Derivatives - Ditto for these - would like to move to sourceforge.
Are you planning to distribute them via maven central?  I think it would be 
nice to make these available as maven artifacts.
If so, what is your sourceforge id? We can grant you access to the existing 
ctakes resources project [2]:
The pom.xml is already setup to upload to OSS Sonatype (request a login for oss 
sonatype to perform a mvn deploy for the actual upload later on)...

> * Documentation - How can I update the confluence docs?  I would migrate
> the documentation from the google code website.
This would be great; You've been added to the cTAKES confluence space [3].

Downloading the code now... To be continued...

[1] https://repository.apache.org/content/groups/snapshots/org/apache/ctakes/
[2] http://sourceforge.net/p/ctakesresources/code/HEAD/tree/trunk/
[3] https://cwiki.apache.org/confluence/display/CTAKES/cTAKES

> -Original Message-
> From: vijay garla [mailto:vnga...@gmail.com]
> Sent: Friday, January 03, 2014 10:23 PM
> To: ytex-us...@googlegroups.com; ctakes-...@incubator.apache.org
> Subject: YTEX cTAKES 3.1.1 ready
> 
> Hello All,
> 
> I have finished an initial cut at the port of YTEX to cTAKES 3.1.1.  Most of the
> YTEX functionality has been ported and integrated with cTAKES, and I've
> tested with MySQL and MS SQL Server (Oracle tests pending).
> 
> Most of the changes were made in new projects - very little existing cTAKES
> code has been modified.  The only non-trivial changes are in
> /ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api
> - here I modified CharacterOffsetToLineTokenConverterCtakesImpl &
> SingleDocumentProcessorCtakes to deal with newlines within sentences
> correctly.  Can somebody take a look at the changes in the ytex branch?
> 
> I believe that the branch
> https://svn.apache.org/repos/asf/ctakes/branches/ytex is ready to be
> merged into ctakes trunk, but would like other users to test it as well.
> Questions:
> 
> * How can I distribute the ctakes binary distribution to ytex users before the
> merge? Can we make the branch build available somewhere?  The binary
> distribution is too large to host on the ytex google code site (max 200 MB)
> * Non-ASF libraries - I have segregated these out into their own zip file that
> can be distributed via sourceforge.  As a stopgap, I can upload this to the ytex
> google code site, but would prefer to upload to sourceforge.
> * UMLS Derivatives - Ditto for these - would like to move to sourceforge.
> * Documentation - How can I update the confluence docs?  I would migrate
> the documentation from the google code website.
> 
> Here are the installation instructions (putting the wagon in front of the horse
> ...)
> 
> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1
> 
> Best,
> 
> VJ


Re: YTEX cTAKES 3.1.1 ready

2014-01-04 Thread Chen, Pei
This is awesome, VJ.
I'll take a look at it this week unless someone beats me to it.

> On Jan 3, 2014, at 10:22 PM, "vijay garla"  wrote:
> 
> Hello All,
> 
> I have finished an initial cut at the port of YTEX to cTAKES 3.1.1.  Most
> of the YTEX functionality has been ported and integrated with cTAKES, and
> I've tested with MySQL and MS SQL Server (Oracle tests pending).
> 
> Most of the changes were made in new projects - very little existing cTAKES
> code has been modified.  The only non-trivial changes are
> in 
> /ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api
> - here I modified CharacterOffsetToLineTokenConverterCtakesImpl &
> SingleDocumentProcessorCtakes to deal with newlines within sentences
> correctly.  Can somebody take a look at the changes in the ytex branch?
> 
> I believe that the branch
> https://svn.apache.org/repos/asf/ctakes/branches/ytex is ready to be merged
> into ctakes trunk, but would like other users to test it as well.
> Questions:
> 
> * How can I distribute the ctakes binary distribution to ytex users before
> the merge? Can we make the branch build available somewhere?  The binary
> distribution is too large to host on the ytex google code site (max 200 MB)
> * Non-ASF libraries - I have segregated these out into their own zip file
> that can be distributed via sourceforge.  As a stopgap, I can upload this
> to the ytex google code site, but would prefer to upload to sourceforge.
> * UMLS Derivatives - Ditto for these - would like to move to sourceforge.
> * Documentation - How can I update the confluence docs?  I would migrate
> the documentation from the google code website.
> 
> Here are the installation instructions (putting the wagon in front of the horse
> ...)
> 
> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1
> 
> Best,
> 
> VJ