RE: docs on running Clinical Document Pipeline from Java?

Chen, Pei Fri, 21 Jun 2013 10:56:57 -0700

Hi,
One can include the cTAKES code as a Maven dependency; however, there is a 
current limitation: in order to run the pipeline, it essentially needs the 
/desc and /resources directories unpacked somewhere on disk.  There is an 
effort to streamline the resource loading so that it will be easier to 
integrate the modules.
Until then, one will need to run "mvn package -DskipTests" to package 
everything into a single package under ctakes-distribution/target.
Maybe there are other ways...
--Pei
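
Since the pipeline needs /desc and /resources on disk, a caller could check for them before building any analysis engine. The following is only a minimal sketch using the plain JDK: the class name CtakesLayoutCheck and the idea of passing the unpacked root as an argument are hypothetical, and the exact layout under ctakes-distribution/target is an assumption to verify against your own build.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of cTAKES.
final class CtakesLayoutCheck {

    // Directory names taken from the message above; the layout is an assumption.
    static final String[] REQUIRED_DIRS = {"desc", "resources"};

    /** Returns the names of required directories that are missing under the root. */
    static List<String> missingDirs(Path unpackedRoot) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED_DIRS) {
            if (!Files.isDirectory(unpackedRoot.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Defaults to the current directory when no root is given.
        Path root = Paths.get(args.length > 0 ? args[0] : ".").toAbsolutePath();
        List<String> missing = missingDirs(root);
        if (missing.isEmpty()) {
            System.out.println("desc and resources found under " + root);
        } else {
            System.out.println("Missing under " + root + ": " + missing);
        }
    }
}
```

Running it with the unpacked directory as the only argument reports which of the two directories, if any, is absent, which is cheaper to diagnose than a failed descriptor import later on.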


> -----Original Message-----
> From: Sandy Ryza [mailto:[email protected]]
> Sent: Thursday, June 20, 2013 5:24 PM
> To: [email protected]
> Subject: Re: docs on running Clinical Document Pipeline from Java?
> 
> Thanks for the help!  Is there any advice on the best way to include ctakes as
> a dependency?  I've tried writing some code that points to
> AggregatePlaintextUMLSProcessor.xml, but it doesn't know where to find
> the other files that are referred to.  Is there any good way to package ctakes
> up and refer to it as a unit?  We want to be able to distribute something that
> relies on ctakes in a cluster.
> 
> (Here's the error I'm getting)
> Import failed.  Could not read from URL
> file:/home/sandy/ctakes-dependency-parser/desc/analysis_engine/ClearParserDependencyParserAE.xml.
> (Descriptor:
> file:/home/sandy/datascience/Mayo_cTAKES/mr/AggregatePlaintextUMLSProcessor.xml)
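
The two paths in that error are consistent with a relative import being resolved against the new location of the copied descriptor, since UIMA resolves an import "location" relative to the descriptor that contains it. The sketch below reproduces that resolution with java.net.URI only; the relative location string is an assumption inferred from the two paths in the error, not read from the actual descriptor, and the class name is hypothetical.

```java
import java.net.URI;

// Hypothetical demo class, not part of cTAKES or UIMA.
final class DescriptorImportDemo {

    /** Resolves a relative import location against the descriptor that contains it. */
    static URI resolveImport(URI descriptor, String location) {
        return descriptor.resolve(location);
    }

    public static void main(String[] args) {
        // Descriptor URL as reported in the error message.
        URI descriptor = URI.create(
                "file:/home/sandy/datascience/Mayo_cTAKES/mr/AggregatePlaintextUMLSProcessor.xml");
        // Assumed relative import, inferred from the failing URL in the error.
        String location =
                "../../../ctakes-dependency-parser/desc/analysis_engine/ClearParserDependencyParserAE.xml";
        // Prints the URL that the import would resolve to.
        System.out.println(resolveImport(descriptor, location));
    }
}
```

If that assumption holds, copying the aggregate descriptor on its own moves the base of every relative import, so the imported descriptors have to keep the same position relative to it.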
> 
> -Sandy
> 
> 
> On Wed, Jun 19, 2013 at 2:30 PM, Andy McMurry
> <[email protected]> wrote:
> 
> > Note: The WEKA gui reports the command line arguments for any GUI task.
> > It could be a very helpful timesaver if cTAKES had a similar feature.
> >
> > Otherwise, I fear we will be writing Main methods and docs for each
> > and every cTAKES task.
> > What do you all think?
> >
> > -------
> >
> > Real world example of how this works in Weka.
> > Say you wanted to run Adaboost on a C4.5 decision tree with cost
> > sensitive classification.
> > Weka reports the arguments, which I can re-run from command line
> >
> > Classifier csc = new CostSensitiveClassifier();
> >
> >         String[] adaboost = {
> >                 "-cost-matrix", costMatrix,
> >                 "-S", "1",
> >                 "-W", "weka.classifiers.meta.AdaBoostM1",
> >                 "--",
> >                 "-P", "100",
> >                 "-S", "1",
> >                 "-I", "30",
> >                 // base learner for AdaBoostM1, with its own options after "--"
> >                 "-W", "weka.classifiers.trees.J48",
> >                 "--",
> >                 "-C", String.valueOf(j48Confidence),
> >                 "-M", String.valueOf(j48MinObjects)
> >         };
> >
> > csc.setOptions(adaboost);
> >
> >
> > On Jun 19, 2013, at 5:20 PM, "Chen, Pei"
> > <[email protected]>
> > wrote:
> >
> > > Also,
> > > Tim recently just checked in a Main class that essentially could be the
> > > beginnings of a Driver program.
> > > Check the main() out at:
> > >
> > > http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-clinical-pipeline/src/main/java/org/apache/ctakes/clinicalpipeline/runtime/BagOfCUIsGenerator.java
> > >
> > > --Pei
> > >
> > >
> > >> -----Original Message-----
> > >> From: Girivaraprasad Nambari [mailto:[email protected]]
> > >> Sent: Wednesday, June 19, 2013 3:47 PM
> > >> To: [email protected]
> > >> Subject: Re: docs on running Clinical Document Pipeline from Java?
> > >>
> > >> Hi,
> > >>
> > >> Welcome to ctakes.
> > >>
> > >> There was a similar discussion initiated by me a few months ago (you may be
> > >> able to find it if you browse through old discussions). Here is the response
> > >> from Pei Chen & the ctakes community:
> > >>
> > >> It is not quite ready for prime time, but take a peek at the below (it uses
> > >> uimaFIT to do the above):
> > >>
> > >> http://svn.apache.org/repos/asf/ctakes/sandbox/ctakes-gui/src/main/java/org/chboston/cnlp/ctakes/gui/service/LauncherService.java
> > >>
> > >> Essentially, it boils down to a few lines of code:
> > >>
> > >> AnalysisEngine aggregateAE = AnalysisEngineFactory.createAggregate(
> > >>         engines, componentNames, typeSystemDescription, null,
> > >>         new SofaMapping[0]);
> > >>
> > >> JCas jcas = aggregateAE.newJCas();
> > >> jcas.setDocumentText(doc.getText());
> > >> aggregateAE.process(jcas);
> > >>
> > >>
> > >> We need to start with UIMA and uimaFIT to get a basic understanding;
> > >> after that, using the ctakes components will be easy.
> > >>
> > >> Good luck!
> > >>
> > >> Thank you,
> > >>
> > >> Giri
> > >>
> > >>
> > >> On Wed, Jun 19, 2013 at 3:17 PM, Sandy Ryza
> > >> <[email protected]>
> > >> wrote:
> > >>
> > >>> Hi cTAKES folks,
> > >>>
> > >>> I am trying to figure out how to run the Clinical Document
> > >>> Pipeline from Java.  All the documentation I have found so far has
> > >>> been about how to do this through a GUI.  Is there anything on how
> > >>> to run the pipeline programmatically?
> > >>>
> > >>> thanks for any help!
> > >>> Sandy
> > >>>
> >
> >
