Hi Sean,

Thank you very much for your kind and prompt reply. I used the build profile
"runPiperGui" and it worked beautifully on my custom piper file. I would
appreciate it if you could direct me to the next steps on other run methods.
As I mentioned earlier, my final goal is to develop an OCR+NLP web service.

Thanks,
Maral
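
For the web service goal, one possible next run method is to launch the
piper runner from Python. The following is only a minimal sketch under
stated assumptions: it reuses the java command from runPiperFile.sh as
quoted further down this thread, it assumes CTAKES_HOME points at an
installation structure (the one that has a lib/ folder), and the -p, -i and
-o runner options are assumptions to be checked against the Piper Files wiki
page. The function names build_piper_command and run_piper are hypothetical.

```python
import subprocess


def build_piper_command(ctakes_home, piper_file, input_dir, output_dir):
    """Build the java argument list for PiperFileRunner (Unix-style paths)."""
    home = ctakes_home.rstrip("/")
    # Same classpath entries as the runPiperFile.sh line quoted in this thread.
    classpath = ":".join([
        home + "/desc/",
        home + "/resources/",
        home + "/resources/resources",
        home + "/lib/*",
    ])
    return [
        "java",
        "-cp", classpath,
        "-Dlog4j.configuration=file:" + home + "/config/log4j.xml",
        "-Xms512M", "-Xmx3g",
        "org.apache.ctakes.core.pipeline.PiperFileRunner",
        # Assumed runner options: piper file, input directory, output directory.
        "-p", piper_file,
        "-i", input_dir,
        "-o", output_dir,
    ]


def run_piper(ctakes_home, piper_file, input_dir, output_dir):
    """Run the pipeline; raises CalledProcessError if java exits non-zero."""
    command = build_piper_command(ctakes_home, piper_file, input_dir, output_dir)
    subprocess.run(command, check=True)
```

Because the command is passed as a list without a shell, the lib/* entry
reaches java unexpanded and java resolves the classpath wildcard itself. A
web service handler could call run_piper once per uploaded batch and then
read the files from the output directory.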

On Fri, Jul 19, 2019 at 12:08 PM Finan, Sean <
sean.fi...@childrens.harvard.edu> wrote:

> Hi Maral,
>
> There are two slightly different directory structures.  One for
> development (source structure), another for end use (installation
> structure).
>
> Since you have a copy of the source, let's start with that.
>
> Are you using an IDE (Integrated Development Environment) such as IntelliJ
> or Eclipse?
> If so then you should be able to create a run profile that can run
> pipelines.  There is plenty of help online for that kind of thing, and
> people on the mailing list can probably provide examples of what they have
> used for ctakes.
>
> If you are not using an IDE, I suggest using the -PrunPiperGui maven
> profile that I mentioned below - just to run a test pipeline and see your
> output.  After you have a successful run then we can move on to other run
> methods.
>
> When you use an IDE run profile or a maven profile you don't need to
> specify $CTAKES_HOME or worry about the classpath or bin/.
>
> Sean
>
> ________________________________________
> From: Maral Amir <maraljav...@gmail.com>
> Sent: Friday, July 19, 2019 2:12 PM
> To: dev@ctakes.apache.org
> Subject: Re: cTAKES Pipeline [EXTERNAL]
>
> Hi Sean,
>
> Thank you so much for your insightful response.
> I'm having a problem linking the piper files. I should mention I am using
> command line interface. Could you please kindly let me know:
>
> 1. What should I set my CTAKES_HOME variable to? Right now I have set
> CTAKES_HOME to the main folder of my cTAKES user installation. That is because
> the last line of runPiperFile.sh puts $CTAKES_HOME/lib/* on the classpath,
> and there is no /lib folder in the developer's version.
>
> java -cp
>
> $CTAKES_HOME/desc/:$CTAKES_HOME/resources/:$CTAKES_HOME/resources/resources:$CTAKES_HOME/lib/*
> -Dlog4j.configuration=file:$CTAKES_HOME/config/log4j.xml -Xms512M -Xmx3g
> org.apache.ctakes.core.pipeline.PiperFileRunner "$@"
>
>
> Also,
>
> 2. Where is the *"bin"* folder where the bash file resides? Right now I use
> this one:
> /Users/local/projects/ctakes/trunk/ctakes-distribution/src/main/bin
>
>
> Thanks,
> Maral
>
> On Fri, Jul 19, 2019 at 6:13 AM Finan, Sean <
> sean.fi...@childrens.harvard.edu> wrote:
>
> > Hi Maral,
> >
> > You can generate different output types by adding different writers to the
> > end of the pipeline.
> > Here are the contents of the Default Clinical Pipeline piper file:
> >
> >
> >
> > ========================================================================================
> > // Commands and parameters to create a default plaintext document
> > // processing pipeline with UMLS lookup
> >
> > // Load a simple token processing pipeline from another pipeline file
> > load DefaultTokenizerPipeline
> >
> > // Add non-core annotators
> > add ContextDependentTokenizerAnnotator
> > addDescription POSTagger
> >
> > // Add Chunkers
> > load ChunkerSubPipe
> >
> > // Default fast dictionary lookup
> > load DictionarySubPipe
> >
> > // Add Cleartk Entity Attribute annotators
> > load AttributeCleartkSubPipe
> >
> >
> > ========================================================================================
> >
> >
> > I recommend that you copy those lines to a new file (for instance,
> > Maral.piper) and then add the following lines:
> >
> >
> >
> > ========================================================================================
> > // Write marked copy of note text in interactive html files
> > add pretty.html.HtmlTextWriter SubDirectory=HTML
> >
> > // Write Fast Healthcare Interoperability Resources (FHIR) json files.
> > // fhir.org
> > package org.apache.ctakes.fhir.cc
> > add FhirJsonFileWriter SubDirectory=FHIR
> >
> > // Write plaintext copy of note text with cui, semantic group, POS.
> > // Relations are listed.
> > add pretty.plaintext.PrettyTextWriterFit SubDirectory=TEXT
> >
> > // Write plaintext copy of note sentences with entity and relation
> > // discoveries listed.
> > add property.plaintext.PropertyTextWriterFit SubDirectory=PROP
> >
> >
> > ========================================================================================
> >
> >
> > The output directory should then contain some new output in different
> > subdirectories.  You can change the subdirectory names.
> >
> > Note: the "=================================" are just there to indicate
> > what is for the file.  Do not copy them.
> >
> > There are many more file writers, most of which write simple lists of
> > discoveries in one form or another.
> > I recommend trying the 4 above and see if any fit your purposes before
> > moving on to more specialized writers.
> >
> > Sean
> >
> > ________________________________________
> > From: Maral Amir <maraljav...@gmail.com>
> > Sent: Thursday, July 18, 2019 7:11 PM
> > To: dev@ctakes.apache.org
> > Subject: Re: cTAKES Pipeline [EXTERNAL]
> >
> > Hi Sean,
> >
> > Thank you so much for your very helpful and comprehensive response. I was
> > able to generate the xmi results in the output directory and used UIMA CAS
> > Visual Debugger (CVD) as suggested to view the information. I have two
> > questions:
> > 1. What is the best reference for me to study and understand the
> > annotations?
> > 2. Is there a CLI equivalent to CVD? I need the annotated outputs in a
> > readable format without the help of CVD.
> >
> > Thanks,
> > Maral
> >
> >
> > On Thu, Jul 18, 2019 at 12:52 PM Finan, Sean <
> > sean.fi...@childrens.harvard.edu> wrote:
> >
> > > Hi Maral,
> > >
> > > This might be what you are talking about with respect to the Default
> > > Clinical Pipeline
> > >
> > >
> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Default-2BClinical-2BPipeline&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=WJJB6qjAiCjVDSuwYgcYjXv0EenGbCblnUGl8Rc5V9I&s=cBb87McNP4vp678BVVM6z9Wwfr_CQNb--5XKAUPDxYM&e=
> > >
> > > That lists a command line method for running a set of files and getting
> > > xml output.
> > >
> > > The default clinical pipeline configuration is actually contained in the
> > > plain text (piper) file
> > > resources/org/apache/ctakes/clinical/pipeline/DefaultFastPipeline.piper
> > >
> > > If you are looking at source code then the file is
> > > ctakes-clinical-pipeline-res/src/main/resources/ ...
> > >
> > > You can also select and run a piper file with a gui
> > >
> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Piper-2BFile-2BSubmitter-2BGUI&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=WJJB6qjAiCjVDSuwYgcYjXv0EenGbCblnUGl8Rc5V9I&s=lTtwFsqMJEl1M73fifRpWrO6BZX_R0d2gh3HOqvAx90&e=
> > >
> > > Both methods are mentioned near the bottom of one of the pages detailing
> > > pipeline configuration
> > >
> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Piper-2BFiles&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=WJJB6qjAiCjVDSuwYgcYjXv0EenGbCblnUGl8Rc5V9I&s=0VYZQYTmgYmbRW_vsbf8XACzsVWdetpqSxeDj_c8RKA&e=
> > >
> > > There are several example pipelines constructed with code and/or plain
> > > text files in the ctakes-examples and ctakes-examples-res modules.  You can
> > > look at the different "Hello World" examples.
> > >
> > > Since you are playing with maven, you can run the profile "runPiperGui".
> > > mvn clean compile -DskipTests -PrunPiperGui
> > >
> > > Sean
> > >
> > >
> > > ________________________________________
> > > From: Maral Amir <maraljav...@gmail.com>
> > > Sent: Thursday, July 18, 2019 2:29 PM
> > > To: dev@ctakes.apache.org
> > > Subject: cTAKES Pipeline [EXTERNAL]
> > >
> > > Hi,
> > >
> > > I just built my developer version of cTAKES with the help of the wonderful
> > > cTAKES developers.
> > >
> > > For my next step, I would appreciate it if somebody could point me in the
> > > right direction. I am planning to process clinical text documents through
> > > the entire pipeline to generate xml output. I see the website suggests
> > > walking through the Default Clinical Pipeline. I understand there are also
> > > multiple git repositories with command line tools built on Apache cTAKES.
> > > My final goal is to integrate cTAKES with some Python packages (OCR, etc.)
> > > into one pipeline and have some form of web service at the end. I would
> > > deeply appreciate any suggestions.
> > >
> > > Thanks,
> > > Maral
> > >
> >
>
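
The custom piper file described in the thread above (the Default Clinical
Pipeline lines followed by the four suggested writers) could also be
generated from Python, which may be convenient when the pipeline is driven
by other Python packages. This is a minimal sketch, not part of cTAKES: the
piper lines are copied from the messages above, while the names
BASE_PIPELINE, WRITERS and write_piper_file are hypothetical.

```python
from pathlib import Path

# Lines of the Default Clinical Pipeline piper file, as quoted in this thread.
BASE_PIPELINE = [
    "load DefaultTokenizerPipeline",
    "add ContextDependentTokenizerAnnotator",
    "addDescription POSTagger",
    "load ChunkerSubPipe",
    "load DictionarySubPipe",
    "load AttributeCleartkSubPipe",
]

# Writer lines suggested in this thread; each writes to its own subdirectory.
WRITERS = [
    "add pretty.html.HtmlTextWriter SubDirectory=HTML",
    "package org.apache.ctakes.fhir.cc",
    "add FhirJsonFileWriter SubDirectory=FHIR",
    "add pretty.plaintext.PrettyTextWriterFit SubDirectory=TEXT",
    "add property.plaintext.PropertyTextWriterFit SubDirectory=PROP",
]


def write_piper_file(path, writers=WRITERS):
    """Write the base pipeline followed by the chosen writer lines."""
    lines = BASE_PIPELINE + list(writers)
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines
```

Calling write_piper_file("Maral.piper") would produce a file that can be
selected in the piper GUI; passing a shorter writers list keeps only the
outputs that are needed.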
