Re: cTAKES Pipeline [EXTERNAL]

Maral Amir Fri, 19 Jul 2019 11:12:56 -0700

Hi Sean,

Thank you so much for your insightful response.
I'm having a problem linking the piper files. I should mention that I am using
the command line interface. Could you please let me know:


1. What I should set my CTAKES_HOME variable to. Right now I set my
CTAKES_HOME to my cTAKES user installation main folder. That is because the
last line of runPiperFile.sh puts $CTAKES_HOME/lib/* on the classpath, and no
/lib folder is present in the developer's version.

java -cp
$CTAKES_HOME/desc/:$CTAKES_HOME/resources/:$CTAKES_HOME/resources/resources:$CTAKES_HOME/lib/*
-Dlog4j.configuration=file:$CTAKES_HOME/config/log4j.xml -Xms512M -Xmx3g
org.apache.ctakes.core.pipeline.PiperFileRunner "$@"
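
A minimal sketch of a sanity check for a candidate CTAKES_HOME, assuming only
the layout implied by the command above (desc/, resources/, lib/ and
config/log4j.xml). The function name check_ctakes_home is hypothetical and is
not part of cTAKES:

```shell
# Hypothetical helper: report which of the entries referenced by the
# runPiperFile.sh classpath are missing under the given directory.
# Returns 0 when all are present, 1 otherwise.
check_ctakes_home() {
  _cth_missing=0
  for _cth_entry in desc resources lib config/log4j.xml; do
    if [ ! -e "$1/$_cth_entry" ]; then
      echo "missing: $1/$_cth_entry"
      _cth_missing=1
    fi
  done
  return "$_cth_missing"
}

# Usage (not executed here):
#   check_ctakes_home "$CTAKES_HOME" && echo "layout looks usable"
```

If the check reports lib as missing for the developer checkout, that would be
consistent with the observation above that only the user installation has a
/lib folder.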


Also,

2. Where is the *"bin"* folder in which the bash file resides? Right now I use
this one:
/Users/local/projects/ctakes/trunk/ctakes-distribution/src/main/bin
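
For reference, a rough sketch of how the custom piper file from the quoted
reply below could be assembled and passed to the script. The helper names
write_piper and print_run_command are hypothetical, the -p/-i/-o options are
an assumption about the arguments PiperFileRunner accepts, and any UMLS
credential options are left out:

```shell
# Hypothetical helper: write a piper file containing the default clinical
# pipeline lines plus the four writers suggested in the quoted reply.
write_piper() {
  cat > "$1" <<'EOF'
// Default plaintext document processing pipeline with UMLS lookup
load DefaultTokenizerPipeline
add ContextDependentTokenizerAnnotator
addDescription POSTagger
load ChunkerSubPipe
load DictionarySubPipe
load AttributeCleartkSubPipe

// Writers, each with its own output subdirectory
add pretty.html.HtmlTextWriter SubDirectory=HTML
package org.apache.ctakes.fhir.cc
add FhirJsonFileWriter SubDirectory=FHIR
add pretty.plaintext.PrettyTextWriterFit SubDirectory=TEXT
add property.plaintext.PropertyTextWriterFit SubDirectory=PROP
EOF
}

# Hypothetical helper: print (rather than run) the command line, so the
# sketch has no side effects. Arguments: script path, piper file,
# input directory, output directory. The -p/-i/-o flags are assumed.
print_run_command() {
  echo "$1 -p $2 -i $3 -o $4"
}

# Usage (not executed here):
#   write_piper Maral.piper
#   print_run_command /path/to/runPiperFile.sh Maral.piper notes/ output/
```

The script path is passed in explicitly because the location of the bin
folder is exactly what is in question here.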


Thanks,
Maral

On Fri, Jul 19, 2019 at 6:13 AM Finan, Sean <
[email protected]> wrote:

> Hi Maral,
>
> You can generate different output types by adding different writers to the
> end of the pipeline.
> Here are the contents of the Default Clinical Pipeline piper file:
>
>
> ========================================================================================
> // Commands and parameters to create a default plaintext document processing pipeline with UMLS lookup
>
> // Load a simple token processing pipeline from another pipeline file
> load DefaultTokenizerPipeline
>
> // Add non-core annotators
> add ContextDependentTokenizerAnnotator
> addDescription POSTagger
>
> // Add Chunkers
> load ChunkerSubPipe
>
> // Default fast dictionary lookup
> load DictionarySubPipe
>
> // Add Cleartk Entity Attribute annotators
> load AttributeCleartkSubPipe
>
> ========================================================================================
>
>
> I recommend that you copy those lines to a new file (for instance,
> Maral.piper) and then add the following lines:
>
>
> ========================================================================================
> // Write marked copy of note text in interactive html files
> add pretty.html.HtmlTextWriter SubDirectory=HTML
>
> // Write Fast Healthcare Interoperability Resources (FHIR) json files. fhir.org
> package org.apache.ctakes.fhir.cc
> add FhirJsonFileWriter SubDirectory=FHIR
>
> // Write plaintext copy of note text with cui, semantic group, POS. Relations are listed.
> add pretty.plaintext.PrettyTextWriterFit SubDirectory=TEXT
>
> // Write plaintext copy of note sentences with entity and relation discoveries listed.
> add property.plaintext.PropertyTextWriterFit SubDirectory=PROP
>
> ========================================================================================
>
>
> The output directory should then contain some new output in different
> subdirectories.  You can change the subdirectory names.
>
> Note: the "=================================" lines are just there to mark
> what belongs in the file.  Do not copy them.
>
> There are many more file writers, most of which write simple lists of
> discoveries in one form or another.
> I recommend trying the four above to see whether any fit your purposes
> before moving on to more specialized writers.
>
> Sean
>
> ________________________________________
> From: Maral Amir <[email protected]>
> Sent: Thursday, July 18, 2019 7:11 PM
> To: [email protected]
> Subject: Re: cTAKES Pipeline [EXTERNAL]
>
> Hi Sean,
>
> Thank you so much for your very helpful and comprehensive response. I was
> able to generate the xmi results in the output directory and used UIMA Cas
> Visual Debugger (CVD) as suggested to view the information. I have two
> questions:
> 1. What is the best reference for me to study and understand the
> annotations?
> 2. Is there a CLI equivalent to CVD? I need the annotated outputs in a
> readable format without the help of CVD.
>
> Thanks,
> Maral
>
>
> On Thu, Jul 18, 2019 at 12:52 PM Finan, Sean <
> [email protected]> wrote:
>
> > Hi Maral,
> >
> > This might be what you are talking about with respect to the Default
> > Clinical Pipeline
> >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Default-2BClinical-2BPipeline&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=WJJB6qjAiCjVDSuwYgcYjXv0EenGbCblnUGl8Rc5V9I&s=cBb87McNP4vp678BVVM6z9Wwfr_CQNb--5XKAUPDxYM&e=
> >
> > That lists a command line method for running a set of files and getting
> > xml output.
> >
> > The default clinical pipeline configuration is actually contained in the
> > plain text (piper) file
> > resources/org/apache/ctakes/clinical/pipeline/DefaultFastPipeline.piper
> >
> > If you are looking at source code then the file is
> > ctakes-clinical-pipeline-res/src/main/resources/ ...
> >
> > You can also select and run a piper file with a gui
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Piper-2BFile-2BSubmitter-2BGUI&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=WJJB6qjAiCjVDSuwYgcYjXv0EenGbCblnUGl8Rc5V9I&s=lTtwFsqMJEl1M73fifRpWrO6BZX_R0d2gh3HOqvAx90&e=
> >
> > Both methods are mentioned near the bottom of one of the pages detailing
> > pipeline configuration
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Piper-2BFiles&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=WJJB6qjAiCjVDSuwYgcYjXv0EenGbCblnUGl8Rc5V9I&s=0VYZQYTmgYmbRW_vsbf8XACzsVWdetpqSxeDj_c8RKA&e=
> >
> > There are several example pipelines constructed with code and/or plain
> > text files in the ctakes-examples and ctakes-examples-res modules.  You
> > can look at the different "Hello World" examples.
> >
> > Since you are playing with maven, you can run the profile "runPiperGui".
> > mvn clean compile -DskipTests -PrunPiperGui
> >
> > Sean
> >
> >
> > ________________________________________
> > From: Maral Amir <[email protected]>
> > Sent: Thursday, July 18, 2019 2:29 PM
> > To: [email protected]
> > Subject: cTAKES Pipeline [EXTERNAL]
> >
> > Hi,
> >
> > I just built my developer version of cTAKES with the help of the
> > wonderful cTAKES developers.
> >
> > For my next step, I would appreciate it if somebody could direct me to the
> > right path. I am planning to process clinical text documents through the
> > entire pipeline to generate xml output. I see the website suggests walking
> > through the Default Clinical Pipeline. I understand there are also multiple
> > git repositories of command-line tools based on Apache cTAKES.
> > My final goal is to integrate cTAKES with some Python packages (OCR, etc.)
> > into one pipeline and have some form of web service at the end. I would
> > deeply appreciate any suggestions.
> >
> > Thanks,
> > Maral
> >
>
