Re: cTAKES Pipeline [EXTERNAL]

Finan, Sean Fri, 19 Jul 2019 06:13:51 -0700

Hi Maral,

You can generate different output types by adding different writers to the end 
of the pipeline.
Here are the contents of the Default Clinical Pipeline piper file:


========================================================================================
// Commands and parameters to create a default plaintext document processing
// pipeline with UMLS lookup

// Load a simple token processing pipeline from another pipeline file
load DefaultTokenizerPipeline

// Add non-core annotators
add ContextDependentTokenizerAnnotator
addDescription POSTagger

// Add Chunkers
load ChunkerSubPipe

// Default fast dictionary lookup
load DictionarySubPipe

// Add Cleartk Entity Attribute annotators
load AttributeCleartkSubPipe
========================================================================================


I recommend that you copy those lines to a new file (for instance, Maral.piper) 
and then add the following lines:

========================================================================================
// Write marked copy of note text in interactive html files
add pretty.html.HtmlTextWriter SubDirectory=HTML

// Write Fast Healthcare Interoperability Resources (FHIR) json files.  fhir.org
package org.apache.ctakes.fhir.cc
add FhirJsonFileWriter SubDirectory=FHIR

// Write plaintext copy of note text with cui, semantic group, POS.  Relations
// are listed.
add pretty.plaintext.PrettyTextWriterFit SubDirectory=TEXT

// Write plaintext copy of note sentences with entity and relation discoveries
// listed.
add property.plaintext.PropertyTextWriterFit SubDirectory=PROP
========================================================================================
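
Since the end goal mentioned in this thread is to drive cTAKES from Python, here 
is a minimal sketch of assembling the combined piper file from a script.  The 
piper lines are the ones quoted in this message; the function name write_piper 
and the file name Maral.piper are only illustrative, and the wrapped comments 
are written on single lines so that every comment line starts with "//".

```python
from pathlib import Path

# Lines of the Default Clinical Pipeline piper file, as quoted in this message.
DEFAULT_PIPELINE = """\
// Commands and parameters to create a default plaintext document processing pipeline with UMLS lookup

// Load a simple token processing pipeline from another pipeline file
load DefaultTokenizerPipeline

// Add non-core annotators
add ContextDependentTokenizerAnnotator
addDescription POSTagger

// Add Chunkers
load ChunkerSubPipe

// Default fast dictionary lookup
load DictionarySubPipe

// Add Cleartk Entity Attribute annotators
load AttributeCleartkSubPipe
"""

# The four writer entries suggested in this message.
WRITERS = """\
// Write marked copy of note text in interactive html files
add pretty.html.HtmlTextWriter SubDirectory=HTML

// Write Fast Healthcare Interoperability Resources (FHIR) json files.  fhir.org
package org.apache.ctakes.fhir.cc
add FhirJsonFileWriter SubDirectory=FHIR

// Write plaintext copy of note text with cui, semantic group, POS.  Relations are listed.
add pretty.plaintext.PrettyTextWriterFit SubDirectory=TEXT

// Write plaintext copy of note sentences with entity and relation discoveries listed.
add property.plaintext.PropertyTextWriterFit SubDirectory=PROP
"""


def write_piper(path):
    """Write the pipeline followed by the writers to `path`; return the text."""
    text = DEFAULT_PIPELINE + "\n" + WRITERS
    Path(path).write_text(text, encoding="utf-8")
    return text


if __name__ == "__main__":
    # Hypothetical file name; any name ending in .piper should do.
    write_piper("Maral.piper")
```

Keeping the writers in their own block makes it easy to drop or rename one 
without touching the pipeline itself.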


The output directory should then contain some new output in different 
subdirectories.  You can change the subdirectory names.
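
As a quick check from Python, the sketch below lists what each writer left 
behind.  It assumes only that each writer puts its files in the folder named by 
its SubDirectory= value directly under the output directory; list_outputs is a 
made-up helper, not part of cTAKES.

```python
from pathlib import Path

# Subdirectory names as set by SubDirectory=... in the writer lines.
SUBDIRS = ("HTML", "FHIR", "TEXT", "PROP")


def list_outputs(output_dir, subdirs=SUBDIRS):
    """Map each writer subdirectory to the sorted file names found in it."""
    root = Path(output_dir)
    found = {}
    for name in subdirs:
        folder = root / name
        if folder.is_dir():
            found[name] = sorted(p.name for p in folder.iterdir() if p.is_file())
        else:
            # A missing folder means that writer wrote nothing or was not added.
            found[name] = []
    return found


if __name__ == "__main__":
    # "output" is a placeholder for whatever output directory was used.
    for subdir, files in list_outputs("output").items():
        print(subdir, len(files), "file(s)")
```

If you rename a subdirectory in the piper file, pass the new names through the 
subdirs argument.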

Note: the "=================================" lines are just there to mark 
where the file contents begin and end.  Do not copy them.

There are many more file writers, most of which write simple lists of 
discoveries in one form or another.  
I recommend trying the four above to see if any fit your purposes before moving 
on to more specialized writers.

Sean

________________________________________
From: Maral Amir <[email protected]>
Sent: Thursday, July 18, 2019 7:11 PM
To: [email protected]
Subject: Re: cTAKES Pipeline [EXTERNAL]

Hi Sean,

Thank you so much for your very helpful and comprehensive response. I was
able to generate the xmi results in the output directory and used UIMA Cas
Visual Debugger (CVD) as suggested to view the information. I have two
questions:
1. What is the best reference for me to study and understand the
annotations?
2. Is there a CLI equivalent to CVD? I need the annotated outputs in a
readable format without the help of CVD.

Thanks,
Maral


On Thu, Jul 18, 2019 at 12:52 PM Finan, Sean <
[email protected]> wrote:

> Hi Maral,
>
> This might be what you are talking about with respect to the Default
> Clinical Pipeline
>
> https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline
>
> That lists a command line method for running a set of files and getting
> xml output.
>
> The default clinical pipeline configuration is actually contained in the
> plain text (piper) file
> resources/org/apache/ctakes/clinical/pipeline/DefaultFastPipeline.piper
>
> If you are looking at source code then the file is
> ctakes-clinical-pipeline-res/src/main/resources/ ...
>
> You can also select and run a piper file with a gui
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+File+Submitter+GUI
>
> Both methods are mentioned near the bottom of one of the pages detailing
> pipeline configuration
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
>
> There are several example pipelines constructed with code and/or plain
> text files in the ctakes-examples and ctakes-examples-res modules.  You can
> look at the different "Hello World" examples.
>
> Since you are playing with maven, you can run the profile "runPiperGui".
> mvn clean compile -DskipTests -PrunPiperGui
>
> Sean
>
>
> ________________________________________
> From: Maral Amir <[email protected]>
> Sent: Thursday, July 18, 2019 2:29 PM
> To: [email protected]
> Subject: cTAKES Pipeline [EXTERNAL]
>
> Hi,
>
> I just built my developer version of cTAKES with the help of the wonderful
> cTAKES developers.
>
> For my next step, I would appreciate it if somebody could point me in the
> right direction. I am planning to process clinical text documents through
> the entire pipeline to generate xml output. I see the website suggests
> walking through the Default Clinical Pipeline. I understand there are also
> multiple git repositories with command-line tools built on Apache cTAKES.
> My final goal is to integrate cTAKES with some Python packages (OCR, etc.)
> into one pipeline and have some form of web service at the end. I would
> deeply appreciate any suggestions.
>
> Thanks,
> Maral
>
