Hi Maral,

You can generate different output types by adding different writers to the end of the pipeline. Here are the contents of the Default Clinical Pipeline piper file:

========================================================================================
// Commands and parameters to create a default plaintext document processing pipeline with UMLS lookup

// Load a simple token processing pipeline from another pipeline file
load DefaultTokenizerPipeline

// Add non-core annotators
add ContextDependentTokenizerAnnotator
addDescription POSTagger

// Add Chunkers
load ChunkerSubPipe

// Default fast dictionary lookup
load DictionarySubPipe

// Add Cleartk Entity Attribute annotators
load AttributeCleartkSubPipe
========================================================================================

I recommend that you copy those lines to a new file (for instance, Maral.piper) and then add the following lines:

========================================================================================
// Write marked copy of note text in interactive html files
add pretty.html.HtmlTextWriter SubDirectory=HTML

// Write Fast Healthcare Interoperability Resources (FHIR) json files.  fhir.org
package org.apache.ctakes.fhir.cc
add FhirJsonFileWriter SubDirectory=FHIR

// Write plaintext copy of note text with cui, semantic group, POS.  Relations are listed.
add pretty.plaintext.PrettyTextWriterFit SubDirectory=TEXT

// Write plaintext copy of note sentences with entity and relation discoveries listed.
add property.plaintext.PropertyTextWriterFit SubDirectory=PROP
========================================================================================

The output directory should then contain new output in different subdirectories. You can change the subdirectory names.

Note: the "=================================" lines only mark what belongs in the file. Do not copy them.

There are many more file writers, most of which write simple lists of discoveries in one form or another. I recommend trying the 4 above to see whether any fit your purposes before moving on to more specialized writers.

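If scripting this step is more convenient, the following is a minimal Python sketch that assembles such a piper file. It only concatenates the piper lines quoted in this message and writes them to disk; it does not call cTAKES. The names build_piper and write_piper are hypothetical helpers made up for illustration, and the "// Output writers" comment line is an addition of the sketch.

```python
from pathlib import Path

# Commands copied from the Default Clinical Pipeline piper file quoted above.
DEFAULT_PIPELINE = [
    "load DefaultTokenizerPipeline",
    "add ContextDependentTokenizerAnnotator",
    "addDescription POSTagger",
    "load ChunkerSubPipe",
    "load DictionarySubPipe",
    "load AttributeCleartkSubPipe",
]

# Writer lines copied from this message; the SubDirectory names can be changed.
WRITERS = [
    "add pretty.html.HtmlTextWriter SubDirectory=HTML",
    "package org.apache.ctakes.fhir.cc",
    "add FhirJsonFileWriter SubDirectory=FHIR",
    "add pretty.plaintext.PrettyTextWriterFit SubDirectory=TEXT",
    "add property.plaintext.PropertyTextWriterFit SubDirectory=PROP",
]


def build_piper(pipeline=DEFAULT_PIPELINE, writers=WRITERS):
    """Return piper file text with the writers placed at the end of the pipeline."""
    return "\n".join([*pipeline, "", "// Output writers", *writers]) + "\n"


def write_piper(path, text=None):
    """Write piper text to `path` (hypothetical helper, not part of cTAKES)."""
    target = Path(path)
    target.write_text(build_piper() if text is None else text, encoding="utf-8")
    return target
```

Calling write_piper("Maral.piper") would produce the file described above. To try only one writer, pass a shorter list to build_piper and hand the result to write_piper as the text argument.
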
Sean

________________________________________
From: Maral Amir <[email protected]>
Sent: Thursday, July 18, 2019 7:11 PM
To: [email protected]
Subject: Re: cTAKES Pipeline [EXTERNAL]

Hi Sean,

Thank you so much for your very helpful and comprehensive response. I was able to generate the xmi results in the output directory and used UIMA Cas Visual Debugger (CVD) as suggested to view the information. I have two questions:

1. What is the best reference for me to study and understand the annotations?
2. Is there a CLI equivalent to CVD? I need the annotated outputs in a readable format without the help of CVD.

Thanks,
Maral

On Thu, Jul 18, 2019 at 12:52 PM Finan, Sean <[email protected]> wrote:

> Hi Maral,
>
> This might be what you are talking about with respect to the Default
> Clinical Pipeline
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Default-2BClinical-2BPipeline&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=WJJB6qjAiCjVDSuwYgcYjXv0EenGbCblnUGl8Rc5V9I&s=cBb87McNP4vp678BVVM6z9Wwfr_CQNb--5XKAUPDxYM&e=
>
> That lists a command line method for running a set of files and getting
> xml output.
>
> The default clinical pipeline configuration is actually contained in the
> plain text (piper) file
> resources/org/apache/ctakes/clinical/pipeline/DefaultFastPipeline.piper
>
> If you are looking at source code then the file is
> ctakes-clinical-pipeline-res/src/main/resources/ ...
>
> You can also select and run a piper file with a gui
> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Piper-2BFile-2BSubmitter-2BGUI&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=WJJB6qjAiCjVDSuwYgcYjXv0EenGbCblnUGl8Rc5V9I&s=lTtwFsqMJEl1M73fifRpWrO6BZX_R0d2gh3HOqvAx90&e=
>
> Both methods are mentioned near the bottom of one of the pages detailing
> pipeline configuration
> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Piper-2BFiles&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=WJJB6qjAiCjVDSuwYgcYjXv0EenGbCblnUGl8Rc5V9I&s=0VYZQYTmgYmbRW_vsbf8XACzsVWdetpqSxeDj_c8RKA&e=
>
> There are several example pipelines constructed with code and/or plain
> text files in the ctakes-examples and ctakes-examples-res modules. You can
> look at the different "Hello World" examples.
>
> Since you are playing with maven, you can run the profile "runPiperGui".
> mvn clean compile -DskipTests -PrunPiperGui
>
> Sean
>
>
> ________________________________________
> From: Maral Amir <[email protected]>
> Sent: Thursday, July 18, 2019 2:29 PM
> To: [email protected]
> Subject: cTAKES Pipeline [EXTERNAL]
>
> Hi,
>
> I just built my developer version of cTAKES with the help of the wonderful
> cTAKES developers.
>
> For my next step, I would appreciate it if somebody could direct me to the
> right path. I am planning to process clinical text documents through the
> entire pipeline to generate xml output. I see the website suggests walking
> through the Default Clinical Pipeline. I understand there are also multiple
> git repositories of command line tools based on Apache cTAKES.
> My final goal is to integrate cTAKES with some Python packages (OCR, etc.)
> into one pipeline and have some form of web service at the end. I would
> deeply appreciate any suggestions.
>
> Thanks,
> Maral
>
