Hi Sean,

Thank you so much for your insightful response. I'm having a problem linking the piper files. I should mention that I am using the command line interface. Could you please let me know:

1. What should I set my CTAKES_HOME variable to? Right now I have it set to the main folder of my cTAKES user installation. That is because the last line of runPiperFile.sh includes the class directory $CTAKES_HOME/lib/*, and there is no /lib folder in the developer's version:

java -cp $CTAKES_HOME/desc/:$CTAKES_HOME/resources/:$CTAKES_HOME/resources/resources:$CTAKES_HOME/lib/* -Dlog4j.configuration=file:$CTAKES_HOME/config/log4j.xml -Xms512M -Xmx3g org.apache.ctakes.core.pipeline.PiperFileRunner "$@"

2. Where is the *"bin"* folder in which the bash file resides? Right now I use this one:

/Users/local/projects/ctakes/trunk/ctakes-distribution/src/main/bin

Thanks,
Maral

On Fri, Jul 19, 2019 at 6:13 AM Finan, Sean <[email protected]> wrote:

> Hi Maral,
>
> You can generate different output types by adding different writers to the end of the pipeline.
> Here are the contents of the Default Clinical Pipeline piper file:
>
> ========================================================================================
> // Commands and parameters to create a default plaintext document processing pipeline with UMLS lookup
>
> // Load a simple token processing pipeline from another pipeline file
> load DefaultTokenizerPipeline
>
> // Add non-core annotators
> add ContextDependentTokenizerAnnotator
> addDescription POSTagger
>
> // Add Chunkers
> load ChunkerSubPipe
>
> // Default fast dictionary lookup
> load DictionarySubPipe
>
> // Add Cleartk Entity Attribute annotators
> load AttributeCleartkSubPipe
> ========================================================================================
>
> I recommend that you copy those lines to a new file (for instance, Maral.piper) and then add the following lines:
>
> ========================================================================================
> // Write marked copy of note text in interactive html files
> add pretty.html.HtmlTextWriter SubDirectory=HTML
>
> // Write Fast Healthcare Interoperability Resources (FHIR) json files. fhir.org
> package org.apache.ctakes.fhir.cc
> add FhirJsonFileWriter SubDirectory=FHIR
>
> // Write plaintext copy of note text with cui, semantic group, POS. Relations are listed.
> add pretty.plaintext.PrettyTextWriterFit SubDirectory=TEXT
>
> // Write plaintext copy of note sentences with entity and relation discoveries listed.
> add property.plaintext.PropertyTextWriterFit SubDirectory=PROP
> ========================================================================================
>
> The output directory should then contain some new output in different subdirectories. You can change the subdirectory names.
>
> Note: the "=================================" lines are just there to indicate what belongs in the file. Do not copy them.
>
> There are many more file writers, most of which write simple lists of discoveries in one form or another.
> I recommend trying the four above to see whether any fit your purposes before moving on to more specialized writers.
>
> Sean
>
> ________________________________________
> From: Maral Amir <[email protected]>
> Sent: Thursday, July 18, 2019 7:11 PM
> To: [email protected]
> Subject: Re: cTAKES Pipeline [EXTERNAL]
>
> Hi Sean,
>
> Thank you so much for your very helpful and comprehensive response. I was able to generate the xmi results in the output directory and used the UIMA CAS Visual Debugger (CVD), as suggested, to view the information. I have two questions:
> 1. What is the best reference for me to study in order to understand the annotations?
> 2. Is there a CLI equivalent to CVD? I need the annotated outputs in a readable format without the help of CVD.
>
> Thanks,
> Maral
>
>
> On Thu, Jul 18, 2019 at 12:52 PM Finan, Sean <[email protected]> wrote:
>
> > Hi Maral,
> >
> > This might be what you are talking about with respect to the Default Clinical Pipeline:
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Default-2BClinical-2BPipeline&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=WJJB6qjAiCjVDSuwYgcYjXv0EenGbCblnUGl8Rc5V9I&s=cBb87McNP4vp678BVVM6z9Wwfr_CQNb--5XKAUPDxYM&e=
> >
> > That lists a command line method for running a set of files and getting xml output.
> >
> > The default clinical pipeline configuration is actually contained in the plain text (piper) file
> > resources/org/apache/ctakes/clinical/pipeline/DefaultFastPipeline.piper
> >
> > If you are looking at source code then the file is
> > ctakes-clinical-pipeline-res/src/main/resources/ ...
> >
> > You can also select and run a piper file with a GUI:
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Piper-2BFile-2BSubmitter-2BGUI&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=WJJB6qjAiCjVDSuwYgcYjXv0EenGbCblnUGl8Rc5V9I&s=lTtwFsqMJEl1M73fifRpWrO6BZX_R0d2gh3HOqvAx90&e=
> >
> > Both methods are mentioned near the bottom of one of the pages detailing pipeline configuration:
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_Piper-2BFiles&d=DwIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=WJJB6qjAiCjVDSuwYgcYjXv0EenGbCblnUGl8Rc5V9I&s=0VYZQYTmgYmbRW_vsbf8XACzsVWdetpqSxeDj_c8RKA&e=
> >
> > There are several example pipelines constructed with code and/or plain text files in the ctakes-examples and ctakes-examples-res modules. You can look at the different "Hello World" examples.
> >
> > Since you are playing with maven, you can run the profile "runPiperGui".
> > mvn clean compile -DskipTests -PrunPiperGui
> >
> > Sean
> >
> > ________________________________________
> > From: Maral Amir <[email protected]>
> > Sent: Thursday, July 18, 2019 2:29 PM
> > To: [email protected]
> > Subject: cTAKES Pipeline [EXTERNAL]
> >
> > Hi,
> >
> > I just built my developer version of cTAKES with the help of the wonderful cTAKES developers.
> >
> > For my next step, I would appreciate it if somebody could point me in the right direction. I am planning to process clinical text documents through the entire pipeline to generate xml output. I see that the website suggests walking through the Default Clinical Pipeline. I understand there are also multiple git repositories with command line tools built on Apache cTAKES.
> > My final goal is to integrate cTAKES with some Python packages (OCR, etc.) into one pipeline and have some form of web service at the end. I would deeply appreciate any suggestions.
> >
> > Thanks,
> > Maral
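
For the Python integration described in the original message, one possible starting point is a thin wrapper that writes the piper file from Sean's reply and assembles the same java command that runPiperFile.sh issues. This is only a minimal sketch under stated assumptions: the install path /opt/apache-ctakes and the input/output directory names are hypothetical, and the -p, -i and -o options passed to PiperFileRunner are assumed from the cTAKES command line documentation, so check them against your own installation before relying on them.

```python
import os

# Piper commands copied from the thread: the Default Clinical Pipeline lines
# followed by the four writers Sean suggests.
PIPER_LINES = [
    "load DefaultTokenizerPipeline",
    "add ContextDependentTokenizerAnnotator",
    "addDescription POSTagger",
    "load ChunkerSubPipe",
    "load DictionarySubPipe",
    "load AttributeCleartkSubPipe",
    "add pretty.html.HtmlTextWriter SubDirectory=HTML",
    "package org.apache.ctakes.fhir.cc",
    "add FhirJsonFileWriter SubDirectory=FHIR",
    "add pretty.plaintext.PrettyTextWriterFit SubDirectory=TEXT",
    "add property.plaintext.PropertyTextWriterFit SubDirectory=PROP",
]


def render_piper(lines=PIPER_LINES):
    """Return the piper file contents, one command per line."""
    return "\n".join(lines) + "\n"


def build_command(ctakes_home, piper_file, input_dir, output_dir):
    """Mirror the java call on the last line of runPiperFile.sh.

    The trailing -p/-i/-o options are an assumption, not taken from the thread.
    """
    home = ctakes_home.rstrip("/")
    # Java expands the lib/* wildcard itself, so it is passed through literally.
    classpath = os.pathsep.join([
        f"{home}/desc/",
        f"{home}/resources/",
        f"{home}/resources/resources",
        f"{home}/lib/*",
    ])
    return [
        "java", "-cp", classpath,
        f"-Dlog4j.configuration=file:{home}/config/log4j.xml",
        "-Xms512M", "-Xmx3g",
        "org.apache.ctakes.core.pipeline.PiperFileRunner",
        "-p", piper_file, "-i", input_dir, "-o", output_dir,
    ]


if __name__ == "__main__":
    # Hypothetical locations; replace with your own.
    cmd = build_command("/opt/apache-ctakes", "Maral.piper", "notes", "output")
    print(render_piper())
    print(" ".join(cmd))
```

To actually run the pipeline, the text from render_piper() would be written to Maral.piper and the list from build_command() passed to subprocess.run(cmd, check=True). Keeping the command as a list avoids shell quoting problems when paths contain spaces.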
