Thank you, everyone, for looking at this.

My project is to understand how to use cTAKES well and to make it flexible
enough that everyone else in our lab can keep using it after I leave. The
first thing I wanted to do was simply get input in and output out. As I
understand it better, I also want to be able to transform the output
afterwards. What are the full capabilities of piper files? When would it be
advantageous to use only a piper file instead of the code I was writing
before?
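
To check my understanding of the piper-only approach described in Sean's
message below, here is a rough Java sketch. It does not call cTAKES at all:
it only writes the small piper file and assembles the argument list that
would be handed to org.apache.ctakes.core.pipeline.PiperFileRunner. The
piper commands (cli, reader, load) and the options (-p, -o, --user, --pass,
--in) are the ones quoted in his message; the class name PiperSketch, the
file names, and the placeholder values are my own assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Sketch only: builds the piper file and runner arguments, nothing more.
final class PiperSketch {

    // Piper commands as quoted in the reply: the input file is supplied
    // on the command line ("cli"), read line by line ("reader"), and then
    // processed by the default fast pipeline ("load").
    static List<String> piperLines() {
        return List.of(
            "cli InputFileName=in",
            "reader LinesFromFileCollectionReader",
            "load DefaultFastPipeline");
    }

    // Writes the piper lines to the given path and returns that path.
    static Path writePiper(Path target) throws IOException {
        return Files.write(target, piperLines());
    }

    // Arguments for PiperFileRunner, using only the options named in the reply.
    static List<String> runnerArgs(Path piper, String input, String outputDir,
                                   String umlsUser, String umlsPass) {
        return List.of(
            "-p", piper.toString(),
            "--in", input,
            "-o", outputDir,
            "--user", umlsUser,
            "--pass", umlsPass);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file name and placeholder values, not a real setup.
        Path piper = writePiper(Files.createTempFile("lab-pipeline", ".piper"));
        List<String> cmd = runnerArgs(piper, "notes/sample.txt", "out",
                                      "myUmlsUsername", "myUmlsPassword");
        System.out.println(String.join(" ", cmd));
    }
}
```

If this is right, changing the input would only mean changing the value
after --in, and the piper file itself would never need editing.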

Thank you,
Sebastien Boussard

On Mon, Aug 12, 2019 at 8:21 AM Finan, Sean <
[email protected]> wrote:

> Hi all,
>
> I think that there are a lot of things going on here.
>
> Jeff's question is on point - do you actually have the dictionary?
>
> I think that doing all of this with code is unnecessary.
> - I don't see anything in the code that cannot be done in a piper file.
> - Piper files can set the collection reader.   Use the "reader" command.
> For your use, that would be " reader LinesFromFileCollectionReader
> InputFileName=<filePath> "
> - Piper files can load other piper files.  Use the "load" command.
> For your use, that would be " load DefaultFastPipeline "
>   https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
> So, instead of writing and debugging code, you can create a 2 line piper
> file and just run it using org.apache.ctakes.core.pipeline.PiperFileRunner
>
> " <java etc.> PiperFileRunner
>  -p <pathToMyPiper>
> -o <pathToMyOutputDir>
> --user <myUmlsUsername>
> --pass <myUmlsPassword> "
>
> Or if you really want to run the piper from code then you can do so, but I
> would rely more upon the piper such as in the examples code
> HelloWorldPiperRunner.java
>
> I would just use a piper file.  If you want to get fancy, then instead of
> explicitly specifying the InputFileName in the piper, use the "cli" command
> in the piper.
> " cli InputFileName=in "
> Then you can remove the specification from the piper command ( simplify it
> to " reader LinesFromFileCollectionReader " )
> and your PiperFileRunner would be the same as above but with "--in
> <filePath> " added.
> Then you can change the input using the command line instead of constantly
> editing the piper.
>
> Besides the obvious simplicity for the user of only using a piper file, it
> should be easier for others to assist with problems as they do not need to
> go through your code.
>
> I have to ask why you are using LinesFromFileCollectionReader?  It treats
> each line like a different document.
> Your first attempt points to "right_knee_arthroscopy" in the example
> notes.
> This would give you two output documents, one for each line in that file.
> Is that your intention?
>
>
> Sean
>
>
> ________________________________________
> From: Jeffrey Miller <[email protected]>
> Sent: Saturday, August 10, 2019 2:36 PM
> To: [email protected]
> Subject: Re: Struggling initializing [EXTERNAL]
>
> Sebastien,
>
> Just wanted to confirm that you have the sno_rx_16ab.script file
> in org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab/
>
>
> Jeff
>
> On Sat, Aug 10, 2019, 2:16 PM gandhi rajan <[email protected]>
> wrote:
>
> > Sorry Sebastien I still don't get what you are trying to do.
> >
> > On Saturday, August 10, 2019, Sebastien Boussard <[email protected]>
> wrote:
> >
> > > Hello Mr. Rajan,
> > > > I have realized that I sent you no context! I am currently working on
> > > > the Process Lines Clinical Runner. Previously, I was having many errors
> > > > with the directories. I made a link from my resources folder to the
> > > > Apache cTAKES resources folder. I have no link between the source code
> > > > and the user interface.
> > >
> > > Here is the code:
> > >
> > > > import java.io.File;
> > > > import java.io.IOException;
> > > >
> > > > import org.apache.ctakes.core.cr.LinesFromFileCollectionReader;
> > > > import org.apache.ctakes.core.pipeline.EntityCollector;
> > > > import org.apache.ctakes.core.pipeline.PipelineBuilder;
> > > > import org.apache.ctakes.core.pipeline.PiperFileReader;
> > > > import org.apache.ctakes.core.resource.FileLocator;
> > > > import org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator;
> > > > import org.apache.uima.UIMAException;
> > > > import org.apache.log4j.Logger;
> > > >
> > > > final public class ClinicalProcessor {
> > > >
> > > >    static private final Logger LOGGER = Logger.getLogger("ClinicalProcessor");
> > > >
> > > >    static private final String PIPER_FILE_PATH =
> > > >          "/Users/sboussard/Desktop/apache-ctakes-4.0.0/resources/org/apache/ctakes/clinical/pipeline/DefaultFastPipeline.piper";
> > > >
> > > >    static private final String INPUT_FILE_PATH =
> > > >          "/Users/sboussard/Desktop/apache-ctakes-4.0.0/resources/org/apache/ctakes/examples/notes/right_knee_arthroscopy";
> > > >
> > > >    private ClinicalProcessor() {
> > > >    }
> > > >
> > > >    public static void main( final String[] args ) {
> > > >       System.out.println(PIPER_FILE_PATH);
> > > >
> > > >       try {
> > > >          // Create a piper file reader, but don't load the piper yet - we want to create a reader with parameters
> > > >          final PiperFileReader reader = new PiperFileReader();
> > > >          final PipelineBuilder builder = reader.getBuilder();
> > > >          // Add the Lines from File reader
> > > >          //final File inputFile = FileLocator.locateFile( INPUT_FILE_PATH );
> > > >          //final File inputFile = FileLocator.getFile( INPUT_FILE_PATH );
> > > >          final File inputFile = new File("/Users/sboussard/Desktop/ClampMac_1.6.0/workspace/MyPipeline/clamp-ner/Data/Input/sample_2788.txt");
> > > >          builder.reader( LinesFromFileCollectionReader.class,
> > > >                LinesFromFileCollectionReader.PARAM_INPUT_FILE_NAME,
> > > >                inputFile.getAbsolutePath() );
> > > >          // Add the lines from the piper file
> > > >          reader.loadPipelineFile( PIPER_FILE_PATH );
> > > >          // Collect IdentifiedAnnotation object information for output - simple for examples
> > > >          builder.collectEntities();
> > > >          // Run the pipeline with specified text
> > > >          builder.run();
> > > >          // Log the IdentifiedAnnotation object information
> > > >          LOGGER.info( "\n" + EntityCollector.getInstance().toString() );
> > > >       } catch ( IOException | UIMAException multE ) {
> > > >          LOGGER.error( multE.getMessage() );
> > > >       }
> > > >    }
> > > >
> > > > }
> > >
> > > Thank you for all your help,
> > > Sebastien Boussard
> > >
> > > > On Aug 10, 2019, at 3:00 AM, gandhi rajan <[email protected]>
> > > wrote:
> > > >
> > > > > As far as I know, it's a more generic error. Could you please let us
> > > > > know what action you are trying to perform and the steps involved in
> > > > > reproducing the issue.
> > > >
> > > > On Saturday, August 10, 2019, Sebastien Boussard <[email protected]>
> > > wrote:
> > > >
> > > > >> Hello,
> > > > >> I’m an intern in the Stanford Biomedical Informatics Lab and I've been
> > > > >> working on getting a ctakes page for a week, and I’ve been getting a
> > > > >> lot of errors. I have been getting a failed to initialize error for the
> > > > >> last day and a half and I cannot solve it. I will send you the whole
> > > > >> log; if you can help me out it would be greatly appreciated.
> > > >>
> > > >> log4j: reset attribute= "false".
> > > >> log4j: Threshold ="null".
> > > >> log4j: Retreiving an instance of org.apache.log4j.Logger.
> > > >> log4j: Setting [ProgressAppender] additivity to [false].
> > > >> log4j: Level value for ProgressAppender is  [INFO].
> > > >> log4j: ProgressAppender level set to INFO
> > > >> log4j: Class name: [org.apache.log4j.ConsoleAppender]
> > > >> log4j: Parsing layout of class: "org.apache.log4j.PatternLayout"
> > > >> log4j: Setting property [conversionPattern] to [%m].
> > > >> log4j: Adding appender named [noEolAppender] to category
> > > >> [ProgressAppender].
> > > >> log4j: Retreiving an instance of org.apache.log4j.Logger.
> > > >> log4j: Setting [ProgressDone] additivity to [false].
> > > >> log4j: Level value for ProgressDone is  [INFO].
> > > >> log4j: ProgressDone level set to INFO
> > > >> log4j: Class name: [org.apache.log4j.ConsoleAppender]
> > > >> log4j: Parsing layout of class: "org.apache.log4j.PatternLayout"
> > > >> log4j: Setting property [conversionPattern] to [%m%n].
> > > >> log4j: Adding appender named [eolAppender] to category
> [ProgressDone].
> > > >> log4j: Level value for root is  [INFO].
> > > >> log4j: root level set to INFO
> > > >> log4j: Class name: [org.apache.log4j.ConsoleAppender]
> > > >> log4j: Parsing layout of class: "org.apache.log4j.PatternLayout"
> > > >> log4j: Setting property [conversionPattern] to [%d{dd MMM yyyy
> > HH:mm:ss}
> > > >> %5p %c{1} - %m%n].
> > > >> log4j: Adding appender named [consoleAppender] to category [root].
> > > >> /Users/sboussard/Desktop/apache-ctakes-4.0.0/resources/
> > > >> org/apache/ctakes/clinical/pipeline/DefaultFastPipeline.piper
> > > >> 09 Aug 2019 11:28:50  INFO SentenceDetector - Sentence detector
> model
> > > >> file: org/apache/ctakes/core/sentdetect/sd-med-model.zip
> > > >> 09 Aug 2019 11:28:50  INFO TokenizerAnnotatorPTB - Initializing
> > > >> org.apache.ctakes.core.ae.TokenizerAnnotatorPTB
> > > >> 09 Aug 2019 11:28:50  INFO ContextDependentTokenizerAnnotator -
> Finite
> > > >> state machines loaded.
> > > >> 09 Aug 2019 11:28:50  INFO POSTagger - POS tagger model file:
> > > >> org/apache/ctakes/postagger/models/mayo-pos.zip
> > > >> 09 Aug 2019 11:28:51  INFO Chunker - Chunker model file:
> > > >> org/apache/ctakes/chunker/models/chunker-model.zip
> > > >> 09 Aug 2019 11:28:52  INFO AbstractJCasTermAnnotator - Using
> > dictionary
> > > >> lookup window type:
> > org.apache.ctakes.typesystem.type.textspan.Sentence
> > > >> 09 Aug 2019 11:28:52  INFO AbstractJCasTermAnnotator - Exclusion
> > tagset
> > > >> loaded: CC CD DT EX IN LS MD PDT POS PP PP$ PRP PRP$ RP TO VB VBD
> VBG
> > > VBN
> > > >> VBP VBZ WDT WP WPS WRB
> > > >> 09 Aug 2019 11:28:52  INFO AbstractJCasTermAnnotator - Using minimum
> > > term
> > > >> text span: 3
> > > >> 09 Aug 2019 11:28:52  INFO AbstractJCasTermAnnotator - Using
> > Dictionary
> > > >> Descriptor: org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab.xml
> > > >> 09 Aug 2019 11:28:52  INFO DictionaryDescriptorParser - Parsing
> > > dictionary
> > > >> specifications:
> > > >> 09 Aug 2019 11:28:52  INFO UmlsUserApprover - Checking UMLS Account
> at
> > > >>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__uts-2Dws.nlm.nih.gov_restful_isValidUMLSUser&d=DwIFaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=5TDSCM85vULZYZcSTh2NL3qaVFV_2sJkBfV7zPV4StI&s=7C7YUGjMyzZq1eabffg_1uxCewLyf619heJ6Xbm84aQ&e=
> for user boussard:
> > > >> ..09 Aug 2019 11:28:53  INFO UmlsUserApprover -   UMLS Account at
> > > >>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__uts-2Dws.nlm.nih.gov_restful_isValidUMLSUser&d=DwIFaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=5TDSCM85vULZYZcSTh2NL3qaVFV_2sJkBfV7zPV4StI&s=7C7YUGjMyzZq1eabffg_1uxCewLyf619heJ6Xbm84aQ&e=
> for user boussard
> > > has
> > > >> been validated
> > > >>
> > > >> 09 Aug 2019 11:28:53 ERROR ClinicalProcessor - Initialization of
> > > annotator
> > > >> class "org.apache.ctakes.dictionary.lookup2.ae.
> > > DefaultJCasTermAnnotator"
> > > >> failed.  (Descriptor: <unknown>)
> > > >>
> > > >>
> > > >
> > > > --
> > > > Regards,
> > > > Gandhi
> > > >
> > > > "The best way to find urself is to lose urself in the service of
> others
> > > !!!"
> > >
> > >
> >
> > --
> > Regards,
> > Gandhi
> >
> > "The best way to find urself is to lose urself in the service of others
> > !!!"
> >
>
