Re: Question about the pipeline

2015-02-02 Thread Tol O .
Maite Meseure Hugues meseure.maite@... writes:

 
 Hello all,
 
 Thank you for your preceding answers.
 I have a few questions regarding the pipeline example to run cTAKES
 programmatically.
 I am running ExampleAggregatePipeline.java with ExampleHelloWorldAnnotator,
 but I would like to know how I can change it to run on my own data, as in
 the CPE, where we can choose the directory of our data.
 My second question is about the XML output generated with the CPE: can I
 get the same XML output using the example pipeline, and if so, how?
 Thanks for your time.


I would like to ask the same question. After successfully setting up cTAKES
following the Developers Guide, I would also like to use a modified
ExampleAggregatePipeline to output a CAS file identical to the output
obtained from the CPE or the CVD when following the Users Guide.

This would be a great help to developers as a starting class for
programmatically obtaining an annotated file from plaintext or XML input,
the same as through the two GUIs.

Right now I am reading through the Component Use Guide to replicate the CPE
or the CVD tutorial with the test input, but it is a bit overwhelming.

Any pointers or suggestions would be really appreciated.

Tol O.



RE: Question about the pipeline

2015-02-02 Thread Finan, Sean
Hi Tol (and Maite),

I'm not entirely certain that I understand the question, but here is an attempt 
to help.  If I'm oversimplifying then I apologize.

I think that ExampleAggregatePipeline is intended to represent a very simple 
single-note pipeline and that custom code could be produced by using it as an 
example.

If you want to process texts in a directory, a web search will turn up plenty 
of ways to list the files in a directory and read text from them.  
org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader might be what you 
used in the CPE, and you can certainly peruse the code and take what you need.  
Or, if you decide to write something simple yourself, here is one possibility:

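import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;

// The class name is only a placeholder; use whatever fits your project.
public class DirectoryPipelineRunner {

   // Lists the readable regular files directly inside the given directory.
   static public Collection<File> getFilesInDir( final File directory ) {
      final Collection<File> fileList = new ArrayList<>();
      final File[] files = directory.listFiles();
      // listFiles() returns null if the path is not a directory or cannot be read
      if ( files == null ) {
         System.err.println( "please check the directory " + directory.getAbsolutePath() );
         System.exit( 1 );
      }
      for ( final File file : files ) {
         if ( file.isFile() && file.canRead() ) {
            fileList.add( file );
         }
      }
      return fileList;
   }

   // Reads the whole file as text using the platform default charset.
   // Throws IOException; alternatively, handle it in here.
   static public String getTextInFile( final File file ) throws IOException {
      final Path nioPath = file.toPath();
      return new String( Files.readAllBytes( nioPath ) );
   }

   static public void main( String ... args ) throws IOException {
      if ( args.length == 0 || args[0].isEmpty() ) {
         System.out.println( "Enter a directory path" );
         System.exit( 0 );
      }
      final Collection<File> files = getFilesInDir( new File( args[0] ) );
      for ( File file : files ) {
         final String note = getTextInFile( file );
         // ---  Insert here code a la ExampleAggregatePipeline  ---
         // ---  swap out the writer in ExampleAggregatePipeline with the
         //      CasIOUtil method (below)  ---
      }
   }
}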

I must admit that I have never directly used it, but there is an XMI 
file-writing method in org.apache.uima.fit.util.CasIOUtil named 
writeXmi( JCas jCas, File file ).  You could give this a try and see if it 
produces the type of output that you want.  The same utility class has a 
writeXCas(..) method for the older XCAS format.
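
If each note in the loop above should get its own XMI file, the File handed 
to writeXmi has to be built per note.  The following is only a sketch of one 
way to do that with plain Java.  The class XmiOutputNames, its two methods and 
the "<note file name>.xmi" naming scheme are assumptions made for 
illustration; they are not part of cTAKES or uimaFIT, and the CPE may name its 
output differently.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical helper for illustration only; not part of cTAKES or uimaFIT.
final class XmiOutputNames {

   private XmiOutputNames() {}

   // Builds the output file for one note: <outputDir>/<note file name>.xmi
   // The naming scheme is an assumption; adjust it to match what you need.
   static File xmiFileFor( final File note, final File outputDir ) {
      return new File( outputDir, note.getName() + ".xmi" );
   }

   // Creates the output directory (and any missing parents) if it does not
   // exist yet, then returns it so the call can be chained.
   static File prepareOutputDir( final File outputDir ) throws IOException {
      Files.createDirectories( outputDir.toPath() );
      return outputDir;
   }
}
```

Inside the loop, once the pipeline has filled a JCas for the current note, the 
call would then look something like 
CasIOUtil.writeXmi( jCas, XmiOutputNames.xmiFileFor( file, outputDir ) ), 
where jCas and outputDir are whatever names your own code uses.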


If the above has absolutely nothing to do with your needs then please send me a 
bulleted list of items, example workflow, etc. and I'll see if I can be of 
service.

Oh, and I wrote the above code freehand, so MS Outlook is adding capital 
letters, etc.  If you cut and paste you'll need to change that - plus I haven't 
run/compiled, so there might be a typo or missed exception or something.  Or it 
may not work (in which case I'll throw in a little more effort).

Sean


-Original Message-
From: Tol O. [mailto:tol...@gmail.com] 
Sent: Monday, February 02, 2015 6:56 PM
To: dev@ctakes.apache.org
Subject: Re: Question about the pipeline
