UIMA Ruta tutorial slides

2013-10-10 Thread Peter Klügl
Hi, the slides of the GSCL tutorial part II about UIMA Ruta are now available online: http://uima.apache.org/downloads/gscl2013/2013-GSCL-Ruta.pdf Best, Peter

Re: Designing collection readers: Reading multiple XML files containing multiple CASes

2013-10-10 Thread Richard Eckart de Castilho
The CPEBuilder from uimaFIT disassembles the top-level AAE you put into it and turns the AEs inside into separate CPE-level components. This is to allow AEs to be run in parallel while CCs are run single-threaded. If you want to run a pipeline with a CAS Multiplier in the CPE, then you need to wra

Re: Designing collection readers: Reading multiple XML files containing multiple CASes

2013-10-10 Thread Jens Grivolla
It sounds to me like it would be much easier to just have a custom collection reader that outputs one CAS per document (i.e. multiple CASes per input file), rather than having a CR that outputs one CAS per file (with just metadata) plus an additional AE to generate the "real" CASes from there.
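
A minimal sketch of that idea follows, written as plain Java without any UIMA dependency so that it runs on its own. The class name `SectionReader`, the `<section>` element name, and the use of in-memory XML strings instead of files are all assumptions made for illustration; none of them come from the thread. The `hasNext()`/`getNext()` pair mirrors the contract of a UIMA collection reader, where `getNext(CAS)` would fill the CAS (for example with the section text) rather than return a string.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical reader: yields one document per section, across several XML inputs. */
class SectionReader {
    // XML inputs that have not been parsed yet (stand-ins for input files).
    private final Iterator<String> inputs;
    // Sections of the most recently parsed input that have not been handed out yet.
    private final Deque<String> pending = new ArrayDeque<>();

    SectionReader(List<String> xmlInputs) {
        this.inputs = xmlInputs.iterator();
    }

    /** True if another section is available; parses further inputs as needed. */
    boolean hasNext() throws Exception {
        // Skip over inputs that contain no sections at all.
        while (pending.isEmpty() && inputs.hasNext()) {
            loadSections(inputs.next());
        }
        return !pending.isEmpty();
    }

    /** Returns the text of the next section (a real reader would fill a CAS here). */
    String getNext() throws Exception {
        if (!hasNext()) {
            throw new NoSuchElementException("no more sections");
        }
        return pending.removeFirst();
    }

    // Parses one XML input and queues the text of every <section> element (assumed name).
    private void loadSections(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList sections = doc.getElementsByTagName("section");
        for (int i = 0; i < sections.getLength(); i++) {
            pending.addLast(sections.item(i).getTextContent());
        }
    }
}
```

With this shape the reader keeps a small queue per input file, so no separate component is needed downstream to split a per-file CAS into per-document CASes.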

Re: Designing collection readers: Reading multiple XML files containing multiple CASes

2013-10-10 Thread Swirl
> For part c:
>
> I imagine an algorithm that can scan the main XML file and find the "sections".
> For each section it finds, it can produce a CAS and initialize that CAS with the
> section's information.
>
> If this algorithm lives inside an analysis component, then it can use the "CAS