Getting annotations from CASes 'external' to a pipeline

Eric Riebling Thu, 15 Mar 2012 08:07:05 -0700

I have a pipeline with its own type system.
I also have serialized, annotated CASes on disk that use a different type system.
Suppose I want an Analysis Engine in the pipeline to read in (deserialize) those
CASes in order to obtain their annotations and 'do things with them'.

I understand that some limitations in the UIMA framework prevent this, but
could it be done by making the pipeline's type system include that of the
CASes to deserialize?

Also, it would necessitate creating new CASes within the Analysis Engine.
I could think of several approaches, and have tried some without success:

 * Create a new, 'temporary' View in the AE's process() method, obtain a
        JCas, obtain its CAS, and use that to store the deserialized CASes
        (seems to mangle the original CAS and break downstream AEs in the
        pipeline, and seems unable to find any annotations in the
        deserialized CAS)

 * Use the CAS in the process() method to store the deserialized CASes
        (also mangles the original CAS, breaks downstream AEs, but DOES
        permit obtaining annotations from the deserialized CASes)

 * Make the Analysis Engine a CAS Multiplier, and deserialize into
        a CAS obtained from getEmptyCAS()
        (I haven't tried this yet)
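
For comparison with the approaches above, here is a rough sketch of a
workaround that never touches the pipeline's CAS. It does not use the UIMA
deserializer at all: it reads the XMI file as plain XML with the JDK parser
and collects every element that carries sofa, begin and end attributes.
The class XmiAnnotationPeek and its Span type are hypothetical names made up
for illustration, and the sketch assumes the files use the conventional
cas: prefix for the Sofa element. Features other than the offsets are
ignored, so it only fits cases where type name, span and covered text are enough.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, NOT part of UIMA: treats an XMI file as plain XML.
final class XmiAnnotationPeek {

    /** One annotation-like element: XML element name, offsets, covered text. */
    static final class Span {
        final String type;
        final int begin;
        final int end;
        final String coveredText;

        Span(String type, int begin, int end, String coveredText) {
            this.type = type;
            this.begin = begin;
            this.end = end;
            this.coveredText = coveredText;
        }
    }

    static List<Span> read(InputStream xmi) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(xmi);
        NodeList children = doc.getDocumentElement().getChildNodes();

        // First pass: map each Sofa's xmi:id to its document text.
        // Assumption: the Sofa element is written with the usual "cas:" prefix.
        Map<String, String> sofaText = new HashMap<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            if (n instanceof Element && "cas:Sofa".equals(n.getNodeName())) {
                Element e = (Element) n;
                sofaText.put(e.getAttribute("xmi:id"), e.getAttribute("sofaString"));
            }
        }

        // Second pass: keep every element that has sofa, begin and end.
        List<Span> spans = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            if (!(n instanceof Element)) {
                continue;
            }
            Element e = (Element) n;
            if (!e.hasAttribute("sofa") || !e.hasAttribute("begin")
                    || !e.hasAttribute("end")) {
                continue;
            }
            int begin = Integer.parseInt(e.getAttribute("begin"));
            int end = Integer.parseInt(e.getAttribute("end"));
            String text = sofaText.getOrDefault(e.getAttribute("sofa"), "");
            // Guard against offsets that do not fit the Sofa text.
            boolean inRange = begin >= 0 && begin <= end && end <= text.length();
            spans.add(new Span(e.getNodeName(), begin, end,
                    inRange ? text.substring(begin, end) : ""));
        }
        return spans;
    }
}
```

Inside an AE's process() method, something like
XmiAnnotationPeek.read(new FileInputStream(path)) would return the spans
without creating a View or a second CAS, at the cost of losing the typed
feature structures that a real deserialization would give.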

It's kind of a use case for a hybrid Component that behaves in some ways like
an AE (has a process() method), in some ways like XMI Collection Reader, and
in some ways like a CAS Multiplier.

But it's a useful use case!  It is also a very bizarre one because you could
almost think of it as a pipeline within a pipeline, which processes a set
of deserialized annotated XMI documents, within a pipeline that processes ...
in our case, a Question Answering system with question keyterms,
ranked lists of documents and answer candidates.
