Hi Ben,

Each annotator can implement collectionProcessComplete().

Quoting from the documentation:
"The framework calls the collectionProcessComplete() method at the end of the 
collection (i.e., when all objects in the collection have been processed). At 
this point in time, no CAS is passed in as a parameter. This gives the CAS 
Consumer or Analysis Engine an opportunity to perform collection processing 
over the entire set of objects in the collection."

In our implementation of tf.idf, an annotator collects the tf score for 
each document in process() and computes the idf part in 
collectionProcessComplete().
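
To illustrate the pattern, here is a minimal, dependency-free sketch. It is 
not our actual code. The class name TfIdfSketch, the (docId, tokens) 
signature of process(), and the returned map are all assumptions made for 
illustration. In a real UIMA component the class would extend 
JCasAnnotator_ImplBase, process() would take a JCas and read the tokens from 
it, and collectionProcessComplete() would take no arguments and return 
nothing.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a UIMA annotator, kept free of UIMA imports so it
// compiles on its own. Assumes each docId is passed to process() only once.
class TfIdfSketch {
    // docId -> (term -> raw term frequency), filled during process()
    private final Map<String, Map<String, Integer>> tf = new HashMap<>();
    // term -> number of documents that contain the term
    private final Map<String, Integer> df = new HashMap<>();

    // Called once per document: record term counts and update document frequencies.
    void process(String docId, List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }
        tf.put(docId, counts);
        for (String term : counts.keySet()) {
            df.merge(term, 1, Integer::sum);
        }
    }

    // Called once after the last document: combine tf with idf = ln(N / df).
    Map<String, Map<String, Double>> collectionProcessComplete() {
        int numDocs = tf.size();
        Map<String, Map<String, Double>> result = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> doc : tf.entrySet()) {
            Map<String, Double> scores = new HashMap<>();
            for (Map.Entry<String, Integer> e : doc.getValue().entrySet()) {
                double idf = Math.log((double) numDocs / df.get(e.getKey()));
                scores.put(e.getKey(), e.getValue() * idf);
            }
            result.put(doc.getKey(), scores);
        }
        return result;
    }
}
```

The point is only that the per-document state lives in fields of the 
annotator instance, so it is still there when the framework signals the end 
of the collection. This assumes a single instance sees every document. If the 
pipeline runs several instances of the annotator in parallel, each one would 
only see part of the collection.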

-Torsten

On 10.10.18, 17:21, "Benedict Holland" <benedict.m.holl...@gmail.com> wrote:

    Hello all,
    
    I continue to have a problem that comes up a lot. I have a collection
    processing engine. I want something to run after all of the processing is
    done. For example, I have a collection of texts and want to run a tf-idf. I
    generate a tf for each document and at the end, I generate an idf over the
    collection. I can't put that in an annotator as part of my
    processing pipeline.
    
    Is there an aggregate annotator that will run after the entire collection
    is processed?
    
    Thanks,
    ~Ben
    
