UIMA-AS was created to handle the message passing, job distribution, and so on. It does its messaging over JMS (through Apache ActiveMQ) rather than MPI. Try going through the UIMA-AS documentation first. We have had pretty good success using it here.

Thanks,

Thomas Ginter
801-448-7676
thomas.gin...@utah.edu

On Apr 27, 2012, at 1:35 PM, John David Osborne wrote:

> Hello,
>
> Is there any best practice documentation out there for running
> UIMA/UIMA-AS on a cluster? I have only run single machine instances of
> UIMA (mostly through Eclipse) and have not investigated the ability to
> perform multiple simultaneous analyses in order to process large document
> collections.
>
> It's not clear to me how UIMA would operate in a cluster environment, do
> people really do message passing using JMI? I'm guessing this is the case
> as I seeing references to MPICH, SGE or other things I am more used to.
> I've looked through some of the documentation (including all the Overview
> & SDK setup) but am not finding anything helpful. I've also tried googling
> but I am not getting much except this:
> http://comments.gmane.org/gmane.comp.apache.uima.general/2131 which makes
> me think it is possible.
>
> Currently with my level of confusion I think it may be best to have
> multiple instances of UIMA on a cluster and just submit jobs processing
> discrete document sets to our SGE cluster and ignore whatever scaling
> features are actually present in UIMA since the document processing I plan
> to do is data parallel.
>
> -John
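
For the data-parallel fallback described in the last quoted paragraph (independent UIMA instances, each given a discrete document set), the splitting step might look roughly like the sketch below. It is only an illustration under assumptions: the job is assumed to be submitted as an SGE array job (for example with `qsub -t 1-3`), so that each task can read its 1-based index from `SGE_TASK_ID`; the file names and the helper names `partition` and `batch_for_task` are hypothetical, and the `print` call merely stands in for running a single-machine pipeline on each document.

```python
import os
from typing import List, Sequence


def partition(docs: Sequence[str], n_batches: int) -> List[List[str]]:
    """Split docs into n_batches disjoint, nearly equal batches (round-robin)."""
    if n_batches < 1:
        raise ValueError("n_batches must be >= 1")
    return [list(docs[i::n_batches]) for i in range(n_batches)]


def batch_for_task(docs: Sequence[str], n_tasks: int, task_id: int) -> List[str]:
    """Return the batch belonging to a 1-based array task id."""
    if not 1 <= task_id <= n_tasks:
        raise ValueError("task_id must be between 1 and n_tasks")
    return partition(docs, n_tasks)[task_id - 1]


if __name__ == "__main__":
    # Hypothetical document names; a real job would list an input directory.
    docs = [f"doc_{i:03d}.txt" for i in range(10)]
    n_tasks = 3

    # SGE_TASK_ID is not a number outside an array job, so fall back to task 1.
    raw_id = os.environ.get("SGE_TASK_ID", "1")
    task_id = int(raw_id) if raw_id.isdigit() else 1

    for path in batch_for_task(docs, n_tasks, task_id):
        print(path)  # stand-in for processing the document with a local pipeline
```

Every task computes the same partition from the same sorted document list, so no coordination between tasks is needed; the trade-off is that there is no load balancing if some documents take much longer than others, which is one of the things a queue-based setup such as UIMA-AS addresses.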