UIMA-AS was created to handle the message passing, job distribution, etc.  Try 
going through the UIMA-AS documentation first.  We have had pretty good success 
using it here.

Thanks,

Thomas Ginter
801-448-7676
thomas.gin...@utah.edu




On Apr 27, 2012, at 1:35 PM, John David Osborne wrote:

> Hello,
> 
> Is there any best practice documentation out there for running
> UIMA/UIMA-AS on a cluster? I have only run single machine instances of
> UIMA (mostly through Eclipse) and have not investigated the ability to
> perform multiple simultaneous analyses in order to process large document
> collections.
> 
> It's not clear to me how UIMA would operate in a cluster environment. Do
> people really do message passing using JMS? I'm guessing this is the case,
> as I am not seeing references to MPICH, SGE or other things I am more
> used to.
> I've looked through some of the documentation (including all the Overview
> & SDK setup) but am not finding anything helpful. I've also tried googling
> but I am not getting much except this:
> http://comments.gmane.org/gmane.comp.apache.uima.general/2131 which makes
> me think it is possible.
> 
> Currently, with my level of confusion, I think it may be best to have
> multiple instances of UIMA on a cluster and just submit jobs processing
> discrete document sets to our SGE cluster, ignoring whatever scaling
> features are actually present in UIMA, since the document processing I plan
> to do is data parallel.
> 
> -John
> 
