The choice between a single collection reader and multiple collection readers depends heavily on the application. If populating the initial input CASes is not expensive, the reader could be implemented as a UIMA-AS client similar to runRemoteAsyncAE in figure 3, with load balancing provided by the multiple service instances consuming CASes from the input queue. If the application has multiple services, each handling a stream of job requests, then each could be a UIMA-AS client and send CASes to the same input queue.
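To illustrate the load-balancing idea, here is a minimal sketch in plain Java of several service instances competing for work on one shared input queue. It deliberately does not use the UIMA-AS API: the class SharedQueueSketch, the STOP sentinel, and the strings standing in for CASes are all hypothetical, and an in-memory BlockingQueue stands in for the JMS input queue. The point is only that whichever instance is free takes the next item, so the client does no explicit balancing.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class SharedQueueSketch {
    // Hypothetical shutdown marker for this sketch; not a UIMA-AS concept.
    private static final String STOP = "__STOP__";

    /**
     * Starts `instances` workers on one shared queue, sends `documents`
     * items, and returns how many items each worker handled.
     */
    public static Map<String, Integer> run(int instances, int documents)
            throws InterruptedException {
        // Stand-in for the shared service input queue.
        BlockingQueue<String> inputQueue = new LinkedBlockingQueue<>();
        Map<String, Integer> processedBy = new ConcurrentHashMap<>();
        ExecutorService services = Executors.newFixedThreadPool(instances);

        // Each worker plays the role of one service instance.
        for (int i = 0; i < instances; i++) {
            String name = "service-" + i;
            processedBy.put(name, 0);
            services.submit(() -> {
                try {
                    while (true) {
                        String cas = inputQueue.take(); // blocks until work arrives
                        if (cas.equals(STOP)) {
                            return;
                        }
                        // Real analysis would happen here; we only count.
                        processedBy.merge(name, 1, Integer::sum);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // The client side: populate the queue, then one STOP per worker.
        for (int d = 0; d < documents; d++) {
            inputQueue.put("doc-" + d);
        }
        for (int i = 0; i < instances; i++) {
            inputQueue.put(STOP);
        }

        services.shutdown();
        services.awaitTermination(10, TimeUnit.SECONDS);
        return processedBy;
    }

    public static void main(String[] args) throws InterruptedException {
        // The per-worker split varies from run to run; only the total is fixed.
        System.out.println(run(3, 12));
    }
}
```

Several clients could call put on the same queue without changing the workers, which mirrors the case of multiple services each sending CASes to the same input queue.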
Note that CasMultipliers are a flexible replacement for Collection Readers, since the collection definition can be provided dynamically in the input CAS rather than in a configuration file or via some side channel. So, similar to figure 5, an application could scale out multiple aggregates on a cluster of machines. Each aggregate would start with a CasMultiplier that gets its collection definition (a directory or list of documents) from a CAS placed on the shared input queue and creates the document CASes to be processed by the AEs in the rest of the aggregate. Some of these AEs could be scaled locally, or could be remote AS services shared by all of the scaled-out aggregates. In practice it may be sufficient to scale just the delegates inside an AS aggregate, deploying multiple instances of any slow components and providing a CAS pool large enough to keep all of the local and remote delegate instances busy. One advantage of designing an application as a deployed UIMA-AS aggregate with some of its AEs deployed as remote services is that it is relatively easy to start with a simple synchronous single-threaded UIMA aggregate and later add the UIMA-AS deployments and scaleout.
~Burn
