I'm using UIMA AS 2.4.0, and have an example pipeline with 3 annotators. The 
third annotator is coded to just sleep for 3 seconds per document to simulate a 
slow annotator.

When I changed the pipeline to async=true and set the number of scaleout 
instances on the slow annotator to 6, I expected the pipeline to be about 6 
times faster. What I see, however, is exactly the same throughput.

A bit of debugging shows that UIMA AS does create 6 separate instances of the 
slow annotator (each one is called in turn, one per CAS), but it waits for the 
entire pipeline to complete before taking another CAS off the queue.
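
To make the symptom concrete, here is a minimal sketch in plain 
java.util.concurrent. It is not UIMA AS code and makes no claim about how 
UIMA AS works internally; the class and method names (ScaleoutSketch, 
peakBusy, waitForEachReply) are made up for illustration, and the sleep is 
shortened from 3 seconds to 100 ms. It only shows that a pool of 6 workers 
gives no speedup when a single document is in flight at a time, which is the 
behaviour I seem to be getting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class ScaleoutSketch {

    // Waits for one task to finish, converting checked exceptions.
    static void await(Future<?> f) {
        try {
            f.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Pushes `docs` simulated documents through a pool of `instances` workers
    // and returns the peak number of workers that were busy at the same time.
    static int peakBusy(int instances, int docs, boolean waitForEachReply) {
        ExecutorService pool = Executors.newFixedThreadPool(instances);
        AtomicInteger busy = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int i = 0; i < docs; i++) {
                Future<?> f = pool.submit(() -> {
                    int now = busy.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(100); // stand-in for the slow annotator
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        busy.decrementAndGet();
                    }
                });
                if (waitForEachReply) {
                    await(f); // one document in flight: block before sending the next
                } else {
                    pending.add(f); // many in flight: keep sending, collect later
                }
            }
            for (Future<?> f : pending) {
                await(f);
            }
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("one in flight:  peak busy = " + peakBusy(6, 12, true));
        System.out.println("many in flight: peak busy = " + peakBusy(6, 12, false));
    }
}
```

With waitForEachReply=true the peak never exceeds 1 no matter how large the 
pool is, so total time is the same as with a single instance. Whether 
something equivalent is happening in my setup (on the service side or in how 
CASes are being sent) is exactly what I can't tell.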

Any ideas what may be misconfigured? Or what to look at?

My deployment descriptor is:

<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDeploymentDescription 
xmlns="http://uima.apache.org/resourceSpecifier">
    <name>defaultFlapDeployDescriptor20130807.095936</name>
    <description/>
    <version>1.0</version>
    <vendor/>
    <deployment protocol="jms" provider="activemq">
        <casPool numberOfCASes="6" initialFsHeapSize="2000000"/>
        <service>
            <inputQueue endpoint="exampleQueue" 
brokerURL="tcp://localhost:61616" prefetch="0"/>
            <topDescriptor>
                <import 
location="file:/var/folders/vl/7p2qch6j4kx_kv5chvd093l80000gn/T/flapAggregate311122232121092424.xml"/>
            </topDescriptor>
            <analysisEngine async="true">
                <scaleout numberOfInstances="1"/>
                <delegates>
                    <analysisEngine 
key="aeWhitespaceTokenizerDescriptor211289c8cf04-b67c-45e2-a1eb-e90a85f39006" 
async="false">
                        <scaleout numberOfInstances="1"/>
                        <asyncAggregateErrorConfiguration>
                            <getMetadataErrors maxRetries="0" timeout="0" 
errorAction="terminate"/>
                            <processCasErrors thresholdCount="0" 
thresholdWindow="0" thresholdAction="terminate"/>
                            <collectionProcessCompleteErrors timeout="0" 
additionalErrorAction="terminate"/>
                        </asyncAggregateErrorConfiguration>
                    </analysisEngine>
                    <analysisEngine 
key="aeWordTokenizerDescriptor21126d2902a3-e6ca-4834-89cb-ec1a6c29f281" 
async="false">
                        <scaleout numberOfInstances="1"/>
                        <asyncAggregateErrorConfiguration>
                            <getMetadataErrors maxRetries="0" timeout="0" 
errorAction="terminate"/>
                            <processCasErrors thresholdCount="0" 
thresholdWindow="0" thresholdAction="terminate"/>
                            <collectionProcessCompleteErrors timeout="0" 
additionalErrorAction="terminate"/>
                        </asyncAggregateErrorConfiguration>
                    </analysisEngine>
                    <analysisEngine 
key="gov.va.vinci.flap.examples.ae.MySlowAnnotator2112fc3e83f1-f535-40c2-a860-895207bfff1a"
 async="false">
                        <scaleout numberOfInstances="6"/>
                        <asyncAggregateErrorConfiguration>
                            <getMetadataErrors maxRetries="0" timeout="0" 
errorAction="terminate"/>
                            <processCasErrors thresholdCount="0" 
thresholdWindow="0" thresholdAction="terminate"/>
                            <collectionProcessCompleteErrors timeout="0" 
additionalErrorAction="terminate"/>
                        </asyncAggregateErrorConfiguration>
                    </analysisEngine>
                </delegates>
                <asyncPrimitiveErrorConfiguration>
                    <processCasErrors thresholdCount="0" thresholdWindow="0" 
thresholdAction="terminate"/>
                    <collectionProcessCompleteErrors timeout="0" 
additionalErrorAction="terminate"/>
                </asyncPrimitiveErrorConfiguration>
            </analysisEngine>
        </service>
    </deployment>
</analysisEngineDeploymentDescription>

Thanks!
Ryan

