[ 
https://issues.apache.org/jira/browse/UIMA-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624968#action_12624968
 ] 

Eddie Epstein commented on UIMA-1146:
-------------------------------------

.bq I'm a little concerned that we are adding a lot of parameters that may be 
tricky to tune. Presumably most applications will take the default but for 
those that may benefit from scaleout it may be hard to determine which queue 
needs the extra threads. 

UIMA AS has many tunable parameters, but as you note, most applications will 
not need to touch them. The real issue is knowing when and how to tune them. 
The "when" turns out to be pretty easy to answer: is the CPU fully utilized? 
If not, the monitor function (see UIMA-1104) has been very successful in 
identifying bottlenecks, which answers the "how".

.bq For an aggregate with N remote delegates and at least 1 co-located we'll 
have N+2 parameters to consider. Since these 3 types of queue consumers are all 
doing similar work, dispatching the CASes within the aggregate, wouldn't it be 
nice if we could have only 1 scaleout parameter for each aggregate?

.bq   <analysisEngine async="true" scaleout="n">

.bq This would let users focus more on the aggregate as a dispatcher of work, 
and less on the subtleties of our implementation.

When used with UIMA AS primitives, "scaleout" means replicating instances of 
analysis components. Overloading "scaleout" to also mean replicating some 
threads in an aggregate would cause unnecessary confusion.

.bq We could support this by using a Java queue for all replies (not just 
co-located ones as proposed elsewhere) and scale out the dispatch threads that 
consume this queue. The input queue and each remote queue would still have 
dedicated threads, but they would merely put the message on the Java queue 
along with the appropriate CAS, after blocking on the CasPool if necessary.

.bq One drawback is the extra thread switch but this may be small compared with 
the transport time for a remote. Another is that the extra consumers of remote 
queues are expected to replace the need for prefetch, but since this proposal 
separates the fetching from the dispatching the lightweight fetch thread may 
achieve the same effect as prefetch. 

Individual remote reply queues still need to be scaled (see UIMA-1130), so we 
need exactly what Marshall has proposed above. UIMA-1140 advocates using a Java 
queue for co-located delegates. Once that is done, we can consider whether 
piping all replies through the Java queue makes sense.
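
To make the "pipe all replies through one Java queue" idea concrete, here is a 
minimal sketch. It is not UIMA AS code: the class name, the method name, and 
the use of plain strings as stand-ins for reply messages are all hypothetical, 
and it ignores the CasPool and JMS entirely. It only shows one lightweight 
fetch thread per delegate feeding a shared queue that N dispatch threads 
consume.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration only; none of these names exist in UIMA AS. */
class ReplyDispatchSketch {

    /**
     * Sends repliesPerDelegate messages per delegate through one shared
     * queue and returns how many were handled by the dispatch threads,
     * or -1 if the run timed out or was interrupted.
     */
    static int run(String[] delegateKeys, int repliesPerDelegate, int dispatchers) {
        BlockingQueue<String> replyQueue = new LinkedBlockingQueue<>();
        int total = delegateKeys.length * repliesPerDelegate;
        CountDownLatch done = new CountDownLatch(total);
        AtomicInteger dispatched = new AtomicInteger();

        // Scaled-out dispatch threads: all of them consume the same queue.
        ExecutorService dispatchPool = Executors.newFixedThreadPool(dispatchers);
        for (int i = 0; i < dispatchers; i++) {
            dispatchPool.execute(() -> {
                try {
                    while (true) {
                        replyQueue.take();            // blocks until a reply arrives
                        dispatched.incrementAndGet(); // stand-in for dispatching the CAS
                        done.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // pool is shutting down
                }
            });
        }

        // One lightweight fetch thread per delegate: it only enqueues replies.
        ExecutorService fetchPool = Executors.newFixedThreadPool(delegateKeys.length);
        for (String key : delegateKeys) {
            fetchPool.execute(() -> {
                for (int c = 0; c < repliesPerDelegate; c++) {
                    replyQueue.add(key + ":" + c);
                }
            });
        }

        boolean finished;
        try {
            finished = done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finished = false;
        }
        fetchPool.shutdown();
        dispatchPool.shutdownNow(); // interrupts the blocked take() calls
        return finished ? dispatched.get() : -1;
    }

    public static void main(String[] args) {
        String[] keys = {"RoomNumber", "NamesAndPersonTitlesTAE"};
        System.out.println("dispatched: " + run(keys, 50, 3));
    }
}
```

In this sketch the only knob is the number of dispatch threads, which is the 
appeal of the single-queue proposal. It says nothing about whether the remote 
reply queues themselves need more consumers, which is the UIMA-1130 concern.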


> Setting the number of concurrent listeners of a reply queue for Co-located 
> Delegates
> ------------------------------------------------------------------------------------
>
>                 Key: UIMA-1146
>                 URL: https://issues.apache.org/jira/browse/UIMA-1146
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Async Scaleout
>    Affects Versions: 2.2.2AS
>            Reporter: Tong Fin
>            Assignee: Tong Fin
>
> UIMA-1130 improved UIMA-AS by allowing users to set the number of concurrent 
> listeners of a reply queue for each "remote" delegate. The following is an 
> example of the syntax in the XML deployment descriptor:
>       <analysisEngine async="true">
>         <delegates>
>           <remoteAnalysisEngine key="RoomNumber">
>             <inputQueue brokerURL="tcp://localhost:61616" endpoint="RoomNumberAnnotatorQueue"/>
>             <replyQueue concurrentConsumers="2" location="remote"/>
>             ...
>           </remoteAnalysisEngine>
>         </delegates>
>         ...
>       </analysisEngine>
> This JIRA does a similar thing by allowing users to set the number of 
> concurrent listeners of a reply queue for "co-located" delegates inside the 
> UIMA-AS aggregate. 
> The following is the "proposed" syntax:
>       <analysisEngine async="true"> <!-- Top aggregate -->
>         <replyQueue concurrentConsumers="2"/>
>         ...
>         <delegates>
>           <analysisEngine key="NamesAndPersonTitlesTAE" async="true"> <!-- co-located aggregate -->
>             <replyQueue concurrentConsumers="3"/>
>             ...
>           </analysisEngine>
>           ...
>         </delegates>
>         ...
>       </analysisEngine>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.