[ 
https://issues.apache.org/jira/browse/HBASE-10169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849883#comment-13849883
 ] 

Gary Helmling commented on HBASE-10169:
---------------------------------------

Sure, the way I see #1 is simply batching the RPCs performed for 
HRegion.execService() invocations for regions on the same regionserver.  In the 
same way that a multi-get will do a single RPC to return results from Get 
requests across multiple regions on a single regionserver, a batched 
coprocessor service request for all the relevant regions on a regionserver 
could return a single response containing multiple CoprocessorServiceResponse 
objects.

The high-level execution would look like:
# on the client (in HTable.coprocessorService()) group regions involved in a 
coprocessorService() request by regionserver
# create a request object per-regionserver containing multiple 
CoprocessorServiceRequest instances (one per region)
# the regionserver would execute the individual requests against each region, 
calling HRegion.execService()
# before returning, the regionserver aggregates the individual responses into a 
single response object containing multiple CoprocessorServiceResponse instances 
(again one per region)
# the client, on receiving the response, invokes Batch.Callback.update() with 
the contents of each CoprocessorServiceResponse

HRegionServer.multi() provides a good model for this, I think.
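
As a rough illustration of the five steps above, here is a minimal, self-contained Java sketch. It does not use the HBase API: RegionLocation, RegionRequest, RegionResponse, Callback, and every method name are hypothetical stand-ins for the real location lookup, CoprocessorServiceRequest, CoprocessorServiceResponse, and Batch.Callback.update(), and the "RPC" is simulated by a local method call.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types are part of HBase.
class BatchedCoprocessorSketch {

    /** Stand-in for a region and the regionserver currently hosting it. */
    static final class RegionLocation {
        final String regionName;
        final String serverName;
        RegionLocation(String regionName, String serverName) {
            this.regionName = regionName;
            this.serverName = serverName;
        }
    }

    /** Stand-in for a per-region CoprocessorServiceRequest. */
    static final class RegionRequest {
        final String regionName;
        final String payload;
        RegionRequest(String regionName, String payload) {
            this.regionName = regionName;
            this.payload = payload;
        }
    }

    /** Stand-in for a per-region CoprocessorServiceResponse. */
    static final class RegionResponse {
        final String regionName;
        final String result;
        RegionResponse(String regionName, String result) {
            this.regionName = regionName;
            this.result = result;
        }
    }

    /** Stand-in for Batch.Callback.update(). */
    interface Callback {
        void update(String regionName, String result);
    }

    /** Steps 1-2 (client): group regions by regionserver, one request list per server. */
    static Map<String, List<RegionRequest>> groupByServer(List<RegionLocation> locations,
                                                          String payload) {
        Map<String, List<RegionRequest>> batches = new LinkedHashMap<>();
        for (RegionLocation loc : locations) {
            batches.computeIfAbsent(loc.serverName, s -> new ArrayList<>())
                   .add(new RegionRequest(loc.regionName, payload));
        }
        return batches;
    }

    /** Steps 3-4 (server): run each per-region request, collect into one response list. */
    static List<RegionResponse> executeBatch(List<RegionRequest> batch) {
        List<RegionResponse> responses = new ArrayList<>();
        for (RegionRequest req : batch) {
            // Placeholder for the real per-region call to HRegion.execService().
            responses.add(new RegionResponse(req.regionName, req.payload + "@" + req.regionName));
        }
        return responses;
    }

    /** Step 5 (client): one simulated RPC per server, one callback per region. Returns the RPC count. */
    static int invoke(List<RegionLocation> locations, String payload, Callback callback) {
        int rpcCount = 0;
        for (Map.Entry<String, List<RegionRequest>> e : groupByServer(locations, payload).entrySet()) {
            rpcCount++; // a single batched call to this regionserver
            for (RegionResponse resp : executeBatch(e.getValue())) {
                callback.update(resp.regionName, resp.result);
            }
        }
        return rpcCount;
    }
}
```

With three regions spread over two regionservers, invoke() in this sketch makes two simulated calls instead of three, while the callback still fires once per region, so the caller sees the same per-region results as before.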

This could all happen transparently, using the existing 
HTable.coprocessorService() client interface, and would cut the number of RPCs 
per invocation from one per region to one per regionserver.

Regarding #2, providing user-defined aggregations on the server-side (or 
"combiners" as described in HBASE-5762) could provide further efficiency 
improvements by limiting response bandwidth for some use-cases, but I think it 
deserves to be looked at on its own, given that it would create an entirely 
new user-facing API.

> Batch coprocessor
> -----------------
>
>                 Key: HBASE-10169
>                 URL: https://issues.apache.org/jira/browse/HBASE-10169
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Coprocessors
>    Affects Versions: 0.99.0
>            Reporter: Jingcheng Du
>            Assignee: Jingcheng Du
>         Attachments: Batch Coprocessor Design Document.docx, HBASE-10169.patch
>
>
> This is designed to improve coprocessor invocation on the client side. 
> Currently, a coprocessor invocation sends one call to each region. If 
> there’s one region server hosting 100 regions, each 
> coprocessor invocation sends 100 calls, and each call uses a single thread on 
> the client side. The client quickly runs out of threads when coprocessor 
> invocations are heavy. 
> In this design, all the calls to the same region server are grouped into 
> one call per coprocessor invocation. On the server side, this call is fanned 
> out to each region, and the results are merged on the server 
> before being returned to the client.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
