[ https://issues.apache.org/jira/browse/HBASE-10169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849883#comment-13849883 ]
Gary Helmling commented on HBASE-10169:
---------------------------------------
Sure, the way I see #1 is simply batching the RPCs performed for
HRegion.execService() invocations for regions on the same regionserver. In the
same way that a multi-get will do a single RPC to return results from Get
requests across multiple regions on a single regionserver, a batched
coprocessor service request for all the relevant regions on a regionserver
could return a single response containing multiple CoprocessorServiceResponse
objects.
The high-level execution would look like:
# on the client (in HTable.coprocessorService()) group regions involved in a
coprocessorService() request by regionserver
# create a request object per-regionserver containing multiple
CoprocessorServiceRequest instances (one per region)
# the regionserver would execute the individual requests against each region,
calling HRegion.execService()
# before returning, the regionserver aggregates the individual responses into a
single response object containing multiple CoprocessorServiceResponse instances
(again one per region)
# the client, on receiving the response, invokes Batch.Callback.update() with
the contents of each CoprocessorServiceResponse
HRegionServer.multi() provides a good model for this, I think.
This could all happen transparently, using the existing
HTable.coprocessorService() client interface and would be a massive improvement
in RPC efficiency.
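To make the steps above concrete, here is a minimal, self-contained sketch of the flow in plain Java. It is only an illustration under stated assumptions, not the patch: RegionRequest, RegionResponse, groupByServer(), executeBatch() and the region-to-server map are hypothetical stand-ins for CoprocessorServiceRequest, CoprocessorServiceResponse and the real region location lookup, and a BiConsumer stands in for Batch.Callback.update(). The RPC itself is simulated by a local method call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical sketch: none of these types are real HBase classes.
class BatchCoprocessorSketch {

    /** Stand-in for a per-region CoprocessorServiceRequest. */
    static final class RegionRequest {
        final String regionName;
        final String payload;
        RegionRequest(String regionName, String payload) {
            this.regionName = regionName;
            this.payload = payload;
        }
    }

    /** Stand-in for a per-region CoprocessorServiceResponse. */
    static final class RegionResponse {
        final String regionName;
        final String value;
        RegionResponse(String regionName, String value) {
            this.regionName = regionName;
            this.value = value;
        }
    }

    /** Steps 1-2 (client): group per-region requests by hosting regionserver. */
    static Map<String, List<RegionRequest>> groupByServer(
            List<RegionRequest> requests, Map<String, String> regionToServer) {
        Map<String, List<RegionRequest>> batches = new LinkedHashMap<>();
        for (RegionRequest request : requests) {
            String server = regionToServer.get(request.regionName);
            if (server == null) {
                throw new IllegalArgumentException("No location for " + request.regionName);
            }
            batches.computeIfAbsent(server, s -> new ArrayList<>()).add(request);
        }
        return batches;
    }

    /** Steps 3-4 (server): run each request against its region, return one combined response. */
    static List<RegionResponse> executeBatch(
            List<RegionRequest> batch, Function<RegionRequest, String> execService) {
        List<RegionResponse> responses = new ArrayList<>();
        for (RegionRequest request : batch) {
            responses.add(new RegionResponse(request.regionName, execService.apply(request)));
        }
        return responses;
    }

    /** Whole flow; returns the number of simulated RPCs (one per regionserver). */
    static int coprocessorService(
            List<RegionRequest> requests,
            Map<String, String> regionToServer,
            Function<RegionRequest, String> execService,
            BiConsumer<String, String> callback) {
        int rpcs = 0;
        for (List<RegionRequest> batch : groupByServer(requests, regionToServer).values()) {
            rpcs++; // a single round trip carries every region on this server
            // Step 5 (client): hand each per-region result to the callback.
            for (RegionResponse response : executeBatch(batch, execService)) {
                callback.accept(response.regionName, response.value);
            }
        }
        return rpcs;
    }

    public static void main(String[] args) {
        Map<String, String> regionToServer = new LinkedHashMap<>();
        regionToServer.put("region-1", "rs-a");
        regionToServer.put("region-2", "rs-a");
        regionToServer.put("region-3", "rs-b");
        List<RegionRequest> requests = Arrays.asList(
                new RegionRequest("region-1", "x"),
                new RegionRequest("region-2", "y"),
                new RegionRequest("region-3", "z"));
        int rpcs = coprocessorService(requests, regionToServer,
                request -> request.payload.toUpperCase(),
                (region, value) -> System.out.println(region + " -> " + value));
        System.out.println(rpcs + " RPCs for " + requests.size() + " regions");
    }
}
```

The point of the sketch is that the callback still sees one result per region, so the caller's view is unchanged, while the number of round trips drops from one per region to one per regionserver.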
Regarding #2, providing user-defined aggregations on the server-side (or
"combiners" as described in HBASE-5762) could provide further efficiency
improvements in limiting response bandwidth for some use-cases, but I think it
deserves to be looked at on its own, given that it would create an entirely
new user-facing API.
> Batch coprocessor
> -----------------
>
> Key: HBASE-10169
> URL: https://issues.apache.org/jira/browse/HBASE-10169
> Project: HBase
> Issue Type: Sub-task
> Components: Coprocessors
> Affects Versions: 0.99.0
> Reporter: Jingcheng Du
> Assignee: Jingcheng Du
> Attachments: Batch Coprocessor Design Document.docx, HBASE-10169.patch
>
>
> This is designed to improve coprocessor invocation on the client side.
> Currently, a coprocessor invocation sends one call to each region. If
> there’s one region server hosting 100 regions, each coprocessor invocation
> sends 100 calls, and each call uses a single thread on the client side.
> The threads will soon run out when coprocessor invocations are heavy.
> In this design, all the calls to the same region server within a single
> coprocessor invocation are grouped into one call. That call is spread
> across the regions on the server side, and the results are merged on the
> server side before being returned to the client.