[ https://issues.apache.org/jira/browse/HBASE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leitao Guo updated HBASE-5982:
------------------------------
    Summary: HBase Coprocessor Local Aggregation  (was: HBase Coprocessor Locate)

> HBase Coprocessor Local Aggregation
> -----------------------------------
>
>                 Key: HBASE-5982
>                 URL: https://issues.apache.org/jira/browse/HBASE-5982
>             Project: HBase
>          Issue Type: Improvement
>          Components: Coprocessors
>    Affects Versions: 0.92.1
>         Environment: cloudera-cdh3u3,hbase-0.92.1
>            Reporter: dengpeng
>            Assignee: dengpeng
>              Labels: Coprocessor
>             Fix For: 0.92.1
>
>   Original Estimate: 0.05h
>  Remaining Estimate: 0.05h
>
> In our application, we need to run SQL-like queries such as the one below on
> HBase. Each region performs a complex computation, and in the current
> region-based endpoint framework every region sends its own 'top #' result
> back to the coprocessor client.
> Take the SQL below as an example. Suppose there are 100 regions on each RS
> and 100 RSs in the cluster. The client will receive 100*100*1M = 10G records
> from all the regions and then select the top 1M records from those 10G
> records. The client needs a lot of RAM to handle this data, and the cluster
> network may become the bottleneck.
> With an RS-based endpoint, each RS would first merge the results from its own
> regions, so the client would receive only 100*1M = 0.1G records. The burden
> on the client and the network would be dramatically reduced.
> example:
> select top 1000000 count(1) as A , sum(intRxlevDL)/count(intRxlevDL) as B ,
> intBscPc as bscPc , intLac as LAC , intCI as CI from ftbMrMsg t1 where (
> t1.dtTime >= '2012-03-02 04:00:00.000' and t1.dtTime < '2012-03-02
> 05:00:00.000' ) group by bscPc , LAC , CI having B >= 0.2 order by bscPc ASC ,
> LAC ASC , CI ASC
> So far, the network is the bottleneck in our application when we use a
> coprocessor to handle the above SQL. I think the RS-based endpoint is worth
> doing, especially for the 'top #' case. What's your opinion about this? I
> think we can open a jira.
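
The record counts in the description can be checked with a small simulation. The sketch below is only an illustration in plain Python, not HBase coprocessor code: all function names are hypothetical, rows are modelled as (bscPc, LAC, CI) tuples sorted ascending, and it assumes every group's rows live in a single region so that a per-region 'top #' is already a valid partial result.

```python
import heapq
import random
from itertools import islice


def top_n(rows, n):
    """Return the n smallest rows in ascending order (ORDER BY ... ASC)."""
    return heapq.nsmallest(n, rows)


def merge_top_n(sorted_partials, n):
    """Merge already-sorted partial results and keep the first n rows."""
    return list(islice(heapq.merge(*sorted_partials), n))


def records_sent_to_client(senders, n):
    """Upper bound on records the client receives: one top-n per sender."""
    return senders * n


def region_level(cluster, n):
    """Current model: every region ships its own top-n to the client."""
    partials = [top_n(region, n) for server in cluster for region in server]
    shipped = sum(len(p) for p in partials)
    return merge_top_n(partials, n), shipped


def server_level(cluster, n):
    """Proposed model: each RS merges its regions first, then ships one top-n."""
    partials = []
    for server in cluster:
        region_tops = [top_n(region, n) for region in server]
        partials.append(merge_top_n(region_tops, n))
    shipped = sum(len(p) for p in partials)
    return merge_top_n(partials, n), shipped


def build_cluster(servers, regions_per_server, rows_per_region, seed=0):
    """Build a toy cluster: a list of servers, each a list of regions of rows."""
    rng = random.Random(seed)
    return [
        [
            [(rng.randint(0, 9), rng.randint(0, 99), rng.randint(0, 999))
             for _ in range(rows_per_region)]
            for _ in range(regions_per_server)
        ]
        for _ in range(servers)
    ]


if __name__ == "__main__":
    cluster = build_cluster(servers=3, regions_per_server=4, rows_per_region=20)
    result_a, shipped_a = region_level(cluster, 5)
    result_b, shipped_b = server_level(cluster, 5)
    print(result_a == result_b, shipped_a, shipped_b)
    # Figures from the description: 100 RSs, 100 regions each, top 1M.
    print(records_sent_to_client(100 * 100, 10**6))  # region-based endpoint
    print(records_sent_to_client(100, 10**6))        # RS-based endpoint
```

Both models return the same final rows; only the number of records crossing the network to the client differs, which is the point of the proposal.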