[ 
https://issues.apache.org/jira/browse/HBASE-10305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866400#comment-13866400
 ] 

Chao Shi commented on HBASE-10305:
----------------------------------

Hi Anoop,

Thanks for your quick response and clarification.

I think the root cause is the tight coupling between the log sync and the
per-region operation. Is it possible to apply the updates for each region in
memory and then sync the HLog once? Looking into the code, I found this is
difficult: the HLog has to be synced before MVCC is updated, and MVCC is
maintained on a per-region basis.
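
To make the sync count concrete, here is a toy Java model of the two
strategies. It is only a sketch: none of these names are HBase classes or
methods (SyncCountSketch, Mutation, syncsPerRegion and syncsOncePerBatch are
all made up for illustration), and it deliberately ignores the MVCC ordering
constraint mentioned above, which is the hard part.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: nothing here is part of the HBase API.
class SyncCountSketch {

    // A mutation tagged with the (hypothetical) region that owns its row.
    static final class Mutation {
        final String region;
        final String row;

        Mutation(String region, String row) {
            this.region = region;
            this.row = row;
        }
    }

    // Group one client batch by target region, preserving arrival order.
    static Map<String, List<Mutation>> groupByRegion(List<Mutation> batch) {
        Map<String, List<Mutation>> byRegion = new LinkedHashMap<>();
        for (Mutation m : batch) {
            byRegion.computeIfAbsent(m.region, k -> new ArrayList<>()).add(m);
        }
        return byRegion;
    }

    // Behaviour described in the issue: one log sync per region touched.
    static int syncsPerRegion(List<Mutation> batch) {
        return groupByRegion(batch).size();
    }

    // Idea from this comment: append edits for all regions, then sync once.
    static int syncsOncePerBatch(List<Mutation> batch) {
        return batch.isEmpty() ? 0 : 1;
    }

    public static void main(String[] args) {
        // One batch whose 10 mutations land on 10 different regions.
        List<Mutation> batch = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            batch.add(new Mutation("region-" + i, "row-" + i));
        }
        System.out.println("per-region syncs:  " + syncsPerRegion(batch));    // 10
        System.out.println("single-sync batch: " + syncsOncePerBatch(batch)); // 1
    }
}
```

With one region the two strategies cost the same (one sync each); the gap
grows linearly with the number of regions a batch touches.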

> Batch update performance drops as the number of regions grows
> -------------------------------------------------------------
>
>                 Key: HBASE-10305
>                 URL: https://issues.apache.org/jira/browse/HBASE-10305
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance
>            Reporter: Chao Shi
>
> In our use case, we use a small number (~5) of proxy programs that read from
> a queue and send batch updates to HBase. Our program is multi-threaded and
> the HBase client batches mutations per RS.
> We found that we get lower TPS when there are more regions. I think the
> reason is that the RS syncs the HLog once per region. Suppose there is a
> single region: the batch update touches only that region and therefore syncs
> the HLog once. Now suppose there are 10 regions per server: in RS#multi() it
> has to process the update for each individual region and sync the HLog 10
> times.
> Please note that in our scenario, batched mutations are usually independent
> of each other and touch a varying number of regions.
> We are using the 0.94 series, but after a quick look into the code I think
> trunk has the same problem.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
