Hi,

We upgraded our cluster from CDH 5.3.1 (HBase 0.98.6) to CDH 5.4.5 (HBase 1.0.0),
and we are now experiencing a slowdown in increment operations.

Here's an extract from a thread dump of a RegionServer in our cluster:

Thread 68 (RW.default.writeRpcServer.handler=15,queue=5,port=60020):
  State: BLOCKED
  Blocked count: 21689888
  Waited count: 39828360
  Blocked on java.util.LinkedList@3474e4b2
  Blocked by 63 (RW.default.writeRpcServer.handler=10,queue=0,port=60020)
  Stack:
    java.lang.Object.wait(Native Method)
    org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.waitForPreviousTransactionsComplete(MultiVersionConsistencyControl.java:224)
    org.apache.hadoop.hbase.regionserver.MultiVersionConsistencyControl.waitForPreviousTransactionsComplete(MultiVersionConsistencyControl.java:203)
    org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:6712)
    org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:501)
    org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:570)
    org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:1901)
    org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31451)
    org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)
    org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
    org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
    org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
    java.lang.Thread.run(Thread.java:745)

There are many similar threads in the thread dump.

I read the source code, and I think this is caused by the changes to
MultiVersionConsistencyControl.
waitForPreviousTransactionsComplete() appears to wait at the region level
(not the row level), so increments to different rows in the same region
seem to block one another.
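
To illustrate the kind of region-wide wait suspected here, below is a toy
model in plain Java. It is not the HBase source: the class ToyMvcc and its
methods begin(), complete() and waitForPrevious() are hypothetical names
invented for this sketch. It assumes a single write queue per region, where
the read point can only advance past entries at the head of the queue, so a
writer waiting for "all previous transactions" is held up by any earlier
in-flight write, whatever row it touches.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a region-wide write queue. NOT the HBase implementation. */
final class ToyMvcc {
    /** One in-flight write, identified by a region-wide sequence number. */
    static final class WriteEntry {
        final long writeNumber;
        boolean completed;
        WriteEntry(long writeNumber) { this.writeNumber = writeNumber; }
    }

    private final Deque<WriteEntry> writeQueue = new ArrayDeque<>();
    private long nextWriteNumber = 1;
    private long readPoint = 0;

    /** Registers a new in-flight write for the whole region. */
    synchronized WriteEntry begin() {
        WriteEntry e = new WriteEntry(nextWriteNumber++);
        writeQueue.addLast(e);
        return e;
    }

    /** Marks e done, then advances the read point over completed head entries. */
    synchronized void complete(WriteEntry e) {
        e.completed = true;
        while (!writeQueue.isEmpty() && writeQueue.peekFirst().completed) {
            readPoint = writeQueue.removeFirst().writeNumber;
        }
        notifyAll();
    }

    /** Blocks until every write that began before e has completed, on any row. */
    synchronized void waitForPrevious(WriteEntry e) {
        try {
            while (readPoint < e.writeNumber - 1) {
                wait();
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", ex);
        }
    }

    synchronized long readPoint() {
        return readPoint;
    }
}
```

In this model, a handler that calls waitForPrevious() on every increment
serializes behind the slowest earlier write in the region, which would be
consistent with many handlers blocked on the same queue object.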


We also wrote a performance test for the increment operation that uses
100 threads, and ran it in local mode.

The results are shown below:

CDH 5.3.1 (HBase 0.98.6)
Throughput (op/s): 12757, Latency (ms): 7.975072509210629

CDH 5.4.5 (HBase 1.0.0)
Throughput (op/s): 2027, Latency (ms): 49.11840157868772
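
For anyone who wants to reproduce numbers of this shape, a load test of the
kind described above can be sketched roughly as follows. This is only a
minimal sketch under assumptions: the class IncrementLoadTest and its run()
method are hypothetical, and the operation under test is passed in as a
Runnable so the harness runs on its own without an HBase dependency.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical multi-threaded load harness; the operation is pluggable. */
final class IncrementLoadTest {
    /** Total operations, overall throughput, and mean per-operation latency. */
    static final class Result {
        final long ops;
        final double opsPerSec;
        final double meanLatencyMs;
        Result(long ops, double opsPerSec, double meanLatencyMs) {
            this.ops = ops;
            this.opsPerSec = opsPerSec;
            this.meanLatencyMs = meanLatencyMs;
        }
    }

    /** Runs op opsPerThread times on each of the given number of threads. */
    static Result run(int threads, int opsPerThread, Runnable op) {
        CountDownLatch start = new CountDownLatch(1);
        LongAdder totalLatencyNanos = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all workers at the same time
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int j = 0; j < opsPerThread; j++) {
                    long t0 = System.nanoTime();
                    op.run();
                    totalLatencyNanos.add(System.nanoTime() - t0);
                }
            });
            workers[i].start();
        }
        long begin = System.nanoTime();
        start.countDown();
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        long elapsedNanos = Math.max(1L, System.nanoTime() - begin);
        long ops = (long) threads * opsPerThread;
        double opsPerSec = ops / (elapsedNanos / 1e9);
        double meanLatencyMs = totalLatencyNanos.sum() / 1e6 / ops;
        return new Result(ops, opsPerSec, meanLatencyMs);
    }
}
```

In a real test the Runnable would wrap an HBase client call such as
Table.incrementColumnValue(row, family, qualifier, 1L), with its IOException
handled inside the wrapper. As a sanity check on the figures above: with 100
threads each issuing one request at a time, throughput should be roughly
threads / mean latency, i.e. 100 / 7.975 ms ≈ 12,539 op/s and
100 / 49.118 ms ≈ 2,036 op/s, which is close to the measured values.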


Thanks,

Toshihiro Suzuki
