[ 
https://issues.apache.org/jira/browse/HBASE-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736879#comment-16736879
 ] 

Zheng Hu edited comment on HBASE-21657 at 1/8/19 8:39 AM:
----------------------------------------------------------

I applied the patch [1] and patch.v3 in our cluster to verify what's wrong 
with the stacktrace [2]. I did not see any stacktraces in our cluster, so I 
guess the flamegraph may have messed up the stacktrace. BTW, I found that the 
KeyValueEncoder always calls getSerializedSize without tags, which means it 
costs a lot of CPU to calculate the cell size (though the flamegraph did not 
show this), while tags are off 99% of the time (as [~stack] said in RB), so 
maybe we can also optimize the encoder. 

{code}
org.apache.hadoop.hbase.ByteBufferKeyValue.getSerializedSize(ByteBufferKeyValue.java:294)
org.apache.hadoop.hbase.KeyValueUtil.getSerializedSize(KeyValueUtil.java:753)
org.apache.hadoop.hbase.codec.KeyValueCodec$KeyValueEncoder.write(KeyValueCodec.java:62)
org.apache.hadoop.hbase.ipc.CellBlockBuilder.encodeCellsTo(CellBlockBuilder.java:192)
org.apache.hadoop.hbase.ipc.CellBlockBuilder.buildCellBlockStream(CellBlockBuilder.java:229)
org.apache.hadoop.hbase.ipc.ServerCall.setResponse(ServerCall.java:203)
org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:161)
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
{code}
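
To make the encoder idea concrete, here is a minimal sketch of a "no tags" fast path. It is not the HBase code and not what any of the attached patches do: BufferCell, encode and the hasTags flag are hypothetical names made up for illustration. It only assumes the usual KeyValue layout (4-byte key length, 4-byte value length, key, value, then an optional 2-byte tags length plus tags). When a cell carries no tags, the size without tags equals the total length already known, so the two length reads can be skipped.

```java
import java.nio.ByteBuffer;

public class SerializedSizeSketch {
    // Two 4-byte length prefixes (key length, value length) precede the key bytes.
    static final int LENGTH_PREFIXES = 8;
    // A 2-byte tags length precedes the tag bytes when tags are present.
    static final int TAGS_LENGTH_SIZE = 2;

    /** Hypothetical cell view over a buffer; NOT the real ByteBufferKeyValue. */
    static final class BufferCell {
        final ByteBuffer buf;
        final int offset;
        final int length;      // total serialized length, tags included
        final boolean hasTags; // assumption: decided once, at construction

        BufferCell(ByteBuffer buf, int offset, int length) {
            this.buf = buf;
            this.offset = offset;
            this.length = length;
            this.hasTags = length > sizeWithoutTags();
        }

        // Slow path: reads both length prefixes from the buffer.
        int sizeWithoutTags() {
            int keyLen = buf.getInt(offset);
            int valueLen = buf.getInt(offset + 4);
            return LENGTH_PREFIXES + keyLen + valueLen;
        }

        int serializedSize(boolean withTags) {
            // Fast path: with no tags, both answers are just the total length.
            if (withTags || !hasTags) {
                return length;
            }
            return sizeWithoutTags();
        }
    }

    // Builds a cell in the layout described above (illustrative helper).
    static BufferCell encode(byte[] key, byte[] value, byte[] tags) {
        int len = LENGTH_PREFIXES + key.length + value.length
            + (tags.length > 0 ? TAGS_LENGTH_SIZE + tags.length : 0);
        ByteBuffer buf = ByteBuffer.allocate(len);
        buf.putInt(key.length).putInt(value.length).put(key).put(value);
        if (tags.length > 0) {
            buf.putShort((short) tags.length).put(tags);
        }
        return new BufferCell(buf, 0, len);
    }

    public static void main(String[] args) {
        BufferCell plain = encode(new byte[3], new byte[5], new byte[0]);
        BufferCell tagged = encode(new byte[3], new byte[5], new byte[4]);
        System.out.println(plain.serializedSize(false));  // 16
        System.out.println(tagged.serializedSize(true));  // 22
        System.out.println(tagged.serializedSize(false)); // 16
    }
}
```

The sketch still reads the two lengths once per cell to set hasTags, so whether it helps in practice depends on how often the size is queried per cell and on where the real code could learn cheaply that tags are absent.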

1. 
https://issues.apache.org/jira/secure/attachment/12954128/debug-the-ByteBufferKeyValue.diff
2. 
https://issues.apache.org/jira/browse/HBASE-21657?focusedCommentId=16735710&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16735710
 



> PrivateCellUtil#estimatedSerializedSizeOf has been the bottleneck in 100% 
> scan case.
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-21657
>                 URL: https://issues.apache.org/jira/browse/HBASE-21657
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5
>
>         Attachments: HBASE-21657.v1.patch, HBASE-21657.v2.patch, 
> HBASE-21657.v3.patch, HBASE-21657.v3.patch, 
> HBase1.4.9-ssd-10000000-rows-flamegraph.svg, 
> HBase1.4.9-ssd-10000000-rows-qps-latency.png, 
> HBase2.0.4-patch-v2-ssd-10000000-rows-qps-and-latency.png, 
> HBase2.0.4-patch-v2-ssd-10000000-rows.svg, 
> HBase2.0.4-patch-v3-ssd-10000000-rows-flamegraph.svg, 
> HBase2.0.4-patch-v3-ssd-10000000-rows-qps-and-latency.png, 
> HBase2.0.4-ssd-10000000-rows-flamegraph.svg, 
> HBase2.0.4-ssd-10000000-rows-qps-latency.png, HBase2.0.4-with-patch.v2.png, 
> HBase2.0.4-without-patch-v2.png, debug-the-ByteBufferKeyValue.diff, 
> hbase2.0.4-ssd-scan-traces.2.svg, hbase2.0.4-ssd-scan-traces.svg, 
> hbase20-ssd-100-scan-traces.svg, image-2019-01-07-19-03-37-930.png, 
> image-2019-01-07-19-03-55-577.png, overview-statstics-1.png, run.log
>
>
> We are evaluating the performance of branch-2, and found that the scan 
> throughput on an SSD cluster is almost the same as on an HDD cluster. So I 
> made a FlameGraph on the RS, and found that 
> PrivateCellUtil#estimatedSerializedSizeOf costs about 29% of the CPU. 
> Obviously, it has become the bottleneck in the 100% scan case.
> See [^hbase20-ssd-100-scan-traces.svg].
> BTW, in our XiaoMi branch, we introduced 
> HRegion#updateReadRequestsByCapacityUnitPerSecond to sum up the size of cells 
> (for metric monitoring), so it seems the performance loss was amplified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
