[
https://issues.apache.org/jira/browse/HBASE-20710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509937#comment-16509937
]
huaxiang sun commented on HBASE-20710:
--------------------------------------
Thanks [~mdrob] for the review; I will address the comments. The main idea is
as follows:
Cellblock:
[family1:qualifier1, v1], [family1:qualifier2, v2], [family1:qualifier3, v3] ....
         cell1                     cell2                     cell3
The byte array for the first family, "family1", is added to the map (call it
familyAdded). For cell2, the family is read into familyFromCell (allocated
once). Once it is found to be the same family, familyAdded is used to put
cell2 into the TreeMap (very fast). For cell3, the family is read into
familyFromCell again (already allocated, so no new allocation is needed),
compared with familyAdded, and familyAdded is reused for the put into the
TreeMap. For cell4 and on, there is no new allocation for the family, and
familyFromCell is reused.
With this, there is no need to clone the family for each cell, which saves
heap allocation. Compared with the pre-patch case, the saving is huge, since
the pre-patch code calls cloneFamily() twice for each cell (cellblock case).
Something similar applies to the normal put case.
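
To make the idea concrete, here is a rough, self-contained sketch. It is not
the patch itself and does not use any HBase classes: BufferCell, group(), and
the allocations counter are hypothetical names made up for illustration, and
only familyAdded and familyFromCell correspond to the names used above. It
assumes Java 9+ for Arrays.compareUnsigned.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class FamilyReuseSketch {
    /** Hypothetical stand-in for a cell whose family bytes live in a ByteBuffer. */
    static final class BufferCell {
        final ByteBuffer buf;
        final int familyOffset;
        final int familyLength;
        final String value;

        BufferCell(ByteBuffer buf, int familyOffset, int familyLength, String value) {
            this.buf = buf;
            this.familyOffset = familyOffset;
            this.familyLength = familyLength;
            this.value = value;
        }
    }

    /** Counts byte[] allocations made for family names (illustration only). */
    static int allocations;

    /** Builds a cell backed by a direct buffer that holds only the family bytes. */
    static BufferCell cell(String family, String value) {
        byte[] bytes = family.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocateDirect(bytes.length);
        buf.put(bytes);
        return new BufferCell(buf, 0, bytes.length, value);
    }

    /** Groups cells by family without cloning the family for every cell. */
    static Map<byte[], List<BufferCell>> group(List<BufferCell> cells) {
        allocations = 0;
        Comparator<byte[]> unsignedLex = Arrays::compareUnsigned;
        Map<byte[], List<BufferCell>> familyMap = new TreeMap<>(unsignedLex);
        byte[] familyAdded = null;    // array most recently used as a map key
        byte[] familyFromCell = null; // scratch array, reused across cells

        for (BufferCell c : cells) {
            // Allocate the scratch array only when there is none of the right size.
            if (familyFromCell == null || familyFromCell.length != c.familyLength) {
                familyFromCell = new byte[c.familyLength];
                allocations++;
            }
            // Absolute reads: copy the family bytes out of the buffer.
            for (int i = 0; i < c.familyLength; i++) {
                familyFromCell[i] = c.buf.get(c.familyOffset + i);
            }
            if (familyAdded == null || !Arrays.equals(familyAdded, familyFromCell)) {
                // Family changed: hand the scratch array over as the key, so the
                // next cell has to allocate a fresh scratch array.
                familyAdded = familyFromCell;
                familyFromCell = null;
            }
            // Same family as the previous cell: the existing key array is reused.
            familyMap.computeIfAbsent(familyAdded, k -> new ArrayList<>()).add(c);
        }
        return familyMap;
    }

    public static void main(String[] args) {
        List<BufferCell> cells = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            cells.add(cell("family1", "v" + i));
        }
        Map<byte[], List<BufferCell>> grouped = group(cells);
        System.out.println("families=" + grouped.size() + " allocations=" + allocations);
    }
}
```

In this sketch a run of cells with the same family costs two array
allocations in total (one for the map key, one for the scratch array),
however many cells there are, instead of one or two per cell.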
> extra cloneFamily() in Mutation.add(Cell)
> -----------------------------------------
>
> Key: HBASE-20710
> URL: https://issues.apache.org/jira/browse/HBASE-20710
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver
> Affects Versions: 2.0.1
> Reporter: huaxiang sun
> Assignee: huaxiang sun
> Priority: Minor
> Fix For: 2.0.1
>
> Attachments: HBASE-20710-master-v001.patch
>
>
> CPU profiling shows that during PE randomWrite testing, about 1 percent
> of time is spent in cloneFamily(). Reviewing the code shows that when a cell
> is a DBB-backed ByteBuffKeyValueCell (which is the default with Netty RPC),
> cell.getFamilyArray() calls cloneFamily(), and there is another
> cloneFamily() call on the following line of the code. Since this is on the
> critical write path, it needs to be optimized.
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java#L791
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java#L795
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)