[jira] [Comment Edited] (HBASE-20710) extra cloneFamily() in Mutation.add(Cell)
[ https://issues.apache.org/jira/browse/HBASE-20710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16513182#comment-16513182 ] huaxiang sun edited comment on HBASE-20710 at 6/15/18 12:45 AM: Yeah, for the table.put path, we can save copyFamily as it has been already copied out. For table.batch case (cellblock), this copy is unavoidable. I have attached the allocation profiles for the put case (netty rpc server), with the patch, the cloneFamily() disappears in the allocation profile. It was run with PE randomWrite. Minor bonus, with the same family buffer, key compare for search in TreeMap is a shortcircuit, it does not compare byte to byte. was (Author: huaxiang): Yeah, for the table.put path, we can save copyFamily as it has been already copied out. For table.batch case (cellblock), this copy is unavoidable. I have attached the allocation profiles for the put case (netty rpc server), with the patch, the cloneFamily() disappears in the allocation profile. It was run with PE randomWrite. > extra cloneFamily() in Mutation.add(Cell) > - > > Key: HBASE-20710 > URL: https://issues.apache.org/jira/browse/HBASE-20710 > Project: HBase > Issue Type: Sub-task > Components: regionserver >Affects Versions: 2.0.1 >Reporter: huaxiang sun >Assignee: huaxiang sun >Priority: Minor > Fix For: 2.0.1 > > Attachments: HBASE-20710-master-v001.patch, > HBASE-20710-master-v002.patch, alloc-put-fix.svg, alloc-put-orig.svg > > > The cpu profiling shows that during PE randomWrite testing, about 1 percent > of time is spent in cloneFamily. Reviewing code found that when a cell is DBB > backed ByteBuffKeyValueCell (which is default with Netty Rpc), > cell.getFamilyArray() will call cloneFamily() and there is again a > cloneFamily() in the following line of the code. since this is the critical > write path processing, this needs to be optimized. > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java#L791 > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java#L795 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-20710) extra cloneFamily() in Mutation.add(Cell)
[ https://issues.apache.org/jira/browse/HBASE-20710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507030#comment-16507030 ] stack edited comment on HBASE-20710 at 6/9/18 3:12 PM: --- +1 I had this as a 'fix' in branch: diff --git a/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java b/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java index a6ddc14cca..1b750b950e 100644 --- a/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java +++ b/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java @@ -788,11 +788,10 @@ public abstract class Mutation extends OperationWithAttributes implements Row, C " doesn't match the original one " + Bytes.toStringBinary(this.row)); } -if (cell.getFamilyArray() == null || cell.getFamilyLength() == 0) { +byte [] family = CellUtil.cloneFamily(cell); +if (family == null || family.length == 0) { throw new IllegalArgumentException("Family cannot be null"); } - -byte[] family = CellUtil.cloneFamily(cell); if (cell instanceof ExtendedCell) { getCellList(family).add(cell); } else { The offence is the extra copy and the deserializations to find lengths to use locating the family bytes to copy. Good one [~huaxiang]. Go for it. was (Author: stack): +1 I had this as a 'fix' in branch: diff --git a/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java b/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java index a6ddc14cca..1b750b950e 100644 --- a/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java +++ b/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java @@ -788,11 +788,10 @@ public abstract class Mutation extends OperationWithAttributes implements Row, C " doesn't match the original one " + Bytes.toStringBinary(this.row)); } -if (cell.getFamilyArray() == null || cell.getFamilyLength() == 0) { +byte [] family = CellUtil.cloneFamily(cell); +if (family == null || family.length == 0) { throw new IllegalArgumentException("Family cannot be null"); } - -byte[] family = CellUtil.cloneFamily(cell); if (cell instanceof ExtendedCell) { getCellList(family).add(cell); } else { The offence is the extra copy and the deserializations to find lengths to use locating the family bytes to copy. Good one [~huaxiang]. > extra cloneFamily() in Mutation.add(Cell) > - > > Key: HBASE-20710 > URL: https://issues.apache.org/jira/browse/HBASE-20710 > Project: HBase > Issue Type: Sub-task > Components: regionserver >Affects Versions: 2.0.1 >Reporter: huaxiang sun >Assignee: huaxiang sun >Priority: Minor > Fix For: 2.0.1 > > > The cpu profiling shows that during PE randomWrite testing, about 1 percent > of time is spent in cloneFamily. Reviewing code found that when a cell is DBB > backed ByteBuffKeyValueCell (which is default with Netty Rpc), > cell.getFamilyArray() will call cloneFamily() and there is again a > cloneFamily() in the following line of the code. since this is the critical > write path processing, this needs to be optimized. > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java#L791 > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java#L795 -- This message was sent by Atlassian JIRA (v7.6.3#76005)