[jira] [Comment Edited] (HBASE-20710) extra cloneFamily() in Mutation.add(Cell)

2018-06-14 Thread huaxiang sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513182#comment-16513182
 ] 

huaxiang sun edited comment on HBASE-20710 at 6/15/18 12:45 AM:


Yeah, for the table.put path, we can save copyFamily as it has been already 
copied out. For table.batch case (cellblock), this copy is unavoidable. I have 
attached the allocation profiles for the put case (netty rpc server), with the 
patch, the cloneFamily() disappears in the allocation profile. It was run with 
PE randomWrite. Minor bonus, with the same family buffer, key compare for 
search in TreeMap is a shortcircuit, it does not compare byte to byte.


was (Author: huaxiang):
Yeah, for the table.put path, we can save copyFamily as it has been already 
copied out. For table.batch case (cellblock), this copy is unavoidable. I have 
attached the allocation profiles for the put case (netty rpc server), with the 
patch, the cloneFamily() disappears in the allocation profile. It was run with 
PE randomWrite.

> extra cloneFamily() in Mutation.add(Cell)
> -
>
> Key: HBASE-20710
> URL: https://issues.apache.org/jira/browse/HBASE-20710
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 2.0.1
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
> Fix For: 2.0.1
>
> Attachments: HBASE-20710-master-v001.patch, 
> HBASE-20710-master-v002.patch, alloc-put-fix.svg, alloc-put-orig.svg
>
>
> The cpu profiling shows that during PE randomWrite testing, about 1 percent 
> of time is spent in cloneFamily. Reviewing code found that when a cell is DBB 
> backed ByteBuffKeyValueCell (which is default with Netty Rpc), 
> cell.getFamilyArray() will call cloneFamily() and there is again a 
> cloneFamily() in the following line of the code. since this is the critical 
> write path processing, this needs to be optimized.
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java#L791
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java#L795



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20710) extra cloneFamily() in Mutation.add(Cell)

2018-06-09 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507030#comment-16507030
 ] 

stack edited comment on HBASE-20710 at 6/9/18 3:12 PM:
---

+1

I had this as a 'fix' in branch:

diff --git 
a/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java 
b/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
index a6ddc14cca..1b750b950e 100644
--- a/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
+++ b/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
@@ -788,11 +788,10 @@ public abstract class Mutation extends 
OperationWithAttributes implements Row, C
 " doesn't match the original one " +  Bytes.toStringBinary(this.row));
 }

-if (cell.getFamilyArray() == null || cell.getFamilyLength() == 0) {
+byte [] family = CellUtil.cloneFamily(cell);
+if (family == null || family.length == 0) {
   throw new IllegalArgumentException("Family cannot be null");
 }
-
-byte[] family = CellUtil.cloneFamily(cell);
 if (cell instanceof ExtendedCell) {
   getCellList(family).add(cell);
 } else {


The offence is the extra copy and the deserializations to find lengths to use 
locating the family bytes to copy.

Good one [~huaxiang]. Go for it.




was (Author: stack):
+1

I had this as a 'fix' in branch:

diff --git 
a/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java 
b/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
index a6ddc14cca..1b750b950e 100644
--- a/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
+++ b/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
@@ -788,11 +788,10 @@ public abstract class Mutation extends 
OperationWithAttributes implements Row, C
 " doesn't match the original one " +  Bytes.toStringBinary(this.row));
 }

-if (cell.getFamilyArray() == null || cell.getFamilyLength() == 0) {
+byte [] family = CellUtil.cloneFamily(cell);
+if (family == null || family.length == 0) {
   throw new IllegalArgumentException("Family cannot be null");
 }
-
-byte[] family = CellUtil.cloneFamily(cell);
 if (cell instanceof ExtendedCell) {
   getCellList(family).add(cell);
 } else {


The offence is the extra copy and the deserializations to find lengths to use 
locating the family bytes to copy.

Good one [~huaxiang].



> extra cloneFamily() in Mutation.add(Cell)
> -
>
> Key: HBASE-20710
> URL: https://issues.apache.org/jira/browse/HBASE-20710
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 2.0.1
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
> Fix For: 2.0.1
>
>
> The cpu profiling shows that during PE randomWrite testing, about 1 percent 
> of time is spent in cloneFamily. Reviewing code found that when a cell is DBB 
> backed ByteBuffKeyValueCell (which is default with Netty Rpc), 
> cell.getFamilyArray() will call cloneFamily() and there is again a 
> cloneFamily() in the following line of the code. since this is the critical 
> write path processing, this needs to be optimized.
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java#L791
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java#L795



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)