[jira] [Comment Edited] (HBASE-24754) Bulk load performance is degraded in HBase 2

2020-08-07 Thread ramkrishna.s.vasudevan (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172941#comment-17172941
 ] 

ramkrishna.s.vasudevan edited comment on HBASE-24754 at 8/7/20, 7:08 AM:
-

Thanks [~sreenivasulureddy].
Can you take a look at the patch 
https://issues.apache.org/jira/secure/attachment/13009270/Branc2_withComparator_atKeyValue.patch
We changed the comparator to use the KV comparator only, because we know we are 
dealing only with KeyValues. This also avoids the comparator hierarchy that 
branch-2 creates. 
I have also attached flamegraphs for branch-1, branch-2, and branch-2 after 
patching. Branch-2 performance seemed to vary a lot from run to run rather than 
staying constant. Maybe that is due to the hierarchy tree for the 
CellComparator. I also tried keeping the CellComparator but removing all the 
if/else conditions in the comparator code that accommodate different cell 
types, but that did not give consistent performance either. After I changed to 
using the KV comparator directly with the above changes, performance became 
consistent and on par with branch-1.3. 
[~sreenivasulureddy] - can you please check?
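
The difference between a comparator written against a general cell interface and one written against a single concrete type can be sketched as below. This is only an illustration of the shape of the change, not the attached patch: `SketchCell`, `SketchKeyValue` and `ComparatorSketch` are hypothetical stand-ins, not HBase classes, and the sketch does not measure or claim any performance difference.

```java
import java.util.Comparator;

// Hypothetical stand-ins for illustration only; these are NOT the HBase classes.
interface SketchCell {
    byte[] rowArray();
    int rowOffset();
    int rowLength();
}

final class SketchKeyValue implements SketchCell {
    final byte[] buf; // the whole key lives in one backing array

    SketchKeyValue(byte[] buf) { this.buf = buf; }

    public byte[] rowArray() { return buf; }
    public int rowOffset() { return 0; }
    public int rowLength() { return buf.length; }
}

final class ComparatorSketch {
    // Unsigned lexicographic comparison of two byte ranges.
    static int compareBytes(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xff;
            int y = b[bOff + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return aLen - bLen;
    }

    // General path: accepts any SketchCell and goes through the interface
    // accessors on every comparison.
    static final Comparator<SketchCell> GENERIC = (l, r) ->
        compareBytes(l.rowArray(), l.rowOffset(), l.rowLength(),
                     r.rowArray(), r.rowOffset(), r.rowLength());

    // Specialized path: only accepts the concrete type and reads its
    // backing array directly.
    static final Comparator<SketchKeyValue> DIRECT = (l, r) ->
        compareBytes(l.buf, 0, l.buf.length, r.buf, 0, r.buf.length);

    public static void main(String[] args) {
        SketchKeyValue a = new SketchKeyValue(new byte[] {1, 2, 3});
        SketchKeyValue b = new SketchKeyValue(new byte[] {1, 2, (byte) 0x80});
        System.out.println("generic sign: " + Integer.signum(GENERIC.compare(a, b)));
        System.out.println("direct sign:  " + Integer.signum(DIRECT.compare(a, b)));
    }
}
```

Both comparators give the same ordering; the specialized one simply skips the indirection because the caller already knows the concrete type.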



> Bulk load performance is degraded in HBase 2 
> -
>
> Key: HBASE-24754
> URL: https://issues.apache.org/jira/browse/HBASE-24754
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>  Affects Versions: 2.2.3
>  Reporter: Ajeet Rai
>  Assignee: ramkrishna.s.vasudevan
>  Priority: Major
> Attachments: Branc2_withComparator_atKeyValue.patch, 
> Branch1.3_putSortReducer_sampleCode.patch, 
> Branch2_putSortReducer_sampleCode.patch, flamegraph_branch-1_new.svg, 
> flamegraph_branch-2.svg, flamegraph_branch-2_afterpatch.svg
>
>
> In our test, we observed that bulk load performance is degraded in HBase 2.
> Test input:
> 1: Table with 500 regions (300 column families)
> 2: Data = 2 TB
> Data sample:
> 186000120150205100068110,1860001,20150205,5,404,735412,2938,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,1
> 3: Cluster: 7 nodes (2 masters + 5 region servers)
> 4: The number of containers launched is the same in both cases
> HBase 2 took about 10% more time than HBase 1.3, with the same test input on 
> both clusters.
>  
> ||Feature||HBase 2.2.3 Time (sec)||HBase 1.3.1 Time (sec)||Diff %||Snappy lib||
> |BulkLoad|21837|19686.16|-10.93|HBase 2.2.3: 1.4; HBase 1.3.1: 1.4|



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-24754) Bulk load performance is degraded in HBase 2

2020-08-06 Thread ramkrishna.s.vasudevan (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172541#comment-17172541
 ] 

ramkrishna.s.vasudevan edited comment on HBASE-24754 at 8/6/20, 5:16 PM:
-

I was able to verify this in my local Linux VM, and the significant drop is due 
to the comparator. 

Branch-1.3 consistently took ~11 to 12 secs, but branch-2 varied widely, from 
15 to 22 secs. 

See the stack traces below; they explain the reason. 
Branch-1.3:
{code}
"main" #1 prio=5 os_prio=0 tid=0x7f5ffc010800 nid=0x4b0b runnable [0x7f6003887000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1897)
    at java.util.TreeMap.put(TreeMap.java:552)
    at java.util.TreeSet.add(TreeSet.java:255)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.reduce1Row(PutSortReducer.java:104)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.main(PutSortReducer.java:157)
{code}
Whereas in the branch-2 code base:
{code}
"main" #1 prio=5 os_prio=0 tid=0x7f4a48016000 nid=0x488a runnable [0x7f4a507bb000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.hbase.util.Bytes$ConverterHolder$UnsafeConverter.toShort(Bytes.java:1533)
    at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:1127)
    at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:)
    at org.apache.hadoop.hbase.KeyValue.getRowLength(KeyValue.java:1337)
    at org.apache.hadoop.hbase.KeyValue.getFamilyOffset(KeyValue.java:1353)
    at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1368)
    at org.apache.hadoop.hbase.KeyValue.getQualifierLength(KeyValue.java:1406)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareQualifiers(CellComparatorImpl.java:169)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareColumns(CellComparatorImpl.java:105)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareWithoutRow(CellComparatorImpl.java:266)
    at org.apache.hadoop.hbase.CellComparatorImpl.compare(CellComparatorImpl.java:86)
    at org.apache.hadoop.hbase.CellComparatorImpl.compare(CellComparatorImpl.java:67)
    at org.apache.hadoop.hbase.CellComparatorImpl.compare(CellComparatorImpl.java:45)
    at java.util.TreeMap.put(TreeMap.java:552)
    at java.util.TreeSet.add(TreeSet.java:255)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.reduce1Row(PutSortReducer.java:191)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.main(PutSortReducer.java:242)
{code}
So we do more work per comparison when we have large rows. I think a similar 
thing is happening in the other issue, where we try to filter out a large 
number of rows during a scan (just a guess; I have not spent time on that).
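
To get a rough feel for why the per-comparison cost matters, the sketch below counts how often a comparator is invoked while a `TreeSet` is filled, which is the `TreeSet.add` -> `TreeMap.put` -> `compare` path visible in both stack traces. It is a minimal illustration under stated assumptions: `CompareCountSketch` and the string keys are hypothetical and no HBase code is involved. Any extra decoding done inside `compare` is paid once per counted call.

```java
import java.util.Comparator;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicLong;

final class CompareCountSketch {
    // Inserts `qualifiers` distinct keys into a TreeSet and returns how many
    // times the comparator was invoked along the way.
    static long countCompares(int qualifiers) {
        AtomicLong calls = new AtomicLong();
        Comparator<String> counting = (a, b) -> {
            calls.incrementAndGet();
            return a.compareTo(b);
        };
        TreeSet<String> sorted = new TreeSet<>(counting);
        for (int i = 0; i < qualifiers; i++) {
            sorted.add(String.format("q%05d", i));
        }
        return calls.get();
    }

    public static void main(String[] args) {
        // 300 matches the number of qualifiers per row used in the tests on this issue.
        System.out.println("compare calls for 300 qualifiers: " + countCompares(300));
    }
}
```

Because each insert walks down a balanced tree, the number of comparator calls grows faster than the number of inserted cells, so a small constant overhead inside `compare` is multiplied many times per row.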



was (Author: ram_krish):
I was able to verify this in my local Linux VM, and the significant drop is due 
to the comparator. 

Branch-1.3 consistently took ~11 to 12 secs, but branch-2 varied widely, from 
15 to 22 secs. 

See the stack traces below; they explain the reason. 
Branch-1.3:
{code}
"main" #1 prio=5 os_prio=0 tid=0x7f5ffc010800 nid=0x4b0b runnable [0x7f6003887000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1897)
    at java.util.TreeMap.put(TreeMap.java:552)
    at java.util.TreeSet.add(TreeSet.java:255)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.reduce1Row(PutSortReducer.java:104)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.main(PutSortReducer.java:157)
{code}
The code at that point in branch-1.3 is:
{code}
return Bytes.compareTo(left, loffset + lfamilylength,
    llength - lfamilylength,
    right, roffset + rfamilylength, rlength - rfamilylength);
{code}
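
As a rough sketch of what a call of that shape does, the example below compares the bytes that follow the family prefix in two buffers as one unsigned lexicographic range comparison. It is an illustration under assumptions, not the HBase implementation: `ColumnCompareSketch` and `compareAfterFamily` are hypothetical names, the parameter names are borrowed from the quoted call, and `java.util.Arrays.compareUnsigned` (Java 9 or later) stands in for `Bytes.compareTo`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ColumnCompareSketch {
    // Compares the bytes after the family prefix of each column buffer.
    // offset + familylength is where the remaining bytes start, and
    // offset + length is where the column ends.
    static int compareAfterFamily(byte[] left, int loffset, int llength, int lfamilylength,
                                  byte[] right, int roffset, int rlength, int rfamilylength) {
        return Arrays.compareUnsigned(
            left, loffset + lfamilylength, loffset + llength,
            right, roffset + rfamilylength, roffset + rlength);
    }

    public static void main(String[] args) {
        // Family "cf" (2 bytes) followed by a qualifier, with no separator.
        byte[] l = "cfq001".getBytes(StandardCharsets.UTF_8);
        byte[] r = "cfq002".getBytes(StandardCharsets.UTF_8);
        int cmp = compareAfterFamily(l, 0, l.length, 2, r, 0, r.length, 2);
        System.out.println("sign: " + Integer.signum(cmp));
    }
}
```

The point of the sketch is that all offsets and lengths are already available as arguments, so a single range comparison is enough.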
Whereas in the branch-2 code base:
{code}
"main" #1 prio=5 os_prio=0 tid=0x7f4a48016000 nid=0x488a runnable [0x7f4a507bb000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.hbase.util.Bytes$ConverterHolder$UnsafeConverter.toShort(Bytes.java:1533)
    at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:1127)
    at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:)
    at org.apache.hadoop.hbase.KeyValue.getRowLength(KeyValue.java:1337)
    at org.apache.hadoop.hbase.KeyValue.getFamilyOffset(KeyValue.java:1353)
    at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1368)
    at org.apache.hadoop.hbase.KeyValue.getQualifierLength(KeyValue.java:1406)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareQualifiers(CellComparatorImpl.java:169)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareColumns(CellComparatorImpl.java:105)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareWithoutRow(CellComparatorImpl.java:266)
a

[jira] [Comment Edited] (HBASE-24754) Bulk load performance is degraded in HBase 2

2020-08-04 Thread Y. SREENIVASULU REDDY (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17170687#comment-17170687
 ] 

Y. SREENIVASULU REDDY edited comment on HBASE-24754 at 8/4/20, 9:58 AM:


Verified the mapper task operations and the data-writing operation done by 
HFileOutputFormat2. No time-consuming operations were observed there.

However, a time difference was observed in the PutSortReducer class when 
processing the "Put" objects.  

We ran tests for this and have posted the results here. Please find attached 
the sample code to reproduce the issue on Branch-2 and Branch-1.3.
In the reduce operation that processes the "Put" objects, Branch-2 performance 
is ~30% lower.

1. Verified the test with 10 rows.
2. Each row size is ~1K.
3. Each row has a single column family and 300 qualifiers.
4. Tested with Java version JDK1.8.0_232.
5. Test results:
||Rows processing Time||Branch 1.3 Time (ms)||Branch 2 Time (ms)||%Difference||
|Test 1|12545|18955|-33.8|
|Test 2|12693|18840|-32.6|
|Test 3|12694|18939|-32.9|
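
The %Difference column appears to be the Branch 1.3 time relative to the Branch 2 time, i.e. (Branch 1.3 - Branch 2) / Branch 2. That formula is inferred from the numbers, not stated in the comment. A small sketch that recomputes it (`PercentDiffSketch` is a hypothetical helper name):

```java
final class PercentDiffSketch {
    // Relative difference of the Branch 1.3 time against the Branch 2 time, in percent.
    // Assumed formula, inferred from the table: (branch1.3 - branch2) / branch2 * 100.
    static double percentDiff(double branch13Ms, double branch2Ms) {
        return (branch13Ms - branch2Ms) * 100.0 / branch2Ms;
    }

    public static void main(String[] args) {
        double[][] runs = {
            {12545, 18955},
            {12693, 18840},
            {12694, 18939},
        };
        for (int i = 0; i < runs.length; i++) {
            System.out.printf("Test %d: %.2f%%%n", i + 1, percentDiff(runs[i][0], runs[i][1]));
        }
    }
}
```

Under this assumption the recomputed values agree with the table to within 0.1 percentage points.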





[jira] [Comment Edited] (HBASE-24754) Bulk load performance is degraded in HBase 2

2020-07-29 Thread Yechao Chen (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167589#comment-17167589
 ] 

Yechao Chen edited comment on HBASE-24754 at 7/30/20, 1:40 AM:
---

Yes, not related to HBASE-24971.


was (Author: chenyechao):
yes ,note related with HBASE-24971



[jira] [Comment Edited] (HBASE-24754) Bulk load performance is degraded in HBase 2

2020-07-29 Thread Pankaj Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167117#comment-17167117
 ] 

Pankaj Kumar edited comment on HBASE-24754 at 7/29/20, 12:19 PM:
-

The current HBASE-24791 changes (PR#2167) are not completely applicable to 
branch-2.2. After backporting to branch-2.2 or 2.2.3, only minor changes 
remain, and IMO they won't recover the expected performance.


was (Author: pankajkumar):
HBASE-24791 current changes (PR#2167) are not completely applicable to 
branch-2.2,  we will backport this Jira and share the feedback.



[jira] [Comment Edited] (HBASE-24754) Bulk load performance is degraded in HBase 2

2020-07-29 Thread Pankaj Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167117#comment-17167117
 ] 

Pankaj Kumar edited comment on HBASE-24754 at 7/29/20, 10:30 AM:
-

The current HBASE-24791 changes (PR#2167) are not completely applicable to 
branch-2.2; we will backport this Jira and share the feedback.


was (Author: pankajkumar):
HBASE-24791 not completely applicable to branch-2.2,  we will backport this 
Jira and share the feedback.
