[jira] [Comment Edited] (HBASE-24754) Bulk load performance is degraded in HBase 2
[ https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172941#comment-17172941 ]

ramkrishna.s.vasudevan edited comment on HBASE-24754 at 8/7/20, 7:08 AM:

Thanks [~sreenivasulureddy]. Can you see the patch https://issues.apache.org/jira/secure/attachment/13009270/Branc2_withComparator_atKeyValue.patch

We changed the comparator to use the KV comparator only, because we know we are dealing with KeyValues only. This also avoids the hierarchy that branch-2 creates.

I have also attached the flamegraphs for branch-1, branch-2, and branch-2 after patching. Branch-2 performance seemed very random rather than constant; this may be due to the hierarchy tree for the CellComparator. I did try the CellComparator approach, removing all the if/else conditions that the comparator code uses to accommodate different cell types, but that did not give consistent performance either. After I changed to use the KV comparator directly with the above changes, performance became consistent and on par with branch-1.3.

[~sreenivasulureddy] - can you please check?

was (Author: ram_krish):
[~sreenivasulureddy] Can you the patch at https://issues.apache.org/jira/secure/attachment/13009270/Branc2_withComparator_atKeyValue.patch
We change the Comparator to use KV comparator only because we know we are dealing with Keyvalues only. Also avoided the hierarchy that the branch-2 creates.
Also attached the flamegraphs for branch-1 and branch-2 and after patching branch-2. Seems branch-2 perf was very random and not a constant. May be it is due to the hierarchy tree for the CellComparator. I did try the CellComparator way by avoiding all the if/else conditions in the comparator code to accomodate different cell types but that was also not giving a consistent perf. After I changed to use KV comparator directly with the above changes performance became consistent and on par with branch-1.3.
[~sreenivasulureddy] - can you pls check .
> Bulk load performance is degraded in HBase 2 > - > > Key: HBASE-24754 > URL: https://issues.apache.org/jira/browse/HBASE-24754 > Project: HBase > Issue Type: Bug > Components: Performance >Affects Versions: 2.2.3 >Reporter: Ajeet Rai >Assignee: ramkrishna.s.vasudevan >Priority: Major > Attachments: Branc2_withComparator_atKeyValue.patch, > Branch1.3_putSortReducer_sampleCode.patch, > Branch2_putSortReducer_sampleCode.patch, flamegraph_branch-1_new.svg, > flamegraph_branch-2.svg, flamegraph_branch-2_afterpatch.svg > > > in our Test,It is observed that Bulk load performance is degraded in HBase 2 . > Test Input: > 1: Table with 500 region(300 column family) > 2: data =2 TB > Data Sample > 186000120150205100068110,1860001,20150205,5,404,735412,2938,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,1 > 3: Cluster: 7 node(2 master+5 Region Server) > 4: No of Container 
Launched are same in both case
> HBase 2 took 10% more time than HBase 1.3 where test input is same for both cluster
>
> |Feature|HBase 2.2.3 Time(Sec)|HBase 1.3.1 Time(Sec)|Diff%|Snappy lib|
> |BulkLoad|21837|19686.16|-10.93|HBase 2.2.3: 1.4, HBase 1.3.1: 1.4|

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
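The comparator change described in the comment at the top of this message (use a comparator that assumes one concrete cell type instead of a general one that branches on type) can be illustrated outside HBase. This is a minimal sketch under stated assumptions, not the attached patch: `Cell`, `ArrayCell`, `GENERIC`, `SPECIALIZED`, and `sortedQualifiers` are hypothetical names, and a cell is reduced to just its qualifier bytes. It only shows that both comparators give the same ordering in a `TreeSet`; it makes no claim about how much time either one takes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch; none of these types are HBase classes. */
class ComparatorSketch {

    /** Generic view of a cell: any implementation can hand out its qualifier bytes. */
    interface Cell {
        byte[] copyQualifier();
    }

    /** Array-backed cell whose qualifier occupies buf[offset .. offset + length). */
    static final class ArrayCell implements Cell {
        final byte[] buf;
        final int offset;
        final int length;

        ArrayCell(byte[] buf, int offset, int length) {
            this.buf = buf;
            this.offset = offset;
            this.length = length;
        }

        @Override
        public byte[] copyQualifier() {
            return Arrays.copyOfRange(buf, offset, offset + length);
        }
    }

    /** Unsigned lexicographic comparison of two byte ranges. */
    static int compareRanges(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xff;
            int y = b[bOff + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return aLen - bLen;
    }

    /** General comparator: branches on the runtime type, with a copying fallback. */
    static final Comparator<Cell> GENERIC = (l, r) -> {
        if (l instanceof ArrayCell && r instanceof ArrayCell) {
            ArrayCell la = (ArrayCell) l;
            ArrayCell ra = (ArrayCell) r;
            return compareRanges(la.buf, la.offset, la.length, ra.buf, ra.offset, ra.length);
        }
        byte[] lq = l.copyQualifier();
        byte[] rq = r.copyQualifier();
        return compareRanges(lq, 0, lq.length, rq, 0, rq.length);
    };

    /** Specialized comparator: the element type is fixed, so no type checks are needed. */
    static final Comparator<ArrayCell> SPECIALIZED = (l, r) ->
        compareRanges(l.buf, l.offset, l.length, r.buf, r.offset, r.length);

    /** Sorts the given qualifiers by inserting them into a TreeSet built on cmp. */
    static List<String> sortedQualifiers(Comparator<? super ArrayCell> cmp, String... qualifiers) {
        TreeSet<ArrayCell> set = new TreeSet<>(cmp);
        for (String q : qualifiers) {
            byte[] bytes = q.getBytes(StandardCharsets.UTF_8);
            set.add(new ArrayCell(bytes, 0, bytes.length));
        }
        List<String> out = new ArrayList<>();
        for (ArrayCell c : set) {
            out.add(new String(c.copyQualifier(), StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortedQualifiers(GENERIC, "q3", "q1", "q2"));
        System.out.println(sortedQualifiers(SPECIALIZED, "q3", "q1", "q2"));
    }
}
```

Typing the set's comparator as `Comparator<ArrayCell>` is what lets the specialized version skip the `instanceof` branch; whether that accounts for the variance reported on branch-2 is the thread's hypothesis, not something this sketch measures.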
[jira] [Comment Edited] (HBASE-24754) Bulk load performance is degraded in HBase 2
[ https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172541#comment-17172541 ]

ramkrishna.s.vasudevan edited comment on HBASE-24754 at 8/6/20, 5:16 PM:

I was able to verify this in my local Linux VM, and the significant drop is due to the Comparator. Branch-1.3 consistently took ~11 to 12 secs, but branch-2 varied widely, from 15 to 22 secs. The stack traces explain the reason.

Branch-1.3
{code}
"main" #1 prio=5 os_prio=0 tid=0x7f5ffc010800 nid=0x4b0b runnable [0x7f6003887000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1897)
    at java.util.TreeMap.put(TreeMap.java:552)
    at java.util.TreeSet.add(TreeSet.java:255)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.reduce1Row(PutSortReducer.java:104)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.main(PutSortReducer.java:157)
{code}

Whereas in the branch-2 code base
{code}
"main" #1 prio=5 os_prio=0 tid=0x7f4a48016000 nid=0x488a runnable [0x7f4a507bb000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.hbase.util.Bytes$ConverterHolder$UnsafeConverter.toShort(Bytes.java:1533)
    at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:1127)
    at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:)
    at org.apache.hadoop.hbase.KeyValue.getRowLength(KeyValue.java:1337)
    at org.apache.hadoop.hbase.KeyValue.getFamilyOffset(KeyValue.java:1353)
    at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1368)
    at org.apache.hadoop.hbase.KeyValue.getQualifierLength(KeyValue.java:1406)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareQualifiers(CellComparatorImpl.java:169)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareColumns(CellComparatorImpl.java:105)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareWithoutRow(CellComparatorImpl.java:266)
    at org.apache.hadoop.hbase.CellComparatorImpl.compare(CellComparatorImpl.java:86)
    at org.apache.hadoop.hbase.CellComparatorImpl.compare(CellComparatorImpl.java:67)
    at org.apache.hadoop.hbase.CellComparatorImpl.compare(CellComparatorImpl.java:45)
    at java.util.TreeMap.put(TreeMap.java:552)
    at java.util.TreeSet.add(TreeSet.java:255)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.reduce1Row(PutSortReducer.java:191)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.main(PutSortReducer.java:242)
{code}

So we do more work per comparison when we have large rows. I think a similar thing is happening in the other issue, where we try to filter out a large number of rows during a scan (just a guess; I have not spent time on that).

was (Author: ram_krish):
I was able to verify in my local linux VM and the significant drop is due to the Comparator. The branch-1.3 took consistenly ~11 to 12 secs but the branch-2 is varying much from 15 to 22 secs. See the stack trace and that explains the reason
Branch-1.3
{code}
main" #1 prio=5 os_prio=0 tid=0x7f5ffc010800 nid=0x4b0b runnable [0x7f6003887000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1897)
    at java.util.TreeMap.put(TreeMap.java:552)
    at java.util.TreeSet.add(TreeSet.java:255)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.reduce1Row(PutSortReducer.java:104)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.main(PutSortReducer.java:157)
Where as in the branch-2 code base
{code}
"main" #1 prio=5 os_prio=0 tid=0x7f4a48016000 nid=0x488a runnable [0x7f4a507bb000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.hbase.util.Bytes$ConverterHolder$UnsafeConverter.toShort(Bytes.java:1533)
    at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:1127)
    at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:)
    at org.apache.hadoop.hbase.KeyValue.getRowLength(KeyValue.java:1337)
    at org.apache.hadoop.hbase.KeyValue.getFamilyOffset(KeyValue.java:1353)
    at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1368)
    at org.apache.hadoop.hbase.KeyValue.getQualifierLength(KeyValue.java:1406)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareQualifiers(CellComparatorImpl.java:169)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareColumns(CellComparatorImpl.java:105)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareWithoutRow(CellComparatorImpl.java:266)
    at org.apache.hadoop.hbase.CellComparatorImpl.compare(CellComparatorImpl.java:86)
    at org.apache.hadoop.hbase.CellComparatorImpl.compare(CellComparatorImpl.java:67)
    at org.apa
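Both stack traces in this comment sit under `TreeSet.add`, so the comparator runs several times for every cell that is inserted, and any extra work inside `compare` is multiplied accordingly. A rough sketch of that multiplication, assuming 300 qualifiers per row as in the reproduction test reported on this issue; `CountingSketch` and `comparisonsForRow` are hypothetical names, and plain strings stand in for cells.

```java
import java.util.Comparator;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: counts comparator invocations while a TreeSet is filled. */
class CountingSketch {

    /** Inserts `qualifiers` distinct keys and returns how often the comparator ran. */
    static long comparisonsForRow(int qualifiers) {
        AtomicLong calls = new AtomicLong();
        Comparator<String> counting = (a, b) -> {
            calls.incrementAndGet();
            return a.compareTo(b);
        };
        TreeSet<String> sorted = new TreeSet<>(counting);
        for (int i = 0; i < qualifiers; i++) {
            // Zero-padded so that string order matches numeric order.
            sorted.add(String.format("q%05d", i));
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("comparator calls, 30 qualifiers:  " + comparisonsForRow(30));
        System.out.println("comparator calls, 300 qualifiers: " + comparisonsForRow(300));
    }
}
```

Because each insert walks a balanced tree, the call count grows roughly as n log n in the number of qualifiers, which is consistent with the observation that wide rows make the comparator cost dominate.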
[jira] [Comment Edited] (HBASE-24754) Bulk load performance is degraded in HBase 2
[ https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172541#comment-17172541 ]

ramkrishna.s.vasudevan edited comment on HBASE-24754 at 8/6/20, 5:16 PM:

I was able to verify this in my local Linux VM, and the significant drop is due to the Comparator. Branch-1.3 consistently took ~11 to 12 secs, but branch-2 varied widely, from 15 to 22 secs. The stack traces explain the reason.

Branch-1.3
{code}
"main" #1 prio=5 os_prio=0 tid=0x7f5ffc010800 nid=0x4b0b runnable [0x7f6003887000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1897)
    at java.util.TreeMap.put(TreeMap.java:552)
    at java.util.TreeSet.add(TreeSet.java:255)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.reduce1Row(PutSortReducer.java:104)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.main(PutSortReducer.java:157)
{code}

Whereas in the branch-2 code base
{code}
"main" #1 prio=5 os_prio=0 tid=0x7f4a48016000 nid=0x488a runnable [0x7f4a507bb000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.hbase.util.Bytes$ConverterHolder$UnsafeConverter.toShort(Bytes.java:1533)
    at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:1127)
    at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:)
    at org.apache.hadoop.hbase.KeyValue.getRowLength(KeyValue.java:1337)
    at org.apache.hadoop.hbase.KeyValue.getFamilyOffset(KeyValue.java:1353)
    at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1368)
    at org.apache.hadoop.hbase.KeyValue.getQualifierLength(KeyValue.java:1406)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareQualifiers(CellComparatorImpl.java:169)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareColumns(CellComparatorImpl.java:105)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareWithoutRow(CellComparatorImpl.java:266)
    at org.apache.hadoop.hbase.CellComparatorImpl.compare(CellComparatorImpl.java:86)
    at org.apache.hadoop.hbase.CellComparatorImpl.compare(CellComparatorImpl.java:67)
    at org.apache.hadoop.hbase.CellComparatorImpl.compare(CellComparatorImpl.java:45)
    at java.util.TreeMap.put(TreeMap.java:552)
    at java.util.TreeSet.add(TreeSet.java:255)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.reduce1Row(PutSortReducer.java:191)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.main(PutSortReducer.java:242)
{code}

So we do more work per comparison when we have large rows. I think a similar thing is happening in the other issue, where we try to filter out a large number of rows during a scan (just a guess; I have not spent time on that).

was (Author: ram_krish):
I was able to verify in my local linux VM and the significant drop is due to the Comparator. The branch-1.3 took consistenly ~11 to 12 secs but the branch-2 is varying much from 15 to 22 secs. See the stack trace and that explains the reason
Branch-1.3
{code}
main" #1 prio=5 os_prio=0 tid=0x7f5ffc010800 nid=0x4b0b runnable [0x7f6003887000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1897)
    at java.util.TreeMap.put(TreeMap.java:552)
    at java.util.TreeSet.add(TreeSet.java:255)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.reduce1Row(PutSortReducer.java:104)
    at org.apache.hadoop.hbase.mapreduce.PutSortReducer.main(PutSortReducer.java:157)
{code}
Where the code there is
{code}
return Bytes.compareTo(left, loffset + lfamilylength, llength - lfamilylength, right, roffset + rfamilylength, rlength - rfamilylength);
{code}
Where as in the branch-2 code base
{code}
"main" #1 prio=5 os_prio=0 tid=0x7f4a48016000 nid=0x488a runnable [0x7f4a507bb000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.hbase.util.Bytes$ConverterHolder$UnsafeConverter.toShort(Bytes.java:1533)
    at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:1127)
    at org.apache.hadoop.hbase.util.Bytes.toShort(Bytes.java:)
    at org.apache.hadoop.hbase.KeyValue.getRowLength(KeyValue.java:1337)
    at org.apache.hadoop.hbase.KeyValue.getFamilyOffset(KeyValue.java:1353)
    at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1368)
    at org.apache.hadoop.hbase.KeyValue.getQualifierLength(KeyValue.java:1406)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareQualifiers(CellComparatorImpl.java:169)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareColumns(CellComparatorImpl.java:105)
    at org.apache.hadoop.hbase.CellComparatorImpl.compareWithoutRow(CellComparatorImpl.java:266)
    a
[jira] [Comment Edited] (HBASE-24754) Bulk load performance is degraded in HBase 2
[ https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17170687#comment-17170687 ]

Y. SREENIVASULU REDDY edited comment on HBASE-24754 at 8/4/20, 9:58 AM:

I verified the mapper task operations and the data writing done by HFileOutputFormat2, and did not observe any time-consuming operations there. However, a time difference was observed in the PutSortReducer class when processing the "Put" objects. I ran tests for this and posted the results here; please find the attached sample code to reproduce the issue between Branch-2 and Branch-1.3.

In the reduce operation that processes the "Put" objects, performance was observed to be reduced by ~30%.

1. Verified the test with 10 rows.
2. Each row size is ~1K.
3. Each row has a single column family and 300 qualifiers.
4. Tested with Java version JDK1.8.0_232.
5. Test results:

||Rows processing Time||Branch 1.3 Time (ms)||Branch 2 Time (ms)||%Difference||
|Test 1|12545|18955|-33.8|
|Test 2|12693|18840|-32.6|
|Test 3|12694|18939|-32.9|

was (Author: sreenivasulureddy):
Attached the sample code to reproduce the issue, for between the Branch-2 and Branch-1.3
In Reduce operation to process the "PUT" objects observed the difference ~30% reduced.
1. Verified the test with 10 rows.
2. Each row size is ~1K.
3. Each row have single column-family and 300 qualifiers
4. Tested with java version (JDK1.8.0_232)
5. Test Results
||Rows processing Time||Branch 1.3 Time (ms)||Branch 2 Time (ms)||%Difference||
|Test 1|12545|18955|-33.8|
|Test 2|12693|18840|-32.6|
|Test 3|12694|18939|-32.9|
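The two tables in this thread appear to use different bases for their percentages. The %Difference values in the reducer test results match (Branch 1.3 − Branch 2) / Branch 2, while the -10.93 in the issue description matches (HBase 1.3.1 − HBase 2.2.3) / HBase 1.3.1. That reading is inferred from the numbers and is not stated anywhere in the thread. A small sketch that reproduces both conventions from the reported timings; `PercentDiffSketch` and `percentDiff` are hypothetical names.

```java
import java.util.Locale;

/** Hypothetical sketch: recomputes the percentages reported in this thread. */
class PercentDiffSketch {

    /** Returns (a - b) / base, expressed as a percentage. */
    static double percentDiff(double a, double b, double base) {
        return (a - b) / base * 100.0;
    }

    public static void main(String[] args) {
        // {Branch 1.3 ms, Branch 2 ms} for Test 1..3, copied from the results table.
        double[][] tests = {{12545, 18955}, {12693, 18840}, {12694, 18939}};
        for (int i = 0; i < tests.length; i++) {
            double b13 = tests[i][0];
            double b2 = tests[i][1];
            // Convention that matches the %Difference column: base is Branch 2.
            double vsBranch2 = percentDiff(b13, b2, b2);
            // Same gap expressed as extra time relative to Branch 1.3.
            double extraOverBranch13 = percentDiff(b2, b13, b13);
            System.out.println(String.format(Locale.ROOT,
                "Test %d: %.1f%% (base Branch 2), +%.1f%% (base Branch 1.3)",
                i + 1, vsBranch2, extraOverBranch13));
        }
        // Bulk load table from the issue description: base is HBase 1.3.1.
        System.out.println(String.format(Locale.ROOT,
            "BulkLoad: %.2f%% (base HBase 1.3.1)",
            percentDiff(19686.16, 21837, 19686.16)));
    }
}
```

Rounding to one decimal gives -33.0 for Test 3, where the table shows -32.9, which suggests the posted values were truncated rather than rounded. Measured against Branch 1.3 instead, the same timings correspond to Branch 2 taking roughly 50% longer in this micro-test.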
[jira] [Comment Edited] (HBASE-24754) Bulk load performance is degraded in HBase 2
[ https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167589#comment-17167589 ]

Yechao Chen edited comment on HBASE-24754 at 7/30/20, 1:40 AM:

Yes, not related to HBASE-24971.

was (Author: chenyechao):
yes ,note related with HBASE-24971
[jira] [Comment Edited] (HBASE-24754) Bulk load performance is degraded in HBase 2
[ https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167117#comment-17167117 ]

Pankaj Kumar edited comment on HBASE-24754 at 7/29/20, 12:19 PM:

The current HBASE-24791 changes (PR#2167) are not completely applicable to branch-2.2. After backporting to branch-2.2 or 2.2.3, only minor changes remain, which IMO won't regain the expected performance.

was (Author: pankajkumar):
HBASE-24791 current changes (PR#2167) are not completely applicable to branch-2.2, we will backport this Jira and share the feedback.
[jira] [Comment Edited] (HBASE-24754) Bulk load performance is degraded in HBase 2
[ https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167117#comment-17167117 ]

Pankaj Kumar edited comment on HBASE-24754 at 7/29/20, 10:30 AM:

HBASE-24791 current changes (PR#2167) are not completely applicable to branch-2.2, we will backport this Jira and share the feedback.

was (Author: pankajkumar):
HBASE-24791 not completely applicable to branch-2.2, we will backport this Jira and share the feedback.