[ 
https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250562#comment-17250562
 ] 

Michael Stack commented on HBASE-24754:
---------------------------------------

Chatting w/ a coworker, he talked of being able to decide high-up what types 
of Cells/KVs are involved and, before we start the task, pick the 
CellComparator to use (he even suggested auto-generating the optimal one...). 
Seems like you can do this when bulk loading. Can look at the file and figure 
out the Cell type... and then choose a CellComparator to use... one w/ no 
branching, shaped to fit the Cells it will see.  Are we set up to allow 
inserting a particular CellComparator to use in MR tasks?  Good stuff.
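
A rough sketch of the "decide once, up front" idea, in plain Java with no 
HBase dependency. Everything here is hypothetical and for illustration only: 
SimpleCell, ComparatorChoice, and the byteBufferBacked flag are made-up 
stand-ins, not the real Cell or CellComparator API, and the "slow path" is 
just a placeholder copy. The only point is the shape: look at the input once, 
then hand the sort a comparator that does no per-compare type check.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a cell; NOT the real HBase Cell interface. */
final class SimpleCell {
    final byte[] row;
    final long timestamp;
    final boolean byteBufferBacked; // stand-in for "what kind of Cell is this"

    SimpleCell(byte[] row, long timestamp, boolean byteBufferBacked) {
        this.row = row;
        this.timestamp = timestamp;
        this.byteBufferBacked = byteBufferBacked;
    }
}

final class ComparatorChoice {
    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /** Row ascending, then newest timestamp first. */
    static int compareKey(SimpleCell a, SimpleCell b) {
        int d = compareBytes(a.row, b.row);
        return d != 0 ? d : Long.compare(b.timestamp, a.timestamp);
    }

    /** General comparator: re-checks the cell kind on every comparison. */
    static final Comparator<SimpleCell> GENERIC = (a, b) -> {
        if (a.byteBufferBacked || b.byteBufferBacked) {
            // Placeholder for a slower path: copy the rows before comparing.
            int d = compareBytes(a.row.clone(), b.row.clone());
            return d != 0 ? d : Long.compare(b.timestamp, a.timestamp);
        }
        return compareKey(a, b);
    };

    /** Specialized comparator: assumes array-backed cells, so no kind check. */
    static final Comparator<SimpleCell> ARRAY_ONLY = ComparatorChoice::compareKey;

    /** Inspect the input once, before the task starts, and pick a comparator. */
    static Comparator<SimpleCell> choose(List<SimpleCell> cells) {
        for (SimpleCell c : cells) {
            if (c.byteBufferBacked) {
                return GENERIC;
            }
        }
        return ARRAY_ONLY;
    }

    public static void main(String[] args) {
        List<SimpleCell> cells = new ArrayList<>();
        cells.add(new SimpleCell(new byte[] {2}, 5L, false));
        cells.add(new SimpleCell(new byte[] {1}, 5L, false));
        Comparator<SimpleCell> cmp = choose(cells); // decided once per task
        cells.sort(cmp);
        System.out.println(cmp == ARRAY_ONLY ? "specialized" : "generic");
    }
}
```

Whether a bulk load MR task can actually be handed a comparator chosen this 
way is the open question above; the sketch does not answer it.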

> Bulk load performance is degraded in HBase 2 
> ---------------------------------------------
>
>                 Key: HBASE-24754
>                 URL: https://issues.apache.org/jira/browse/HBASE-24754
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance
>    Affects Versions: 2.2.3
>            Reporter: Ajeet Rai
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.5.0
>
>         Attachments: Branc2_withComparator_atKeyValue.patch, 
> Branch1.3_putSortReducer_sampleCode.patch, 
> Branch2_putSortReducer_sampleCode.patch, flamegraph_branch-1_new.svg, 
> flamegraph_branch-2.svg, flamegraph_branch-2_afterpatch.svg
>
>
> In our test, it is observed that bulk load performance is degraded in HBase 2.
>  Test Input: 
> 1: Table with 500 regions (300 column family)
> 2: Data = 2 TB
> Data Sample
> 18600000001201502051000000068110,18600000001,20150205,5,404,735412,2938,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111111111111111111111111111111111111111111111111111111111111111111111111111111111
> 3: Cluster: 7 nodes (2 masters + 5 RegionServers)
> 4: Number of containers launched is the same in both cases
> HBase 2 took 10% more time than HBase 1.3, where the test input is the same 
> for both clusters
>  
> |Feature|HBase 2.2.3 Time(Sec)|HBase 1.3.1 Time(Sec)|Diff%|Snappy lib|
> |BulkLoad|21837|19686.16|-10.93|HBase 2.2.3: 1.4; HBase 1.3.1: 1.4|



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
