[jira] [Commented] (HBASE-9969) Improve KeyValueHeap using loser tree

Lars Hofhansl (JIRA) Sun, 02 Jul 2017 05:54:26 -0700

    [ https://issues.apache.org/jira/browse/HBASE-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071624#comment-16071624 ]


Lars Hofhansl commented on HBASE-9969:
--------------------------------------

I always felt that a class as core to HBase as this should not be based on the 
JDK's PriorityQueue, but should rather be implemented directly. Matt's patch 
does that, and it seems that his custom priority queue beats the JDK's in 
every case.


> Improve KeyValueHeap using loser tree
> -------------------------------------
>
>                 Key: HBASE-9969
>                 URL: https://issues.apache.org/jira/browse/HBASE-9969
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance, regionserver
>            Reporter: Chao Shi
>            Assignee: Chao Shi
>         Attachments: 9969-0.94.txt, hbase-9969.patch, hbase-9969.patch, 
> hbase-9969-pq-v1.patch, hbase-9969-pq-v2.patch, hbase-9969-v2.patch, 
> hbase-9969-v3.patch, KeyValueHeapBenchmark_v1.ods, 
> KeyValueHeapBenchmark_v2.ods, kvheap-benchmark.png, kvheap-benchmark.txt
>
>
> A loser tree is a better data structure than a binary heap. It saves half of 
> the comparisons on each next(), although the time complexity remains 
> O(log N).
> Currently a scan or get goes through two KeyValueHeaps: one merges KVs read 
> from multiple HFiles in a single store, the other merges results from 
> multiple stores. This patch should improve both cases whenever CPU is the 
> bottleneck (e.g. scan with filter over cached blocks, HBASE-9811).
> All of the optimization work is done in KeyValueHeap and does not change its 
> public interfaces. The new code is cleaner and simpler to understand.
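
For readers unfamiliar with the structure, here is a minimal sketch of a k-way 
merge driven by a loser tree. It is not the code from any of the attached 
patches: the class name LoserTreeMerge, its methods, and the use of Integer 
keys in place of KeyValues are all assumptions made for illustration. The 
point it shows is the comparison count. After the winner is consumed, the 
replay walks from that source's leaf to the root and does one comparison per 
level against the stored loser, whereas a binary heap sift-down may need two 
per level (pick the smaller child, then compare against it).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Illustrative loser-tree merge over sorted Integer lists (hypothetical, not HBase code). */
public class LoserTreeMerge {
    private final List<Iterator<Integer>> sources = new ArrayList<>();
    private final Integer[] heads; // current head of each source, null once exhausted
    private final int[] tree;      // tree[0] = overall winner, tree[1..k-1] = loser at each node
    private final int k;

    /** Assumes every input list is sorted ascending and contains no nulls. */
    public LoserTreeMerge(List<List<Integer>> sortedInputs) {
        k = sortedInputs.size();
        heads = new Integer[k];
        tree = new int[Math.max(k, 1)];
        for (int i = 0; i < k; i++) {
            Iterator<Integer> it = sortedInputs.get(i).iterator();
            sources.add(it);
            heads[i] = it.hasNext() ? it.next() : null;
        }
        if (k > 0) {
            tree[0] = build(1);
        }
    }

    /** Plays the initial tournament below 'node'; leaves are nodes k..2k-1. */
    private int build(int node) {
        if (node >= k) {
            return node - k; // leaf: map node index back to source index
        }
        int left = build(2 * node);
        int right = build(2 * node + 1);
        boolean leftWins = beats(left, right);
        tree[node] = leftWins ? right : left; // store the loser
        return leftWins ? left : right;       // pass the winner upward
    }

    /** True if source a should be emitted before source b; exhausted sources never win. */
    private boolean beats(int a, int b) {
        if (heads[a] == null) return false;
        if (heads[b] == null) return true;
        int c = heads[a].compareTo(heads[b]);
        return c < 0 || (c == 0 && a < b); // ties go to the lower source index
    }

    /** Returns the next smallest element, or null when all sources are exhausted. */
    public Integer next() {
        if (k == 0) return null;
        int winner = tree[0];
        Integer result = heads[winner];
        if (result == null) return null;
        Iterator<Integer> it = sources.get(winner);
        heads[winner] = it.hasNext() ? it.next() : null;
        // Replay: one comparison per level on the path from the leaf to the root.
        for (int node = (winner + k) / 2; node >= 1; node /= 2) {
            if (beats(tree[node], winner)) {
                int loser = winner;
                winner = tree[node];
                tree[node] = loser;
            }
        }
        tree[0] = winner;
        return result;
    }

    /** Convenience helper: drains the merge into a single sorted list. */
    public static List<Integer> merge(List<List<Integer>> sortedInputs) {
        LoserTreeMerge m = new LoserTreeMerge(sortedInputs);
        List<Integer> out = new ArrayList<>();
        for (Integer v = m.next(); v != null; v = m.next()) {
            out.add(v);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(merge(Arrays.asList(
            Arrays.asList(1, 4, 7), Arrays.asList(2, 5, 8), Arrays.asList(3, 6, 9))));
    }
}
```

In this sketch each sorted list plays the role that a store file scanner or 
store scanner would play in KeyValueHeap; how the actual patches handle seeks 
and scanner close is not covered here.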



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
