[ https://issues.apache.org/jira/browse/HBASE-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943896#comment-13943896 ]
Lars Hofhansl commented on HBASE-9969: -------------------------------------- A colleague of mine (Jamie Martin to give credit) implemented something similar for different project. I think some of the performance issues we're seeing is that the scanners and the heap are not really integrated. In order to pop something of the topScanner we need to take the scanner out of the heap, pop the next KV, seek the scanner, and put it back on the heap, which now has to rejigger everything. His solution is a simple sorted array of scanners the scanner with the next key is first. When we pop something of that first scanner we simply compare the next key with the top key of the 2nd scanner, when larger (i.e. it's the next) we're done, just one comparison, no movement in the array needed. If not the scanner is insert-sorted into the right spot. The assumption is that that most keys come from the same scanner and in that case should safe a lot of overhead. It's somewhat similar to [~mcorgan] attempt to insert the scanner back at the beginning, but integrates the scanner with the "heap" and should safe overhead. It's also simple and should be easy to follow. If I get time I'm going to implement this as an alternative into Matt's testing framework here and see how it fares. > Improve KeyValueHeap using loser tree > ------------------------------------- > > Key: HBASE-9969 > URL: https://issues.apache.org/jira/browse/HBASE-9969 > Project: HBase > Issue Type: Improvement > Components: Performance, regionserver > Reporter: Chao Shi > Assignee: Chao Shi > Fix For: 0.99.0 > > Attachments: 9969-0.94.txt, KeyValueHeapBenchmark_v1.ods, > KeyValueHeapBenchmark_v2.ods, hbase-9969-pq-v1.patch, hbase-9969-pq-v2.patch, > hbase-9969-v2.patch, hbase-9969-v3.patch, hbase-9969.patch, hbase-9969.patch, > kvheap-benchmark.png, kvheap-benchmark.txt > > > LoserTree is the better data structure than binary heap. It saves half of the > comparisons on each next(), though the time complexity is on O(logN). > Currently A scan or get will go through two KeyValueHeaps, one is merging KVs > read from multiple HFiles in a single store, the other is merging results > from multiple stores. This patch should improve the both cases whenever CPU > is the bottleneck (e.g. scan with filter over cached blocks, HBASE-9811). > All of the optimization work is done in KeyValueHeap and does not change its > public interfaces. The new code looks more cleaner and simpler to understand. -- This message was sent by Atlassian JIRA (v6.2#6252)