[ https://issues.apache.org/jira/browse/HBASE-13109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343575#comment-14343575 ]
Lars Hofhansl commented on HBASE-13109:
---------------------------------------

bq. make the nextIndexedKEy to a Cell and create a KeyOnlyKeyValue out of it. Then use the normal CellComparator.compare(cell, cell).

I agree that would be nicer, but it would be *very* slow. This decision is made for every single KeyValue (whenever the SQM decides that it wants to seek). That's why I added a special compare, so that the key can be compared in place without creating a KV object or even a new byte[].

Note that two Cells would have to be created: (1) the Cell representing the indexed key, and (2) the Cell representing the seek key in the SQM. With this patch no new objects are created at all.

Just avoiding creating the key array saved 0.7s over 4m rows (5 cols). Making KeyValues or Cells would be more expensive.

> Make better SEEK vs SKIP decisions during scanning
> --------------------------------------------------
>
>                 Key: HBASE-13109
>                 URL: https://issues.apache.org/jira/browse/HBASE-13109
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Priority: Minor
>         Attachments: 13109-trunk-v2.txt, 13109-trunk-v3.txt, 13109-trunk.txt
>
>
> I'm re-purposing this issue to add a heuristic as to when to SEEK and when to
> SKIP Cells. This has come up in various issues, and I think I have a way to
> finally fix this now. HBASE-9778, HBASE-12311, and friends are related.
> --- Old description ---
> This is a continuation of HBASE-9778.
> We've seen a scenario of a very slow scan over a region using a timerange
> that happens to fall after the ts of any Cell in the region.
> Turns out we spend a lot of time seeking.
> Tested with a 5 column table, and the scan is 5x faster when the timerange
> falls before all Cells' ts.
> We can use the lookahead hint introduced in HBASE-9778 to do opportunistic
> SKIPing before we actually seek.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
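
As a rough illustration of the in-place comparison described in the comment above, here is a minimal Java sketch. It is not the code from the attached patch and uses no HBase classes; the class name `InPlaceKeyCompare`, the method `compareInPlace`, and the sample keys are hypothetical. It compares two key slices by offset and length inside their backing arrays, byte by byte as unsigned values, so no Cell, KeyValue, or byte[] copy is allocated per comparison.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not part of HBase.
final class InPlaceKeyCompare {
    private InPlaceKeyCompare() {}

    /**
     * Lexicographically compares two byte ranges as unsigned bytes.
     * Works directly on the backing arrays, so nothing is allocated.
     * Returns a negative, zero, or positive value.
     */
    static int compareInPlace(byte[] a, int aOff, int aLen,
                              byte[] b, int bOff, int bLen) {
        int common = Math.min(aLen, bLen);
        for (int i = 0; i < common; i++) {
            int x = a[aOff + i] & 0xff; // treat bytes as unsigned
            int y = b[bOff + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter range sorts first.
        return aLen - bLen;
    }

    public static void main(String[] args) {
        // Assumed layout: three 4-byte keys packed back to back in one buffer.
        byte[] block = "row1row2row3".getBytes(StandardCharsets.US_ASCII);
        byte[] seekKey = "row2".getBytes(StandardCharsets.US_ASCII);

        // Compare the third packed key ("row3") with the seek key, in place.
        int cmp = compareInPlace(block, 8, 4, seekKey, 0, seekKey.length);
        System.out.println(cmp > 0
                ? "indexed key sorts after the seek key"
                : "indexed key does not sort after the seek key");
    }
}
```

Copying each slice into a fresh byte[] or wrapping it in an object would yield the same ordering but allocate on every call, which is the per-KeyValue cost the comment is concerned about.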