[ 
https://issues.apache.org/jira/browse/HBASE-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915237#comment-13915237
 ] 

Lars Hofhansl commented on HBASE-10625:
---------------------------------------

Testing this is hard, it seems. I wrote a quick tool that runs through the 
scenarios and calculates the mean and standard deviation for each.
Setup: 10m rows, 5 columns (C0-C4), 8-byte row keys, 8-byte values.
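
For illustration, the per-scenario statistics could be computed along these lines. This is only a sketch, not the actual tool: the class name RunStats, the time() helper, the run count, and the choice of the sample standard deviation (n - 1 denominator) are all assumptions, and the scenario body is a placeholder for a real scan.

```java
final class RunStats {
    /** Arithmetic mean of the measured run times. */
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Sample standard deviation (n - 1 denominator); an assumed choice. */
    static double stdDev(double[] xs) {
        if (xs.length < 2) {
            return 0.0; // not enough runs to estimate spread
        }
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) {
            double d = x - m;
            sumSq += d * d;
        }
        return Math.sqrt(sumSq / (xs.length - 1));
    }

    /** Runs the scenario `runs` times and returns each wall-clock time in seconds. */
    static double[] time(Runnable scenario, int runs) {
        double[] secs = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            scenario.run();
            secs[i] = (System.nanoTime() - start) / 1e9;
        }
        return secs;
    }

    public static void main(String[] args) {
        // Placeholder scenario; a real run would scan the table with a column selection.
        double[] secs = time(() -> { }, 5);
        System.out.printf("%.2f, %.2f%n", mean(secs), stdDev(secs));
    }
}
```

Each printed pair has the same "mean, standard deviation" shape as the cells in the table below.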

Results (mean, standard deviation per scenario) are surprisingly disappointing:
||columns||None||C0||C1||C4||C1,C3||C2,C3||C2,C3,C4||
|w/ patch|13.30, 0.12|14.24, 0.17|22.29, 0.09|16.42, 0.03|31.51, 0.27|24.60, 0.02|21.04, 0.05|
|w/o patch|13.72, 0.07|14.47, 0.21|23.12, 0.13|17.30, 0.16|32.16, 0.05|25.00, 0.04|21.11, 0.05|

So the gains are minimal.
Will run the same at home later.

> Remove unnecessary key compare from AbstractScannerV2.reseekTo
> --------------------------------------------------------------
>
>                 Key: HBASE-10625
>                 URL: https://issues.apache.org/jira/browse/HBASE-10625
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>         Attachments: 10625-0.94.txt, 10625-trunk.txt
>
>
> In reseekTo we find this
> {code}
> ...
>         compared = compareKey(reader.getComparator(), key, offset, length);
>         if (compared < 1) {
>           // If the required key is less than or equal to current key, then
>           // don't do anything.
>           return compared;
>         } else {
>            ...
>            return loadBlockAndSeekToKey(this.block, this.nextIndexedKey,
>               false, key, offset, length, false);
> ...
> {code}
> loadBlockAndSeekToKey already does the right thing when we pass a key that 
> sorts before the current key. It's less efficient than this early check, but 
> in the vast majority of (all?) cases we pass forward keys (as required by the 
> reseek contract). We're optimizing the wrong thing.
> Without the check, scanning with the ExplicitColumnTracker is 20-30% faster.
> (I tested with rows of 5 short KVs, selecting the 2nd and/or 4th column.)
> I propose simply removing that check.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
