[ 
https://issues.apache.org/jira/browse/HBASE-28043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784193#comment-17784193
 ] 

Becker Ewing commented on HBASE-28043:
--------------------------------------

After a ton of fantastic feedback from [~bbeaudreault], the additional proof of 
concept is not only still as fast, but much more readable and easy to 
understand. I run the initial reverse scan benchmarks from the performance 
evaluation CLI tool to make sure that this patch is still equally as fast.

 

The process is similar to the past and uses the following methodology:

*Setup*
 # Start a local HBase cluster with "hbase master start --localRegionServers=1"
 # Prepare the table state for benchmarking with "hbase pe --nomapred=true 
--valueSize=30 --blockEncoding=PREFIX --compress=GZ randomWrite 1" (this gives 
us a table that has many rows in a block)
 # Flush and compact the table under test to ensure that tests are operating 
over an equivalent single storefile (& the memstore doesn't influence results):
{code:java}
$ hbase shell
> flush ‘TestTable’
> major_compact ‘TestTable’ {code}

*Benchmarking*
 # This is as simple as starting the local HBase cluster and running "hbase pe 
--nomapred=true reverseScan 1" for each test. The first test is typically a bit 
slower as blocks are decompressed and added to the block cache during this run. 
I ran 3 tests, 1 cold and 2 directly afterwards on the hot cache to try and 
show the cold and hot performance.

 

I got the following results:

 
||Benchmark||Revision||Time (s)||Throughput (rows/sec)||Throughput (MB/s)||
|reverse scan run #1|master|44.745|23,434|23.06|
|reverse scan run #2|master|41.284|25,399|25.00|
|reverse scan run #3|master|41.066|25,534|25.13|
|reverse scan run #1|patch|25.157|41,681|41.02|
|reverse scan run #2|patch|23.663|44,313|43.61|
|reverse scan run #3|patch|23.869|43,930|43.24|

 

So this patch still looks to bring a significant performance improvement even 
after all the refactoring. In fact, it looks like the refactoring made the 
existing reverse scan path and the newer reverse scan path more performant & 
readable. 

> Reduce seeks from beginning of block in StoreFileScanner.seekToPreviousRow
> --------------------------------------------------------------------------
>
>                 Key: HBASE-28043
>                 URL: https://issues.apache.org/jira/browse/HBASE-28043
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Becker Ewing
>            Assignee: Becker Ewing
>            Priority: Major
>         Attachments: Current_SeekToPreviousRowBehavior.png, 
> Proposed_SeekToPreviousRowBehavior.png
>
>
> Currently, for non-RIV1 DBE encodings, each call to 
> [StoreFileScanner.seekToPreviousRow|https://github.com/apache/hbase/blob/89ca7f4ade84c84a246281c71898543b6161c099/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java#L493-L506]
>  (a common operation in reverse scans) results in two seeks: 
>  # Seek from the beginning of the block to before the given row to find the 
> prior row
>  # Seek from the beginning of the block to the first cell of the prior row
> So if there are "N" rows in a block, a reverse scan through each row results 
> in seeking past 2(N-1)! rows.
>  
> This is a particularly expensive operation for tall tables that have many 
> rows in a block.
>  
> By introducing a state variable "previousRow" to StoreFileScanner, I believe 
> that we could modify the seeking algorithm to be:
>  # Seek from the beginning of the block to before the given row to find the 
> prior row
>  # Seek from the beginning of the block to before the row that is before the 
> row that was just seeked to (i.e. 2 rows back). _Save_ this as a hint for 
> where the prior row is in "previousRow"
>  # Reseek from "previousRow" (2 rows back from start) to 1 row back from 
> start (to the actual previousRow)
> Then the rest of the calls where a "previousRow" is present, you just need to 
> seek to the beginning of the block once instead of twice, i.e. 
>  # seek from the beginning of the block to right before the beginning of your 
> "previousRow" marker. Save this as the new "previousRow" marker
>  # Reseek to the next row (i.e. your previous "previousRow" marker)
>  
> If there are "N" rows in a block, a reverse scan from row N to row 0 results 
> in seeking past approximately (N-1)! rows i.e. 50% less than the current 
> behavior.
>  
> See the attached diagrams for the current and proposed behavior. 
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to