[ 
https://issues.apache.org/jira/browse/HBASE-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488463#comment-13488463
 ] 

Karthik Ranganathan commented on HBASE-6874:
--------------------------------------------

Thought about the N scanners; it's a complicated change, since you would have
to change the entire scan protocol. The next calls in a scan are not numbered,
so you could get out of sync when prefetching N (and throw in exceptions).
There is also the basic issue right now that scans do retries, which is wrong.
Also, reasoning about it another way: if your in-memory scan throughput is
higher than the disk read throughput, you're probably good. I found that there
are other, unrelated bottlenecks preventing this from being the case. Of
course, if the filtering is very heavy then this will break down... you
probably want to implement prefetching based on the number of filtered rows,
which should not be too hard.
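
To illustrate the point about unnumbered next calls, here is a minimal sketch
of what numbering could look like. It is not the HBase scan protocol and not
the patch mentioned in this comment: NumberedScanSession, its callSeq
parameter and the replay-on-retry behaviour are all hypothetical names and
choices made up for this example. The session accepts a call only if it
carries the expected sequence number, replays the cached batch when the
previous call is retried, and rejects anything else as out of order.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical server-side scan session that numbers each next call. */
class NumberedScanSession<T> {
    private final Iterator<List<T>> batches; // stand-in for the real region scanner
    private long expectedSeq = 0;            // sequence number of the next new call
    private List<T> lastBatch = null;        // cached so a retry can be replayed

    NumberedScanSession(Iterator<List<T>> batches) {
        this.batches = batches;
    }

    synchronized List<T> next(long callSeq) {
        if (callSeq == expectedSeq) {
            // Fresh call: advance the scanner and remember what was returned.
            lastBatch = batches.hasNext() ? batches.next() : Collections.<T>emptyList();
            expectedSeq++;
            return lastBatch;
        }
        if (callSeq == expectedSeq - 1 && lastBatch != null) {
            // Retry of the most recent call: replay without advancing the scanner.
            return lastBatch;
        }
        // Anything else means client and server disagree on the scan position.
        throw new IllegalStateException(
            "out-of-order next call: got " + callSeq + ", expected " + expectedSeq);
    }
}
```

With numbering like this, a retried call cannot silently skip a batch, which
is the failure mode that makes prefetching N batches risky on an unnumbered
protocol.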

I have a patch I have tested with, but it's waiting on HBASE-6770, which is 
going to refactor scans quite a bit. Will put a patch out once that is done.
                
> Implement prefetching for scanners
> ----------------------------------
>
>                 Key: HBASE-6874
>                 URL: https://issues.apache.org/jira/browse/HBASE-6874
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Karthik Ranganathan
>            Assignee: Karthik Ranganathan
>
> I did some quick experiments by scanning data that should be completely in 
> memory and found that adding prefetching increases the throughput by about 
> 50%, from 26 MB/s to 39 MB/s.
> The idea is to perform the next call in a background thread and keep the 
> result ready. When the scanner's next call comes in, return the precomputed 
> result and issue another background read.
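
The background-read idea in the issue description can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the patch
for this issue and not HBase code: BatchSource and PrefetchingScanner are
hypothetical names, a batch is modelled as a plain List, and an empty list is
assumed to mean the scan is exhausted. The wrapper keeps exactly one batch in
flight on a single background thread.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a scanner: returns the next batch, empty when done. */
interface BatchSource<T> {
    List<T> nextBatch() throws Exception;
}

/** Keeps one batch prefetched in a background thread (sketch only). */
class PrefetchingScanner<T> implements AutoCloseable {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Callable<List<T>> fetch;
    private Future<List<T>> pending;

    PrefetchingScanner(final BatchSource<T> source) {
        this.fetch = new Callable<List<T>>() {
            @Override
            public List<T> call() throws Exception {
                return source.nextBatch();
            }
        };
        // Start reading the first batch before the caller asks for it.
        this.pending = executor.submit(fetch);
    }

    /** Returns the precomputed batch and issues another background read. */
    List<T> next() throws Exception {
        List<T> ready;
        try {
            ready = pending.get(); // blocks only if the prefetch has not finished
        } catch (ExecutionException e) {
            // Surface the failure of the background read to the caller.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        }
        if (!ready.isEmpty()) {
            pending = executor.submit(fetch);
        }
        // When the batch is empty, 'pending' stays on the completed future,
        // so later calls keep returning the empty batch without new reads.
        return ready;
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

A single-threaded executor keeps reads against the underlying source strictly
sequential, so the wrapper never has more than one outstanding read. Note that
an exception from the background read only surfaces on the following next()
call, which is one of the complications raised in the comment above.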

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
