[ https://issues.apache.org/jira/browse/HBASE-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490940#comment-13490940 ]
Matt Corgan commented on HBASE-6874:
------------------------------------

Have you guys considered the possibility of fetching multiple blocks in a single call to HDFS? If the compressed block size is 10KB, then maybe large scans should be requesting 100+ blocks (1MB) at a time, given that rotational drives can read several MB in the time it takes them to do one seek. The prefetch thread could chop the 1MB result into the individual blocks before putting them into the block cache.

> Implement prefetching for scanners
> ----------------------------------
>
> Key: HBASE-6874
> URL: https://issues.apache.org/jira/browse/HBASE-6874
> Project: HBase
> Issue Type: Sub-task
> Reporter: Karthik Ranganathan
> Assignee: Karthik Ranganathan
>
> I did some quick experiments by scanning data that should be completely in
> memory and found that adding pre-fetching increases the throughput by about
> 50% from 26MB/s to 39MB/s.
> The idea is to perform the next in a background thread, and keep the result
> ready. When the scanner's next comes in, return the pre-computed result and
> issue another background read.
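
To make the two ideas in this thread concrete, here is a rough Java sketch: chopping one large read into individual blocks, and keeping the next scanner result ready in a background thread. It is not HBase or HDFS code. The names PrefetchSketch, chopIntoBlocks and PrefetchingScanner are hypothetical, and the fixed block size is a simplifying assumption, since real compressed blocks vary in size and their boundaries would have to come from block headers or the block index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.Supplier;

// Hypothetical illustration only; none of these names are HBase or HDFS APIs.
final class PrefetchSketch {

    // Splits the result of one large read into individual blocks.
    // Assumption: every block has the same size, except possibly the last one.
    static List<byte[]> chopIntoBlocks(byte[] bulk, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<byte[]> blocks = new ArrayList<>();
        for (int off = 0; off < bulk.length; off += blockSize) {
            int end = Math.min(off + blockSize, bulk.length);
            blocks.add(Arrays.copyOfRange(bulk, off, end));
        }
        return blocks;
    }

    // Wraps a result source so that the next result is computed in the background.
    static final class PrefetchingScanner<T> {
        private final Supplier<T> source;
        private final ExecutorService executor;
        private CompletableFuture<T> pending;

        PrefetchingScanner(Supplier<T> source, ExecutorService executor) {
            this.source = source;
            this.executor = executor;
            // Start the first read before anyone asks for it.
            this.pending = CompletableFuture.supplyAsync(source, executor);
        }

        // Returns the pre-computed result (waiting if it is not ready yet)
        // and immediately issues another background read.
        synchronized T next() {
            T result = pending.join();
            pending = CompletableFuture.supplyAsync(source, executor);
            return result;
        }
    }
}
```

With the numbers from the comment, a 1MB read chopped at 10KB per block gives roughly 100 blocks to hand to the block cache. Note that this scanner always reads one result ahead of the caller, so the last background read may be wasted when the scan stops early.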