[ 
https://issues.apache.org/jira/browse/HBASE-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338032#comment-14338032
 ] 

Lars Hofhansl edited comment on HBASE-13099 at 2/26/15 7:06 AM:
----------------------------------------------------------------

That's what small scans do (in a nutshell), when they are not small :)

That does mean that at every 1 MB chunk we need to reseek all 
{region|store|storeFile}Scanners. I.e. the server-side state is what lets us 
avoid the expensive seek on each RPC. Maybe with 1 MB chunks it does not 
matter (but you can pull 1 MB over 1 GbE in < 10 ms, which is less than the 
seek time of an HDD).
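
A back-of-the-envelope check of the "< 10 ms" figure, as a rough sketch only. 
It assumes 1 MB means 2^20 bytes and that the link sustains a full 1 Gbit/s 
with no protocol overhead; the function name is made up for illustration.

```python
def transfer_ms(payload_bytes: int, link_bits_per_s: float) -> float:
    """Time in milliseconds to push payload_bytes over an ideal link."""
    return payload_bytes * 8 / link_bits_per_s * 1000


CHUNK_BYTES = 1024 * 1024        # assumption: 1 MB chunk = 2**20 bytes
GIGABIT_PER_S = 1_000_000_000    # assumption: ideal 1 GbE, no overhead

chunk_ms = transfer_ms(CHUNK_BYTES, GIGABIT_PER_S)
print(f"{chunk_ms:.2f} ms per 1 MB chunk")  # 8.39 ms per 1 MB chunk
```

So if every chunk pays for even one disk seek, the reseek can cost about as 
much as shipping the chunk itself.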

Some of the chunking logic we get with HBASE-12976.




> Scans as in DynamoDB
> --------------------
>
>                 Key: HBASE-13099
>                 URL: https://issues.apache.org/jira/browse/HBASE-13099
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: Client, regionserver
>            Reporter: Nicolas Liochon
>
> cc: [~saint....@gmail.com] - as discussed offline.
> DynamoDB has a very simple way to manage scans server side:
> ??citation??
> The data returned from a Query or Scan operation is limited to 1 MB; this 
> means that if you scan a table that has more than 1 MB of data, you'll need 
> to perform another Scan operation to continue to the next 1 MB of data in the 
> table.
> If you query or scan for specific attributes that match values that amount to 
> more than 1 MB of data, you'll need to perform another Query or Scan request 
> for the next 1 MB of data. To do this, take the LastEvaluatedKey value from 
> the previous request, and use that value as the ExclusiveStartKey in the next 
> request. This will let you progressively query or scan for new data in 1 MB 
> increments.
> When the entire result set from a Query or Scan has been processed, the 
> LastEvaluatedKey is null. This indicates that the result set is complete 
> (i.e. the operation processed the “last page” of data).
> ??citation??
> This means that there is no state server side: the work is done client side.
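
The paging protocol quoted above could be sketched roughly as follows. This 
is a toy in-memory simulation, not DynamoDB or HBase client code: scan_page, 
scan_all and the row layout are hypothetical names chosen for illustration, 
and only the LastEvaluatedKey / ExclusiveStartKey idea comes from the quoted 
text. Note that the "server" has to seek to the start key again on every 
call, since it keeps nothing between requests.

```python
from bisect import bisect_right

LIMIT_BYTES = 1024 * 1024  # the 1 MB page limit from the quoted text


def scan_page(rows, exclusive_start_key=None, limit_bytes=LIMIT_BYTES):
    """Hypothetical stateless server call over rows sorted by key.

    Returns (page, last_evaluated_key); the key is None on the last page.
    """
    keys = [key for key, _ in rows]
    # No state is kept between calls, so every call has to seek again.
    start = 0 if exclusive_start_key is None else bisect_right(keys, exclusive_start_key)
    page, size = [], 0
    for key, value in rows[start:]:
        cost = len(key) + len(value)
        if page and size + cost > limit_bytes:
            return page, page[-1][0]  # more data remains after this key
        page.append((key, value))
        size += cost
    return page, None  # result set is complete


def scan_all(rows, limit_bytes=LIMIT_BYTES):
    """Client-side loop: feed each LastEvaluatedKey back as the start key."""
    results, start_key, calls = [], None, 0
    while True:
        page, last_key = scan_page(rows, start_key, limit_bytes)
        calls += 1
        results.extend(page)
        if last_key is None:
            return results, calls
        start_key = last_key


# 50 rows of 107 bytes each, with a small 1000-byte limit to force paging.
demo_rows = [(b"row%04d" % i, b"x" * 100) for i in range(50)]
demo_results, demo_calls = scan_all(demo_rows, limit_bytes=1000)
```

All the continuation state lives in the key the client sends back, which is 
why the server can forget the scan between requests.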


