[jira] [Commented] (HBASE-13721) Improve shell scan performances when using LIMIT

Jean-Marc Spaggiari (JIRA) Wed, 20 May 2015 07:42:31 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-13721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552412#comment-14552412
 ]


Jean-Marc Spaggiari commented on HBASE-13721:
---------------------------------------------

So here is what I did.

1) Moved the break condition ahead of the next call to hasNext, to avoid 
long delays when hasNext does not return until the end of the table is reached.
2) Used the LIMIT scan parameter to cap the scanner caching size, so the call 
returns faster.

The same request now returns in a few milliseconds instead of 12 seconds.
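
To illustrate the first change, here is a minimal, self-contained sketch of the
two loop orders. It is not the code from the attached patch: FakeScanner,
has_next?, next_row, scan_check_after and scan_check_before are hypothetical
names made up for this example, and the sketch assumes a limit of at least 1.
It only counts how often the "is there another row?" call is made, since that
is the call that can block while the rest of the table is scanned.

```ruby
# Hypothetical stand-in for a scanner iterator; this is NOT the HBase client API.
class FakeScanner
  attr_reader :has_next_calls

  def initialize(rows)
    @rows = rows
    @pos = 0
    @has_next_calls = 0
  end

  # In the real shell this is the call that may take a long time, because the
  # server may scan the remainder of the table looking for another match.
  def has_next?
    @has_next_calls += 1
    @pos < @rows.length
  end

  def next_row
    row = @rows[@pos]
    @pos += 1
    row
  end
end

# Old order: the limit is tested only after has_next? has been called again.
def scan_check_after(scanner, limit)
  results = []
  while scanner.has_next?
    break if results.length >= limit
    results << scanner.next_row
  end
  results
end

# New order: stop as soon as the limit is reached, before asking for more rows.
def scan_check_before(scanner, limit)
  results = []
  while scanner.has_next?
    results << scanner.next_row
    break if results.length >= limit
  end
  results
end

old_scanner = FakeScanner.new(%w[row1 row2 row3])
new_scanner = FakeScanner.new(%w[row1 row2 row3])
old_rows = scan_check_after(old_scanner, 1)
new_rows = scan_check_before(new_scanner, 1)

puts "old: #{old_rows.inspect}, has_next? calls: #{old_scanner.has_next_calls}"
puts "new: #{new_rows.inspect}, has_next? calls: #{new_scanner.has_next_calls}"
```

Both loops return the same single row with LIMIT 1, but the old order makes one
extra has_next? call after the limit is already satisfied. In this toy model
that call is cheap; against a real table it is the call that can run to the end
of the table.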

> Improve shell scan performances when using LIMIT
> ------------------------------------------------
>
>                 Key: HBASE-13721
>                 URL: https://issues.apache.org/jira/browse/HBASE-13721
>             Project: HBase
>          Issue Type: Bug
>          Components: shell
>    Affects Versions: 1.1.0
>            Reporter: Jean-Marc Spaggiari
>            Assignee: Jean-Marc Spaggiari
>         Attachments: HBASE-13721-v0-trunk.txt
>
>
> When a scan is expected to return exactly as many rows as the LIMIT we 
> give, we still scan the entire table before returning the row(s), and only 
> then test the number of rows we have. This can take a lot of time.
> Example:
> scan 'sensors', { COLUMNS => ['v:f92acb5b-079a-42bc-913a-657f270a3dc1'], 
> STARTROW => '000a', LIMIT => 1 }
> This is because we break on the limit condition AFTER we ask for the 
> next row. If there is none, we scan the entire table and then exit.
> The goal of this patch is to handle this specific case without impacting 
> the others.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
