[jira] [Comment Edited] (HBASE-13262) ResultScanner doesn't return all rows in Scan

Andrew Purtell (JIRA) Fri, 20 Mar 2015 11:30:06 -0700

    [ https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371763#comment-14371763 ]


Andrew Purtell edited comment on HBASE-13262 at 3/20/15 6:28 PM:
-----------------------------------------------------------------

An extra RPC will be quite bad for performance, especially for clients like 
Phoenix, which does a lot of short scans using guideposts. We should backport 
the NextState (or similar) changes all the way back to 0.98 as I proposed 
above. I will do this if nobody else wants to do the work. I don't see any 
reason the changes cannot be both backwards and forwards compatible.

bq. Maybe we can add a flag indicating that the client "knows" about the flag 
in the openScanner() request. If the client knows about the flag, server sends 
the flag, if not, the client does an extra RPC.

This shouldn't be necessary. The server can just send NextState. If the client 
knows about it, it can do the right thing and avoid the unnecessary RPC. 
Otherwise we can fall back to the safest possible alternative, such as not 
estimating size on the client at all, at the cost of an extra RPC.
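
To illustrate the fallback described above, here is a minimal sketch in Java. 
It is not the actual HBase protocol or client code: the class 
NextStateSketch, the ScanResponse type, the enum values MORE_VALUES and 
NO_MORE_VALUES, and the needsAnotherRpc() helper are all hypothetical names 
chosen for the example. The assumption is that a client which understands 
the server's hint can trust it, while any other client only treats an empty 
batch as proof that the region is exhausted.

```java
import java.util.Optional;

// Hypothetical model of the compatibility scheme; not HBase's real classes.
class NextStateSketch {

    /** Assumed hint values the server could attach to a scan response. */
    enum NextState { MORE_VALUES, NO_MORE_VALUES }

    /** Assumed response shape: the hint is empty when the server predates it. */
    static final class ScanResponse {
        final int rowCount;
        final Optional<NextState> nextState;

        ScanResponse(int rowCount, Optional<NextState> nextState) {
            this.rowCount = rowCount;
            this.nextState = nextState;
        }
    }

    /**
     * Decides whether the client must call next() on the same region again.
     * A client that understands the hint trusts it. Any other combination
     * falls back to the safe rule: only an empty batch ends the region,
     * which costs one extra RPC per region.
     */
    static boolean needsAnotherRpc(ScanResponse response, boolean clientUnderstandsHint) {
        if (clientUnderstandsHint && response.nextState.isPresent()) {
            return response.nextState.get() == NextState.MORE_VALUES;
        }
        // Safe fallback: no client-side guessing about the region's state.
        return response.rowCount > 0;
    }

    public static void main(String[] args) {
        ScanResponse lastBatch =
            new ScanResponse(100, Optional.of(NextState.NO_MORE_VALUES));
        System.out.println("new client, extra RPC needed: "
            + needsAnotherRpc(lastBatch, true));
        System.out.println("old client, extra RPC needed: "
            + needsAnotherRpc(lastBatch, false));
    }
}
```

In this sketch the server never needs to know which kind of client it is 
talking to, which is the point of the argument against a capability flag in 
openScanner().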


was (Author: apurtell):
An extra RPC will be quite bad for performance, especially for clients like 
Phoenix, which does a lot of short scans using guideposts. We should backport 
the NextState (or similar) changes all the way back to 0.98 as I proposed 
above. I will do this if nobody else wants to do the work. I don't see any 
reason the changes cannot be both backwards and forwards compatible.

bq. Maybe we can add a flag indicating that the client "knows" about the flag 
in the openScanner() request. If the client knows about the flag, server sends 
the flag, if not, the client does an extra RPC.

This shouldn't be necessary. The server can just send NextState. If the client 
knows about it, it can do the right thing. Otherwise we can fall back to the 
safest possible alternative, such as not estimating size on the client at all.

> ResultScanner doesn't return all rows in Scan
> ---------------------------------------------
>
>                 Key: HBASE-13262
>                 URL: https://issues.apache.org/jira/browse/HBASE-13262
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 2.0.0, 1.1.0
>         Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 2.0.0, 1.1.0, 0.98.13
>
>         Attachments: 13262-0.98-testpatch.txt, regionserver-logging.diff, 
> testrun_0.98.txt, testrun_branch1.0.txt
>
>
> Tried to write a simple Java client against 1.1.0-SNAPSHOT.
> * Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]), 
> for a total of 10M cells written
> * Read back the data from the table, ensure I saw 10M cells
> Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of 
> the actual rows. Running against 1.0.0 returns all 10M records as expected.
> [Code I was 
> running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java]
>  for the curious.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
