[jira] [Commented] (HBASE-13262) ResultScanner doesn't return all rows in Scan

Josh Elser (JIRA) Tue, 24 Mar 2015 14:06:06 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378610#comment-14378610
 ]


Josh Elser commented on HBASE-13262:
------------------------------------

Just attached some v3 patches which will hopefully fix all of the tests and 
checkstyle stuff.

Thanks for the review, Jonathan.

bq. Not sure what this line means, do we need to check the more results flag 
here?

Yeah, I only changed ClientScanner to start using 
{{ScanResponse#getMoreResults()}}. I assume that the reverse (small) scanner 
should also start using it in the long run. I left them as-is for the time just 
to not get more sidetracked with them and focus on getting ClientScanner 
correct.

{quote}
I like the idea of using moreResults flag but I believe we need to actually 
introduce a new flag into the ScanResponse. Unfortunately, the name moreResults 
is a little misleading as it seems perfect for what we are trying to achieve. 
Looking into RSRpcServices to see when this moreResults flag is set to false, 
it looks like this happens only when scanner.isFilterDone() is true. Looking 
closer, RegionScannerImpl#isFilterDone is only true when the RegionScanner 
wants to indicate that the entire scan should stop (i.e. the client shouldn't 
even try to change regions, the whole scan is done).

So to be clear, it looks as though the moreResults flag is false ONLY when the 
entire scan needs to stop, NOT when a region is exhausted. The net effect is 
that moreResults will always appear to be true client side, even when the 
region is exhausted. Thus, I think we will still end up making that extra RPC 
that Lars mentioned above in order to see that the Result[] is empty and thus 
the region is exhausted, before the region change occurs.
{quote}

Hrm, it appears you are correct once again. I had thought I had some changes in 
RSRpcServices for this already, but perhaps I lost them in the shuffle. Also a 
sign that my tests aren't sufficient, I suppose. Is there a reason you think 
that we need to introduce a new attribute on the message though? As long as the 
server fills in that attribute correctly in all cases, it would work w/o 
changing the message structure, no? That would hopefully make moreResults a bit 
less obtuse for client-use (not to mention the improvements already present WRT 
null results carrying some special logic in ClientScanner and ScannerCallable).

{quote}
bq. // Server didn't respond whether it has more results or not.

Is it possible here that we may inadvertently interpret the missing flag as 
meaning the region is exhausted? Probably fine because the limit logic is still 
in the ClientScanner while condition, just wondering.
{quote}

Yeah, that was a drawback to dealing with it like I did. I was trying to think 
through the case of what happens when talking to an older server (or one that 
just doesn't give the flag). I think the best we can do is fall back to the old 
logic and pray we don't run into the problem? I'd need to do more digging 
through older versions to be 100% certain though.

> ResultScanner doesn't return all rows in Scan
> ---------------------------------------------
>
>                 Key: HBASE-13262
>                 URL: https://issues.apache.org/jira/browse/HBASE-13262
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 2.0.0, 1.1.0
>         Environment: Single node, pseduo-distributed 1.1.0-SNAPSHOT
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 2.0.0, 1.1.0, 0.98.13
>
>         Attachments: 13262-0.98-testpatch.txt, HBASE-13262-branch-1-v2.patch, 
> HBASE-13262-branch-1-v3.patch, HBASE-13262-branch-1.patch, 
> HBASE-13262-v1.patch, HBASE-13262-v2.patch, HBASE-13262-v3.patch, 
> HBASE-13262.patch, regionserver-logging.diff, testrun_0.98.txt, 
> testrun_branch1.0.txt
>
>
> Tried to write a simple Java client again 1.1.0-SNAPSHOT.
> * Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]), 
> for a total of 10M cells written
> * Read back the data from the table, ensure I saw 10M cells
> Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of 
> the actual rows. Running against 1.0.0, returns all 10M records as expected.
> [Code I was 
> running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java]
>  for the curious.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-13262) ResultScanner doesn't return all rows in Scan

Reply via email to