[ https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14368251#comment-14368251 ]

Josh Elser commented on HBASE-13262:
------------------------------------

Ok, it's been a while since I posted some progress. Here's my current understanding of things and, hopefully, an easier-to-grok statement of the problem:

When clients request a batch of rows larger than the server is configured to return (often, when the client does not explicitly set a limit on the results to be returned from the server), the client will incorrectly treat this as meaning that all data in the current region has been exhausted. This goes back to what [~jonathan.lawlor] pointed out about clients and servers needing to stay in sync WRT the size of a batch of {{Result}}s. The client ultimately requests that the server return a batch of size 'hbase.client.scanner.max.result.size' and then believes that the server returned less data than that limit.
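
To make that concrete, here's a rough sketch of the shape of the heuristic that goes wrong. This is illustrative only, not the actual {{ClientScanner}} code; the method and variable names are placeholders:

{code:java}
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.client.Result;

public class ScanHeuristicSketch {
  /**
   * Illustrative only: the *shape* of the size/length heuristic described
   * above, not the actual ClientScanner implementation.
   */
  static boolean looksLikeRegionExhausted(Result[] values, int caching, long maxResultSize) {
    long bytesReturned = 0;
    for (Result r : values) {
      for (Cell c : r.rawCells()) {
        bytesReturned += CellUtil.estimatedHeapSizeOf(c);
      }
    }
    // The client asked for up to maxResultSize bytes. If the server flushed
    // the batch early because it hit its own size limit, the client's estimate
    // comes in under that limit and this check wrongly concludes the region is
    // exhausted, silently skipping the rows still left in it.
    return values.length < caching && bytesReturned < maxResultSize;
  }
}
{code}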

A client-side workaround to the problem is to reduce the number of rows requested on the {{Scan}} via {{Scan#setCaching(int)}}. Setting this value sufficiently low (for my test code, anything less than 1000 seems to do the trick) will cause the server to flush the results back to the client before the server gets close to the size limit that would cause the client to do the wrong thing.
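
For reference, the workaround looks like this; the table name and caching value are just from my test setup, nothing canonical:

{code:java}
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class LowCachingScan {
  public static void main(String[] args) throws IOException {
    // Workaround sketch: cap the rows fetched per RPC so the server flushes
    // results back well before it nears its size limit.
    try (Connection conn = ConnectionFactory.createConnection();
         Table table = conn.getTable(TableName.valueOf("testtable"))) {
      Scan scan = new Scan();
      scan.setCaching(500);  // anything under ~1000 did the trick for my data
      try (ResultScanner scanner = table.getScanner(scan)) {
        long cells = 0;
        for (Result result : scanner) {
          cells += result.rawCells().length;
        }
        System.out.println("cells seen: " + cells);
      }
    }
  }
}
{code}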

I still don't completely understand what is causing the difference on the server side in the first place (over 0.98); I need to dig more there. I'm not sure if I'm just missing a place where {{CellUtil#estimatedHeapSizeOf(Cell)}} isn't being used, or if some size is bubbling up through the {{NextState}} via the {{KeyValueHeap}} (and thus MemStores or StoreFiles), or if it's something entirely different.

Ultimately, the underlying problem is likely best addressed from the stance 
that a scanner shouldn't be performing
special logic based on the size of the batch of data returned from a server. In 
other words, the
client should not be making logic decisions based solely on the size or length 
of the {{Result[]}} it receives.

The server already maintains a nice enum of the reason it returns a batch of results to a client via {{NextState$State}}. The server has the answer to our question when it returns a batch: is this batch being returned due to a limitation on the size of the batch (either length or bytes)?
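
Conceptually, the states look something like this; the names below are made up for illustration, the real constants live in {{NextState$State}}:

{code:java}
// Illustrative sketch of the kinds of reasons the server tracks when it cuts a
// batch of results. Names are made up; see NextState$State for the real ones.
enum BatchReturnReason {
  MORE_VALUES,          // the region still has rows; batch ended for another reason
  NO_MORE_VALUES,       // the region is genuinely exhausted
  SIZE_LIMIT_REACHED,   // the byte-size limit for this batch was hit
  BATCH_LIMIT_REACHED   // the row/cell-count limit for this batch was hit
}
{code}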

I'm currently of the opinion that it's ideal to pass this information back to the client via the {{ScanResult}}. Ignoring wire-version issues for the moment, this means that clients would rely on this new enum to determine when there is more data to read from a Region and when a Region is exhausted (instead of the size and length checks of the {{Result[]}}).
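
Roughly, the client loop would then look something like the sketch below. The {{RegionScanRpc}} interface and {{moreResultsInRegion()}} are made-up names, not actual API; this is just the shape of the check that would replace the size/length guessing:

{code:java}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.client.Result;

// Hypothetical sketch of the client loop once the server's reason for cutting
// a batch is carried back to the client.
interface RegionScanRpc {
  Result[] nextBatch();             // one scan RPC to the current region
  boolean moreResultsInRegion();    // the server's explicit answer
}

class RegionDrainSketch {
  static List<Result> drainRegion(RegionScanRpc rpc) {
    List<Result> cache = new ArrayList<>();
    boolean regionExhausted = false;
    while (!regionExhausted) {
      for (Result r : rpc.nextBatch()) {
        cache.add(r);
      }
      // Trust the server's explicit answer instead of guessing from the size
      // or length of the Result[].
      regionExhausted = !rpc.moreResultsInRegion();
    }
    return cache;
  }
}
{code}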

This approach wouldn't break 0.98 clients against 1.x; however, it also 
wouldn't address the underlying problem of the
client guessing at what to do based on the characteristics of the {{Result[]}} 
when it is unaware of the existence of
this new field in the protobuf. Given my understanding of the problem, 0.98 
clients running against 1.x *could* see
this problem, although I have not tested that to confirm it happens.

Obviously, I need to do some more digging into where the mismatch in size is coming from (unless I missed it from Jonathan earlier on) before I get a patch up. Thoughts/comments welcome in the meantime.

> ResultScanner doesn't return all rows in Scan
> ---------------------------------------------
>
>                 Key: HBASE-13262
>                 URL: https://issues.apache.org/jira/browse/HBASE-13262
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 2.0.0, 1.1.0
>         Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: testrun_0.98.txt, testrun_branch1.0.txt
>
>
> Tried to write a simple Java client against 1.1.0-SNAPSHOT.
> * Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]), 
> for a total of 10M cells written
> * Read back the data from the table, ensure I saw 10M cells
> Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of 
> the actual rows. Running against 1.0.0 returns all 10M records as expected.
> [Code I was 
> running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java]
>  for the curious.


