[ 
https://issues.apache.org/jira/browse/HBASE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Lawlor updated HBASE-11544:
------------------------------------
    Attachment: HBASE-11544-v1.patch

Hey folks,

I've been working on this issue and am attaching a patch of what I have so 
far. I have included some discussion points below and would appreciate 
feedback on them:

I ran into a few issues while implementing a solution for this problem. 
The issues, along with my current solutions, are outlined below (feedback on 
alternative ways to solve them is welcome):
        * In some cases, the concept of partial results doesn't seem 
appropriate. In these cases, I ensured that partial results are never 
created, since they would only hurt performance or cause confusion. The cases 
where I felt partial results should be avoided were:
                ** When the client has defined a filter for their scan that 
requires the entire row to be read. 
                ** When the client has specified that the scan is a Small scan. 
Small scans are designed to execute in a single RPC request, so having to make 
multiple RPC requests to form the complete Result seems inappropriate.
        * When I changed the default value of caching to Integer.MAX_VALUE, I 
ran into OOME on the server, since caching is used to presize the ArrayList 
that holds results. A simple solution is to not set an initial size on the 
ArrayList at all. However, this may still run into memory issues if the 
ArrayList must expand its underlying array many times (e.g. if the table being 
scanned has many small rows, leading to a large number of Results in the 
list). I was wondering what everyone thinks of the simple solution. If a more 
sophisticated solution is required, it may be best to move the caching change 
into its own JIRA.
        * When combining the partial results into a single complete result on 
the client side, an exception will be thrown from within ResultScanner#next() 
if the partial results are found to belong to different rows. This is a corner 
case that should never show up, since sequence numbers are already used in 
each RPC request to ensure proper ordering of requests/responses, but I 
figured it was worth mentioning.
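
To make the last point above concrete, here is a minimal sketch of the 
client-side check. It is not the code from the patch: PartialResult and 
PartialResultStitcher are hypothetical stand-ins (cells are plain strings), and 
the choice of IllegalStateException is an assumption, since the patch may throw 
a different exception type from ResultScanner#next().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a partial Result: a row key plus some of that row's cells. */
final class PartialResult {
    final String row;
    final List<String> cells;

    PartialResult(String row, List<String> cells) {
        this.row = row;
        this.cells = cells;
    }
}

/** Hypothetical helper that stitches partials back into one complete row. */
final class PartialResultStitcher {
    /**
     * Concatenates the cells of all partials, in order, into one list.
     * Throws if any partial belongs to a different row than the first one.
     */
    static List<String> combine(List<PartialResult> partials) {
        if (partials.isEmpty()) {
            throw new IllegalArgumentException("no partial results to combine");
        }
        String row = partials.get(0).row;
        List<String> cells = new ArrayList<>();
        for (PartialResult partial : partials) {
            if (!row.equals(partial.row)) {
                // Assumed exception type; the real client may report this differently.
                throw new IllegalStateException(
                    "partial results span rows: expected " + row + " but got " + partial.row);
            }
            cells.addAll(partial.cells);
        }
        return cells;
    }
}
```

In this sketch the order of the partials is trusted as given, which mirrors the 
reliance on RPC sequence numbers described above.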

The fine-grained details of the implementation can be seen in the patch, but I 
thought it would be worth highlighting how this new partial result workflow 
can be used to avoid OOME on the server:
        * The setting of Scan#setMaxResultSize will now operate at the cell 
level rather than the row level. This allows a client to retrieve, in 
fragments/partials, very large rows that would previously cause the server to 
OOME. By default, the complete result will only be formed on the client side, 
whereas the server will only ever see partial Results for very large rows.
        * A new option (Scan#setAllowPartials) has been added to Scans to allow 
the client to see the partial results returned by the server. This setting will 
be useful in cases where the client would OOME if it were forced to 
reconstruct the complete result.
        * If clients want to utilize this partial result workflow, they should 
use non-filtered, non-small scans (see issues above for reasoning).
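
As a rough illustration of what "operating at the cell level" means, the sketch 
below splits a single row into partials once the accumulated cell size passes 
the limit. It is a simplification under stated assumptions, not the server code 
from the patch: CellLevelSizeLimit and splitRow are hypothetical names, and 
each cell is represented only by its size in bytes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of enforcing a max result size per cell instead of per row. */
final class CellLevelSizeLimit {
    /**
     * Walks the cells of one row and closes the current partial as soon as the
     * accumulated size reaches maxResultSize. A cell is never split, so a partial
     * may exceed the limit by at most one cell.
     */
    static List<List<Integer>> splitRow(List<Integer> cellSizes, long maxResultSize) {
        List<List<Integer>> partials = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long accumulated = 0;
        for (int size : cellSizes) {
            current.add(size);
            accumulated += size;
            if (accumulated >= maxResultSize) {
                // Threshold passed: hand back what has been gathered so far.
                partials.add(current);
                current = new ArrayList<>();
                accumulated = 0;
            }
        }
        if (!current.isEmpty()) {
            partials.add(current); // final partial for the row
        }
        return partials;
    }
}
```

With a row-level check, the whole row would have to be accumulated before the 
limit was consulted; here the amount held at any one time is bounded by the 
limit plus one cell.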

Areas for future improvement:
        * As [~lhofhansl] has pointed out, RPC is inefficient and could be 
improved by prefetching results server side. This issue has been raised in 
HBASE-12994.
        * As called out in the issues above, the initial sizing of the 
ArrayList on the server side seems like it could be improved to avoid resizing 
the underlying array.
        * Streaming is the ideal workflow for RPC requests but will require a 
large rework.
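
On the ArrayList sizing point, one possible middle ground between presizing to 
caching and not presizing at all is to cap the initial capacity. The sketch 
below is only an idea for discussion, not what the patch does: ResultListSizing 
is a hypothetical helper and the cap of 1024 is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that bounds the presized capacity of the results list. */
final class ResultListSizing {
    // Arbitrary cap chosen for illustration only.
    static final int MAX_INITIAL_CAPACITY = 1024;

    /** Returns caching clamped to the range [0, MAX_INITIAL_CAPACITY]. */
    static int initialCapacity(int caching) {
        return Math.max(0, Math.min(caching, MAX_INITIAL_CAPACITY));
    }

    /** Creates the results list without ever allocating caching-many slots up front. */
    static <T> List<T> newResultList(int caching) {
        return new ArrayList<>(initialCapacity(caching));
    }
}
```

With caching set to Integer.MAX_VALUE, the list starts with room for 1024 
entries and grows from there, so small caching values still avoid resizing 
while huge ones no longer trigger a giant allocation.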

Any feedback on the patch would be greatly appreciated. I am expecting the QA 
run to come back with some test failures which I will address in a subsequent 
patch. I'm pinging [~lhofhansl] and [~stack] as we were discussing this 
solution above, but if anyone else has any feedback it would be appreciated as 
well!

Thanks

> [Ergonomics] hbase.client.scanner.caching is dogged and will try to return 
> batch even if it means OOME
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-11544
>                 URL: https://issues.apache.org/jira/browse/HBASE-11544
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Jonathan Lawlor
>            Priority: Critical
>              Labels: beginner
>         Attachments: HBASE-11544-v1.patch
>
>
> Running some tests, I set hbase.client.scanner.caching=1000.  Dataset has 
> large cells.  I kept OOME'ing.
> Serverside, we should measure how much we've accumulated and return to the 
> client whatever we've gathered once we pass out a certain size threshold 
> rather than keep accumulating till we OOME.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
