[ 
https://issues.apache.org/jira/browse/DRILL-5808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178361#comment-16178361
 ] 

ASF GitHub Bot commented on DRILL-5808:
---------------------------------------

GitHub user paul-rogers opened a pull request:

    https://github.com/apache/drill/pull/958

    DRILL-5808: Reduce memory allocator strictness for "managed" operators

    The "managed" external sort and the hash agg operators now actively attempt 
to stay within a memory "budget." 
    
    Our goals are to:
    
    1. Stay within the budget, and
    2. Make full use of the available memory.
    
    Unfortunately, at present, Drill has a number of limitations that work at 
cross-purposes to the above goal.
    
    * Upstream operators create record batches potentially larger than the 
memory budget.
    * Memory allocations are "lumpy" - power of two rounded.
    * Vectors double in size automatically when needed.
    
    The combination of the above means that memory planning must be aware of 
the size of each and every vector to the byte level in order to predict size 
doubling and power-of-two rounding.
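    To make the "lumpiness" concrete, here is a standalone sketch of power-of-two rounding as described above (illustrative only, not Drill's allocator code):

```java
// Standalone sketch of power-of-two rounding, as described above.
// This is illustrative, not Drill's actual allocator code.
public class PowerOfTwoRounding {

    // Round a requested size up to the next power of two.
    static long roundToPowerOfTwo(long requested) {
        long floor = Long.highestOneBit(requested);
        return floor == requested ? floor : floor << 1;
    }

    public static void main(String[] args) {
        // Requesting one byte past 1 MiB consumes a full 2 MiB slab.
        System.out.println(roundToPowerOfTwo(1_048_577L));  // prints 2097152
    }
}
```

    Vector doubling compounds this: appending one record to a full vector doubles its allocation, and that doubled size is itself rounded.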
    
    But, of course, Drill is schema-on-read, meaning that Drill cannot know 
ahead of time the "shape" of the data it will process. Without that 
information, memory estimates are, at best, averages, but actual allocations 
have a wide variance around those averages.
    
    Add to this Drill's memory allocation scheme: each operator is given a 
strict budget enforced by the memory allocator. Go above the budget by a single 
byte and the query dies.
    
    How do we resolve this conflict? On the one hand, Drill's internals are 
rough-and-ready; it is impossible to predict actual memory usage. On the other 
hand, the allocator requires perfect prediction else the user suffers with 
failed queries.
    
    Much work is needed in Drill's internals to provide better memory 
management. (Relational databases solved these issues long ago, so solutions 
are available.) Until then, this commit introduces a work-around.
    
    Memory-managed operators can ask for "leniency" from the allocator. In this 
mode, the allocator:
    
    * Allows actual memory use to spike up to 100% above the configured limit, 
or 100 MB over it, whichever is less,
    * Logs each such "excess allocation" as a warning, so we can identify and 
fix issues, and
    * Allows leniency only in production environments, but not during 
development or test.
    
    That is, we give users a margin for error so that their queries succeed 
even if Drill's memory calculations don't come out exactly right.
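    As a rough sketch of the lenient check described in the bullets above (class and method names here are hypothetical, not the actual code in this PR):

```java
// Illustrative sketch of lenient limit checking; names are hypothetical,
// not the classes introduced by this PR.
public class LenientLimitCheck {

    // Hard cap on overage: 100 MB.
    static final long GRACE_CAP = 100L * 1024 * 1024;

    // Headroom above the configured limit: 100% of the limit or 100 MB,
    // whichever is less.
    static long grace(long configuredLimit) {
        return Math.min(configuredLimit, GRACE_CAP);
    }

    // Returns true if the allocation may proceed. Within the configured
    // limit it always succeeds; in lenient mode the grace region also
    // succeeds, but logs a warning so excess allocations can be found and
    // fixed; past the hard limit the allocation fails as before.
    static boolean allowAllocation(long allocated, long request,
                                   long limit, boolean lenient) {
        long total = allocated + request;
        if (total <= limit) {
            return true;
        }
        if (lenient && total <= limit + grace(limit)) {
            System.err.println("WARN: allocation exceeds configured limit of " + limit);
            return true;
        }
        return false;  // in Drill, a query-killing allocation failure
    }

    public static void main(String[] args) {
        long limit = 30L * 1024 * 1024;  // typical 30 MB operator budget
        System.out.println(allowAllocation(limit, 1, limit, false));  // false: strict mode fails
        System.out.println(allowAllocation(limit, 1, limit, true));   // true: lenient mode warns
    }
}
```

    With a 30 MB budget the grace region extends to 60 MB; with a 1 GB budget it extends only to 1 GB + 100 MB.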
    
    This should be fine because, of course, Drill still has several operators 
that observe no memory limits at all. It seems silly to have one operator using 
GBs of memory while enforcing a typical 30 MB limit on others.
    
    Until all operators are memory managed, and Drill provides better memory 
management tools, this PR allows queries to succeed even if we get things 
slightly wrong internally.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/paul-rogers/drill DRILL-5808

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/958.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #958
    
----
commit a9c5083b8743efa2b5c74fee77e12d8f69258601
Author: Paul Rogers <prog...@maprtech.com>
Date:   2017-09-24T19:51:43Z

    DRILL-5808: Reduce memory allocator strictness for "managed" operators

----


> Reduce memory allocator strictness for "managed" operators
> ----------------------------------------------------------
>
>                 Key: DRILL-5808
>                 URL: https://issues.apache.org/jira/browse/DRILL-5808
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.11.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.12.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Drill 1.11 and 1.12 introduce new "managed" versions of the sort and hash agg 
> that enforce memory limits, spilling to disk when necessary.
> Drill's internal memory system is very "lumpy" and unpredictable. The 
> operators have no control over the incoming batch size; an overly large batch 
> can cause the operator to exceed its memory limit before it has a chance to 
> do any work.
> Vector allocations grow in power-of-two sizes. Adding a single record can 
> double the memory allocated to a vector.
> Drill has no metadata, so operators cannot predict the size of VarChar 
> columns nor the cardinality of arrays. The "Record Batch Sizer" tries to 
> extract this information on each batch, but it works with averages, and 
> specific column patterns can still throw off the memory calculations. (For 
> example, having a series of very wide columns for A-M and very narrow columns 
> for N-Z will cause a moderate average. But, once sorted, the A-M rows, and 
> batches, will be much larger than expected, causing out-of-memory errors.)
> At present, if an operator is wrong in its memory usage by a single byte, the 
> entire query is killed. That is, the user pays the death penalty (of queries) 
> for poor design decisions within Drill. This leads to a less-than-optimal 
> user experience.
> The proposal here is to make the memory allocator less strict for "managed" 
> operators.
> First, we recognize that the managed operators do attempt to control memory 
> and, if designed well, will, on average, hit their targets.
> Second, we recognize that, due to the lumpiness issues above, any single 
> operator may exceed, or be under, the configured maximum memory.
> Given this, the proposal here is:
> 1. An operator identifies itself as managed to the memory allocator.
> 2. In managed mode, the allocator has soft limits. It emits a warning to the 
> log when the limit is exceeded.
> 3. For safety, in managed mode, the allocator enforces a hard limit larger 
> than the configured limit.
> The enforcement limit might be:
> * For memory sizes < 100MB, up to 2x the configured limit.
> * For larger memory sizes, no more than 100MB over the configured limit.
> The exact numbers can be made configurable.
> Now, during testing, scripts should look for over-memory warnings. Each 
> should be fixed as we fix OOM issues today. But, during production, user 
> queries are far less likely to fail due to any remaining corner cases that 
> throw off the memory calculations.
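The two enforcement-limit bullets above reduce to a single formula: hard limit = configured + min(configured, 100 MB). A hypothetical helper (not code from this PR) showing the calculation:

```java
// Hypothetical helper computing the hard enforcement limit from the
// configured limit, per the proposal above; not code from this PR.
public class EnforcementLimit {

    static final long HUNDRED_MB = 100L * 1024 * 1024;

    static long hardLimit(long configured) {
        // Budgets under 100 MB may grow to 2x; larger budgets get at most
        // 100 MB of headroom. Both cases equal configured + min(configured, 100 MB).
        return configured < HUNDRED_MB
            ? 2 * configured
            : configured + HUNDRED_MB;
    }
}
```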



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
