[jira] [Created] (DRILL-5808) Reduce memory allocator strictness for "managed" operators

Paul Rogers (JIRA) Wed, 20 Sep 2017 10:12:30 -0700

Paul Rogers created DRILL-5808:
----------------------------------

             Summary: Reduce memory allocator strictness for "managed" operators
                 Key: DRILL-5808
                 URL: https://issues.apache.org/jira/browse/DRILL-5808
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 1.11.0
            Reporter: Paul Rogers
            Assignee: Paul Rogers
             Fix For: 1.12.0



Drill 1.11 and 1.12 introduce new "managed" versions of the sort and hash agg 
operators that enforce memory limits, spilling to disk when necessary.

Drill's internal memory system is very "lumpy" and unpredictable. The operators 
have no control over the incoming batch size; an overly large batch can cause 
the operator to exceed its memory limit before it has a chance to do any work.

Vector allocations grow in power-of-two sizes. Adding a single record can 
double the memory allocated to a vector.
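
As a rough illustration of the doubling effect, the sketch below rounds a 
requested size up to the next power of two. The class, the helper name and the 
64-byte record width are assumptions made for the example; this is not Drill's 
actual vector code.

```java
// Illustrative sketch only; names and sizes are hypothetical, not Drill's code.
final class PowerOfTwoGrowthSketch {

    /** Rounds a requested byte count up to the next power of two. */
    static long roundUpToPowerOfTwo(long bytes) {
        if (bytes <= 1) {
            return 1;
        }
        // The highest set bit of (bytes - 1), doubled, is the smallest
        // power of two that is >= bytes.
        return Long.highestOneBit(bytes - 1) << 1;
    }

    public static void main(String[] args) {
        long recordWidth = 64; // assumed fixed record width, in bytes
        long exactFit = roundUpToPowerOfTwo(1024 * recordWidth); // 65536
        long oneMore = roundUpToPowerOfTwo(1025 * recordWidth);  // 131072
        System.out.println(exactFit + " -> " + oneMore);
    }
}
```

Under these assumptions, 1024 records fit exactly in 64 KB, and the 1025th 
record pushes the allocation to 128 KB.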

Drill has no metadata, so operators cannot predict the size of VarChar columns 
nor the cardinality of arrays. The "Record Batch Sizer" tries to extract this 
information on each batch, but it works with averages, and specific column 
patterns can still throw off the memory calculations. (For example, if the 
rows with keys A-M hold very wide values and the rows with keys N-Z hold very 
narrow ones, the average is moderate. But, once sorted, the A-M rows, and the 
batches that contain them, will be much larger than expected, causing 
out-of-memory errors.)
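
The arithmetic behind the A-M / N-Z example can be sketched as follows. All 
numbers (row counts, widths, the 64 KB batch budget) and all names are invented 
for illustration; the real "Record Batch Sizer" is more involved.

```java
// Illustrative arithmetic only; all figures and names are hypothetical.
final class AverageRowWidthSketch {

    /** Average row width, in bytes, over a mix of wide and narrow rows. */
    static long averageWidth(long wideRows, long wideBytes,
                             long narrowRows, long narrowBytes) {
        return (wideRows * wideBytes + narrowRows * narrowBytes)
                / (wideRows + narrowRows);
    }

    /** Number of rows planned per batch if every row had the average width. */
    static long rowsPerBatch(long batchBudgetBytes, long averageRowBytes) {
        return batchBudgetBytes / averageRowBytes;
    }

    public static void main(String[] args) {
        long avg = averageWidth(1000, 1000, 1000, 10); // 505 bytes per row
        long rows = rowsPerBatch(64 * 1024, avg);      // 129 rows per batch
        // After sorting, the first batches hold only wide (A-M) rows.
        long actualBytes = rows * 1000;                // 129000 bytes
        System.out.println("planned <= " + (64 * 1024)
                + " bytes, actual " + actualBytes + " bytes");
    }
}
```

With these made-up figures, a batch planned to fit 64 KB ends up at roughly 
twice that size once it contains only wide rows.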

At present, if an operator is wrong in its memory usage by a single byte, the 
entire query is killed. That is, the user pays the death penalty (of queries) 
for poor design decisions within Drill. This leads to a less-than-optimal user 
experience.

The proposal here is to make the memory allocator less strict for "managed" 
operators.

First, we recognize that the managed operators do attempt to control memory 
and, if designed well, will, on average, hit their targets.

Second, we recognize that, due to the lumpiness issues above, any single 
operator may exceed, or be under, the configured maximum memory.

Given this, the proposal here is:

1. An operator identifies itself as managed to the memory allocator.
2. In managed mode, the allocator has soft limits. It emits a warning to the 
log when the limit is exceeded.
3. For safety, in managed mode, the allocator enforces a hard limit larger than 
the configured limit.

The enforcement limit might be:

* For configured limits below 100MB, up to 2x the configured limit.
* For larger configured limits, no more than 100MB over the configured limit.

The exact numbers can be made configurable.
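
One way to read the proposed soft and hard limits is sketched below. The class, 
method and enum names are hypothetical and do not correspond to Drill's 
allocator API; the 100MB threshold and the 2x factor are the example values 
from this proposal and would be configurable.

```java
// Sketch of the proposed policy; names are hypothetical, not Drill's API.
final class ManagedLimitSketch {

    static final long MB = 1024L * 1024L;
    static final long THRESHOLD = 100 * MB; // example value from the proposal

    enum Outcome { OK, WARN, FAIL }

    /** Hard limit enforced for a managed operator. */
    static long hardLimit(long configuredLimit) {
        return configuredLimit < THRESHOLD
                ? 2 * configuredLimit           // small limits: up to 2x
                : configuredLimit + THRESHOLD;  // large limits: at most +100MB
    }

    /** Decides what happens once total allocation reaches allocatedBytes. */
    static Outcome check(long allocatedBytes, long configuredLimit, boolean managed) {
        long hard = managed ? hardLimit(configuredLimit) : configuredLimit;
        if (allocatedBytes > hard) {
            return Outcome.FAIL;  // strict failure, as today
        }
        if (allocatedBytes > configuredLimit) {
            return Outcome.WARN;  // managed only: log a warning and continue
        }
        return Outcome.OK;
    }

    public static void main(String[] args) {
        System.out.println(hardLimit(50 * MB) / MB);   // 100
        System.out.println(hardLimit(1024 * MB) / MB); // 1124
        System.out.println(check(60 * MB, 50 * MB, true));
        System.out.println(check(60 * MB, 50 * MB, false));
    }
}
```

In this reading, a managed operator configured for 50MB that reaches 60MB only 
triggers a warning, while an unmanaged operator in the same situation fails as 
it does today.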

Now, during testing, scripts should look for over-memory warnings. Each should 
be fixed as we fix OOM issues today. But, during production, user queries are 
far less likely to fail due to any remaining corner cases that throw off the 
memory calculations.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
