[ https://issues.apache.org/jira/browse/DRILL-5808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180053#comment-16180053 ]
ASF GitHub Bot commented on DRILL-5808:
---------------------------------------

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/958#discussion_r140938324

    --- Diff: exec/memory/base/src/main/java/org/apache/drill/exec/memory/Accountant.java ---
    @@ -80,6 +119,23 @@ public Accountant(Accountant parent, long reservation, long maxAllocation) {
       }

       /**
    +   * Request lenient allocations: allows exceeding the allocation limit
    +   * by the configured grace amount. The request is granted only if strict
    +   * limits are not required.
    +   *
    +   * @param enable
    +   */
    +  public boolean setLenient() {
    --- End diff --

    Why does "set" return a boolean? It would be better to have a separate "get".

> Reduce memory allocator strictness for "managed" operators
> ----------------------------------------------------------
>
>                 Key: DRILL-5808
>                 URL: https://issues.apache.org/jira/browse/DRILL-5808
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.11.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.12.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Drill 1.11 and 1.12 introduce new "managed" versions of the sort and hash
> agg that enforce memory limits, spilling to disk when necessary.
> Drill's internal memory system is very "lumpy" and unpredictable. The
> operators have no control over the incoming batch size; an overly large
> batch can cause the operator to exceed its memory limit before it has a
> chance to do any work.
> Vector allocations grow in power-of-two sizes. Adding a single record can
> double the memory allocated to a vector.
> Drill has no metadata, so operators cannot predict the size of VarChar
> columns nor the cardinality of arrays. The "Record Batch Sizer" tries to
> extract this information on each batch, but it works with averages, and
> specific column patterns can still throw off the memory calculations.
> (For example, having a series of very wide columns for A-M and very narrow
> columns for N-Z will produce a moderate average. But, once sorted, the A-M
> rows, and batches, will be much larger than expected, causing out-of-memory
> errors.)
> At present, if an operator is wrong in its memory usage by a single byte,
> the entire query is killed. That is, the user pays the death penalty (of
> queries) for poor design decisions within Drill. This leads to a
> less-than-optimal user experience.
> The proposal here is to make the memory allocator less strict for "managed"
> operators.
> First, we recognize that the managed operators do attempt to control memory
> and, if designed well, will, on average, hit their targets.
> Second, we recognize that, due to the lumpiness issues above, any single
> operator may exceed, or fall under, the configured maximum memory.
> Given this, the proposal here is:
> 1. An operator identifies itself as managed to the memory allocator.
> 2. In managed mode, the allocator has soft limits. It emits a warning to
> the log when the limit is exceeded.
> 3. For safety, in managed mode, the allocator enforces a hard limit larger
> than the configured limit.
> The enforcement limit might be:
> * For memory sizes < 100 MB, up to 2x the configured limit.
> * For larger memory sizes, no more than 100 MB over the configured limit.
> The exact numbers can be made configurable.
> Now, during testing, scripts should look for over-memory warnings. Each
> should be fixed as we fix OOM issues today. But, in production, user
> queries are far less likely to fail due to any remaining corner cases that
> throw off the memory calculations.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
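The soft-limit/hard-limit policy described in the proposal can be sketched in Java as follows. This is a hypothetical illustration only, under the assumption that the grace is exactly 2x for budgets under 100 MB and a flat 100 MB otherwise; the class and method names are invented here and are not Drill's actual Accountant API:

```java
// Hypothetical sketch of the lenient allocation policy in the proposal.
// Not Drill's Accountant: names and structure are illustrative only.
public class LenientLimitSketch {

  private static final long ONE_HUNDRED_MB = 100L * 1024 * 1024;

  /** Hard enforcement limit derived from the configured (soft) limit. */
  static long hardLimit(long configuredLimit) {
    if (configuredLimit < ONE_HUNDRED_MB) {
      return 2 * configuredLimit;            // small budgets: up to 2x
    }
    return configuredLimit + ONE_HUNDRED_MB; // larger budgets: +100 MB at most
  }

  private final long softLimit;   // the configured operator limit
  private final long hard;        // enforcement limit (== soft when strict)
  private final boolean lenient;  // true when the operator is "managed"
  private long allocated;

  LenientLimitSketch(long configuredLimit, boolean lenient) {
    this.softLimit = configuredLimit;
    this.lenient = lenient;
    this.hard = lenient ? hardLimit(configuredLimit) : configuredLimit;
  }

  /** Returns true if the allocation is granted, false on a hard failure. */
  boolean allocate(long bytes) {
    long next = allocated + bytes;
    if (next > hard) {
      return false;               // hard limit: allocation denied
    }
    if (lenient && next > softLimit) {
      // Soft breach in managed mode: warn instead of killing the query.
      System.err.printf("Allocator over soft limit: %d > %d%n", next, softLimit);
    }
    allocated = next;
    return true;
  }
}
```

In strict mode the hard limit collapses to the configured limit, so the behavior is unchanged for non-managed operators; only managed operators opt in to the grace window, and the warning gives test scripts a signal to catch over-memory cases before they reach production.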