[ 
https://issues.apache.org/jira/browse/DRILL-5014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649126#comment-15649126
 ] 

Paul Rogers commented on DRILL-5014:
------------------------------------

Proposed solution.

Set an overall ESB memory limit which is the lesser of:

* The memory given by the allocator for the ESB.
* The memory specified in the {{ExternalSort}} operator specification.
* A new config option {{drill.exec.sot.external.memory-limit}} (mostly for 
testing and special cases.)

As in the current code, subtract off an allowance for the memory-merge operator 
and the "copier."

Provide a new config option to define the high-water mark for buffering: 
{{drill.exec.sort.external.max-buffer-ratio}} which defaults to 0.95 (the value 
hardcoded today.) Spilling will occur when the next batch would cause buffer 
usage to exceed memory budget * reserve-ratio.

Once spilling occurs, spill until memory use drops below the low-water mark: 
{{drill.exec.sort.external.spill.to-buffer-ratio}}.

For testing purposes (and special cases), repurpose 
{{drill.exec.sort.external.spill.threshold}} as an optional lower limit, 
measured in batches, of the high water mark. Similarly, define 
{{drill.exec.sort.external.spill.group.size}} to be the number of batches to 
spill (low water mark is {{threshold - group.size}}.) This new meaning should 
be compatible with the earlier meaning.

In normal operation, the user sets nothing: the Foreman sets the memory budget, 
preconfigured high- and low-water marks control spilling.


> ExternalSortBatch cache size, spill count differs from config setting
> ---------------------------------------------------------------------
>
>                 Key: DRILL-5014
>                 URL: https://issues.apache.org/jira/browse/DRILL-5014
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.8.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> The ExternalSortBatch (ESB) operator sorts data while spilling to disk to 
> remain within a defined memory footprint. Spilling occurs based on a number 
> of factors. Among those are two config parameters:
> * {{drill.exec.sort.external.spill.group.size}}: The number of batches to 
> spill per spill event.
> * {{drill.exec.sort.external.spill.threshold}}: The number of batches to 
> accumulate in memory before starting a spill event.
> The expected behavior would be:
> * After the accumulated batches exceeds the threshold, and
> * More than "batch size" new batches have arrived since the last spill,
> * Spill half the accumulated records.
> That is if the threshold is 200, and the size is 150, we should accumulate 
> 200 batches, then spill 150 of them (leaving 50) and repeat.
> The actual behavior is:
> * When the accumulated records exceeds the threshold and,
> * More than "batch size" new batches have arrived since the last spill,
> * Spill half the accumulated records.
> The above can leave more batches in memory than desired, and spill more than 
> desired.
> The actual behavior for the (threshold=200, size=150) case is:
> {code}
> Before spilling, buffered batch count: 201
> After spilling, buffered batch count: 101
> Before spilling, buffered batch count: 251
> After spilling, buffered batch count: 126
> Before spilling, buffered batch count: 276
> After spilling, buffered batch count: 138
> Before spilling, buffered batch count: 288
> After spilling, buffered batch count: 144
> Before spilling, buffered batch count: 294
> After spilling, buffered batch count: 147
> Before spilling, buffered batch count: 297
> After spilling, buffered batch count: 149
> Before spilling, buffered batch count: 299
> After spilling, buffered batch count: 150
> Before spilling, buffered batch count: 300
> After spilling, buffered batch count: 150
> Before spilling, buffered batch count: 300
> {code}
> In short, the actual number of batches retained in memory is twice the spill 
> size, **NOT** the number set in the threshold. As a result, the ESB operator 
> will use more memory than expected.
> The work-around is to set a batch size that is half the threshold so that the 
> batch size (used in spill decisions) matches the actual spill count (as 
> implemented by the code.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to