Github user Ben-Zvi commented on a diff in the pull request:
https://github.com/apache/drill/pull/938#discussion_r138435187
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
---
@@ -382,19 +390,25 @@ private void delayedSetup() {
final boolean fallbackEnabled =
context.getOptions().getOption(ExecConstants.HASHAGG_FALLBACK_ENABLED_KEY).bool_val;
// Set the number of partitions from the configuration (raise to a
power of two, if needed)
- numPartitions =
context.getConfig().getInt(ExecConstants.HASHAGG_NUM_PARTITIONS);
- if ( numPartitions == 1 ) {
+ numPartitions =
(int)context.getOptions().getOption(ExecConstants.HASHAGG_NUM_PARTITIONS_VALIDATOR);
+ if ( numPartitions == 1 && is2ndPhase ) { // 1st phase can still do
early return with 1 partition
canSpill = false;
logger.warn("Spilling is disabled due to configuration setting of
num_partitions to 1");
}
numPartitions = BaseAllocator.nextPowerOfTwo(numPartitions); // in
case not a power of 2
- if ( schema == null ) { estMaxBatchSize = 0; } // incoming was an
empty batch
+ if ( schema == null ) { estValuesBatchSize = estOutgoingAllocSize =
estMaxBatchSize = 0; } // incoming was an empty batch
else {
// Estimate the max batch size; should use actual data (e.g. lengths
of varchars)
updateEstMaxBatchSize(incoming);
}
- long memAvail = memoryLimit - allocator.getAllocatedMemory();
+ // create "reserved memory" and adjust the memory limit down
+ reserveValueBatchMemory = reserveOutgoingMemory = estValuesBatchSize ;
+ long newMemoryLimit = allocator.getLimit() - reserveValueBatchMemory -
reserveOutgoingMemory ;
+ long memAvail = newMemoryLimit - allocator.getAllocatedMemory();
+ if ( memAvail <= 0 ) { throw new OutOfMemoryException("Too little
memory available"); }
+ allocator.setLimit(newMemoryLimit);
+
--- End diff --
Yes indeed. The only attempt to force a code path is the new testing option
*use_memory_prediction* which can disable the estimate based prediction (when
to spill), hence forcing the code path that relies on an OOM (for hash table
put() ) to take place (one unit test was added for that).
Getting a full code coverage would be ideal, but hard.
---