Github user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/938#discussion_r137939203
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
---
@@ -109,14 +107,21 @@
private boolean isTwoPhase = false; // 1 phase or 2 phase aggr?
private boolean is2ndPhase = false;
- private boolean canSpill = true; // make it false in case can not spill
+ private boolean is1stPhase = false;
+ private boolean canSpill = true; // make it false in case can not
spill/return-early
private ChainedHashTable baseHashTable;
private boolean earlyOutput = false; // when 1st phase returns a
partition due to no memory
private int earlyPartition = 0; // which partition to return early
-
- private long memoryLimit; // max memory to be used by this oerator
- private long estMaxBatchSize = 0; // used for adjusting #partitions
- private long estRowWidth = 0;
+ private boolean retrySameIndex = false; // in case put failed during 1st
phase - need to output early, then retry
+ private boolean useMemoryPrediction = false; // whether to use memory
prediction to decide when to spill
+ private long estMaxBatchSize = 0; // used for adjusting #partitions and
deciding when to spill
+ private long estRowWidth = 0; // the size of the internal "row" (keys +
values + extra columns)
+ private long estValuesRowWidth = 0; // the size of the internal values (
values + extra )
+ private long estOutputRowWidth = 0; // the size of the output "row" (no
extra columns)
+ private long estValuesBatchSize = 0; // used for "reserving" memory for
the Values batch to overcome an OOM
+ private long estOutgoingAllocSize = 0; // used for "reserving" memory
for the Outgoing Output Values to overcome an OOM
+ private long reserveValueBatchMemory; // keep "reserve memory" for
Values Batch
+ private long reserveOutgoingMemory; // keep "reserve memory" for the
Outgoing (Values only) output
--- End diff --
Long lists of member variables are generally frowned upon. Can't unit test
them. Too many states to keep in mind. Can these be grouped into a read-only
config class (set up front, then never changed) vs, running estimates?
---