Github user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/938#discussion_r137939361
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
---
@@ -1335,7 +1470,7 @@ private void updateStats(HashTable[] htables) {
         }
         if ( rowsReturnedEarly > 0 ) {
           stats.setLongStat(Metric.SPILL_MB, // update stats - est. total MB returned early
    -          (int) Math.round( rowsReturnedEarly * estRowWidth / 1024.0D / 1024.0));
    +          (int) Math.round( rowsReturnedEarly * estOutputRowWidth / 1024.0D / 1024.0));
--- End diff --
This file is a template, which means we copy *all* of this code each time we
generate a new class. How does doing so help stability, customer value, or
performance? Should all of this code live in a template that is copied on every
query? Or should it be refactored into a driver class, with only a very light
wrapper appearing in the copied template?
As this code gets ever more complex, it puts a strain on the Java code that
must walk through it and do method fixup, scalar replacement, etc. That
work takes time. What value accrues to the user from doing this fixup on code
that never changes from one query to the next?
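To make the suggestion concrete, here is a rough sketch of the driver-plus-thin-wrapper split. It is not Drill's actual code: `GeneratedAggOps`, `HashAggDriver`, and `GeneratedSum` are hypothetical names, and the only thing borrowed from the diff is the spill-MB arithmetic. The idea is that the query-independent logic sits in an ordinary precompiled class, and only the small per-query piece would be generated.

```java
public class ThinTemplateSketch {

  // Hypothetical: the only operations that differ per query, and therefore
  // the only code that would need to be generated (and fixed up) each time.
  interface GeneratedAggOps {
    long addValue(long current, long incoming);
  }

  // Hypothetical driver: a normal class compiled once with the rest of the
  // engine. Bookkeeping such as spill statistics lives here, not in the template.
  static final class HashAggDriver {
    private final GeneratedAggOps ops;
    private long rowsReturnedEarly;
    private long total;

    HashAggDriver(GeneratedAggOps ops) {
      this.ops = ops;
    }

    void consume(long value) {
      // Delegate only the query-specific step to the generated code.
      total = ops.addValue(total, value);
    }

    void noteRowsReturnedEarly(long rows) {
      rowsReturnedEarly += rows;
    }

    // Same arithmetic as in the diff; the long types are an assumption.
    int estSpillMb(long estOutputRowWidth) {
      return (int) Math.round(rowsReturnedEarly * estOutputRowWidth / 1024.0D / 1024.0);
    }

    long total() {
      return total;
    }
  }

  // Stand-in for what the code generator would emit for one query.
  static final class GeneratedSum implements GeneratedAggOps {
    @Override
    public long addValue(long current, long incoming) {
      return current + incoming;
    }
  }

  public static void main(String[] args) {
    HashAggDriver driver = new HashAggDriver(new GeneratedSum());
    driver.consume(3);
    driver.consume(4);
    driver.noteRowsReturnedEarly(2048);
    System.out.println("total = " + driver.total());
    System.out.println("spill MB = " + driver.estSpillMb(1024));
  }
}
```

With this shape, the byte-code fixup pass would only have to process something the size of `GeneratedSum`, while `HashAggDriver` is loaded once like any other class.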
Filed [DRILL-5779](https://issues.apache.org/jira/browse/DRILL-5779) for
this issue.
---