clintropolis commented on code in PR #14748:
URL: https://github.com/apache/druid/pull/14748#discussion_r1300877178
##########
processing/src/main/java/org/apache/druid/query/aggregation/NullableNumericAggregatorFactory.java:
##########
@@ -85,6 +85,12 @@ public final int getMaxIntermediateSizeWithNulls()
return getMaxIntermediateSize() + (NullHandling.replaceWithDefault() ? 0 :
Byte.BYTES);
}
+ protected boolean canHandleNulls()
+ {
+ return NullHandling.replaceWithDefault();
+ }
Review Comment:
This reminds me of an old PR of mine which I totally forgot about until
reviewing this PR, #10666, which made a different but somewhat similar change.
In that PR, I was adding a method called `hasNulls` to the nullable numeric
aggregator (side note, i probably should resurrect this old PR), to separate
processing in SQL compatible mode for columns which have nulls and those which
don't, but which should still output a null value if not aggregated.
The changes in that PR would not actually fix the problems here, and I think
maybe help me frame what actually bothers me about the changes in this PR,
since the changes here are somewhat contradictory to the changes there.
Anyway, I think what bothers me about this change is that it is to handle
logic that seems entirely specific to the special case of how COUNT
aggregations are handled, and I can't actually see any good use for this
otherwise, and because of that I'm not entirely sure it makes sense to add in
this way. If we ignore the default value mode (which is now a legacy mode as of
#14792), and consider the changes i made in the other PR, then this family of
aggregators is for implementing math aggregator functions which may or may not
have null values, but should always be null unless aggregated.
COUNT however, is a special case in that it should be 0, even if not
aggregated, so I think adding special handling to this family aggregators
specifically for the 'merging' aggregator for COUNT seems like the wrong thing
to do.
Instead, I think i would be more in favor of just special case implementing
the merging/combining aggregator for COUNT, even if it means not sharing some
of the same code that we use for SUM, because it is a special case that i can't
see having a general use for other types of numeric aggregators, only COUNT.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]