tanishqgandhi1908 commented on code in PR #5896:
URL: https://github.com/apache/texera/pull/5896#discussion_r3456580174
##########
common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/aggregate/AggregateOpExec.scala:
##########
@@ -47,9 +47,14 @@ class AggregateOpExec(descString: String) extends
OperatorExecutor {
// Initialize distributedAggregations if it's not yet initialized
if (distributedAggregations == null) {
- distributedAggregations = desc.aggregations.map(agg =>
- agg.getAggFunc(tuple.getSchema.getAttribute(agg.attribute).getType)
- )
+ distributedAggregations = desc.aggregations.map { agg =>
+ // COUNT(*) has a blank attribute and no input column to look up; pass
a null
+ // type since its result type does not depend on any input attribute.
+ val attrType =
+ if (agg.attribute == null || agg.attribute.trim.isEmpty) null
+ else tuple.getSchema.getAttribute(agg.attribute).getType
+ agg.getAggFunc(attrType)
+ }
Review Comment:
fixed. The guard now keys off agg.aggFunction == COUNT_STAR instead of a
blank attribute, so COUNT(*) never looks up an input column even if a
stale/leaked attribute is present. Hardened the end-to-end executor test to use
an attribute that doesn't exist in the schema, confirming no lookup happens.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]