github-actions[bot] commented on code in PR #63690:
URL: https://github.com/apache/doris/pull/63690#discussion_r3435933348


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/eageraggregation/EagerAggRewriter.java:
##########
@@ -420,10 +594,34 @@ but if agg can not pushdown through child2, the output of 
child2 is (a2, b2).
                 newOutput.add(alias.toSlot());
             }
             newOutput.addAll(context.getGroupKeys());
+            Optional<Alias> countStarAlias = findCountStarAlias(context);
+            Optional<SlotReference> unionCnt = Optional.empty();
+            if (context.needOutputCount() && !countStarAlias.isPresent()) {
+                unionCnt = Optional.of(new SlotReference(
+                        "unionCnt" + 
context.getCascadesContext().getStatementContext().generateColumnName(),
+                        BigIntType.INSTANCE));
+                newOutput.add(unionCnt.get());
+            }
             LogicalUnion newUnion = (LogicalUnion) union
                     .withChildrenAndOutputs(newChildren, newOutput, 
newRegularChildrenOutputs);
+            for (int idx = 0; idx < context.getAggFunctions().size(); idx++) {
+                AggregateFunction func = context.getAggFunctions().get(idx);
+                ExprId exprId = context.getAliasMap().get(func).getExprId();
+                context.getBilateralState().registerPushedAggFuncSlot(exprId, 
newUnion.getOutput().get(idx));
+            }
+            if (context.needOutputCount()) {
+                if (countStarAlias.isPresent()) {
+                    context.getBilateralState().registerCountSlot(newUnion,
+                            
newUnion.getOutput().get(findCountStarAggFunctionIndex(context).get()));
+                } else {
+                    context.getBilateralState().registerCountSlot(newUnion, 
unionCnt.get());
+                }
+            } else {
+                context.getBilateralState().registerNoCountSlot(newUnion);
+            }
             return newUnion;
         } else {
+            context.getBilateralState().registerNoCountSlot(union);

Review Comment:
   This guard only checks aggregate functions after project substitution; 
projected group keys can still hide volatile expressions. Reduced tree:
   
   ```text
   Aggregate(sum(x) AS s, group by g)
     Project(l.v AS x, random() AS g, l.k)
       Join(l.k = r.k)
         Scan l
         Scan r
   ```
   
   `createContextFromProject()` turns `g` into `random().getInputSlots()`, 
which is empty, while `sum(x)` becomes `sum(l.v)`. Since the new context still 
has an aggregate function, this line does not bail out, and 
`visitLogicalJoin()` can pre-aggregate below the join grouped only by join 
keys. The rebuilt project then evaluates `random()` after pre-aggregation/join 
instead of once per original joined row, changing the grouping and result. 
Please reject volatile project expressions that feed `context.getGroupKeys()` 
before pushing through the project, similar to the aggregate-input volatility 
check.



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/eageraggregation/EagerAggRewriter.java:
##########
@@ -445,10 +643,9 @@ public Plan visitLogicalProject(LogicalProject<? extends 
Plan> project, PushDown
         if (!canPushThroughProject(project, context)) {
             return genAggregate(project, context);
         }
-
         PushDownAggContext newContext = createContextFromProject(project, 
context);
-        if (!newContext.isValid()) {
-            return project;
+        if (newContext.noGroupKeyAndNoAggFunc() || 
newContext.containsVolatileAggregateFunction()) {

Review Comment:
   This guard only checks aggregate functions after project substitution; 
projected group keys can still hide volatile expressions. Reduced tree:
   
   ```text
   Aggregate(sum(x) AS s, group by g)
     Project(l.v AS x, random() AS g, l.k)
       Join(l.k = r.k)
         Scan l
         Scan r
   ```
   
   `createContextFromProject()` turns `g` into `random().getInputSlots()`, 
which is empty, while `sum(x)` becomes `sum(l.v)`. Since the new context still 
has an aggregate function, this line does not bail out, and 
`visitLogicalJoin()` can pre-aggregate below the join grouped only by join 
keys. The rebuilt project then evaluates `random()` after pre-aggregation/join 
instead of once per original joined row, changing the grouping and result. 
Please reject volatile project expressions that feed `context.getGroupKeys()` 
before pushing through the project, similar to the aggregate-input volatility 
check.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to