github-actions[bot] commented on code in PR #64849:
URL: https://github.com/apache/doris/pull/64849#discussion_r3473312876


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/EliminateGroupByKey.java:
##########
@@ -79,28 +81,59 @@ public List<Rule> buildRules() {
     }
 
     LogicalAggregate<Plan> eliminateGroupByKey(LogicalAggregate<? extends 
Plan> agg, Set<Slot> requireOutput) {
-        Set<Expression> removeExpression = findCanBeRemovedExpressions(agg, 
requireOutput,
+        FindResult result = findCanBeRemovedExpressionsInternal(agg, 
requireOutput,
                 agg.child().getLogicalProperties().getTrait());
+        Set<Expression> removeExpression = result.removeExpression;
+        Set<Expression> wrapWithAnyValue = result.wrapWithAnyValue;
+
         List<Expression> newGroupExpression = new ArrayList<>();
         for (Expression expression : agg.getGroupByExpressions()) {
-            if (!removeExpression.contains(expression)) {
+            if (!removeExpression.contains(expression)
+                    && !wrapWithAnyValue.contains(expression)) {
                 newGroupExpression.add(expression);
             }
         }
         List<NamedExpression> newOutput = new ArrayList<>();
         for (NamedExpression expression : agg.getOutputExpressions()) {
-            if (!removeExpression.contains(expression)) {
-                newOutput.add(expression);
+            if (removeExpression.contains(expression)) {
+                continue;
+            }
+            if (wrapWithAnyValue.contains(expression)) {
+                // expression is FD-redundant but needed in output: wrap with 
any_value
+                expression = new Alias(expression.getExprId(),
+                        new AnyValue(expression.toSlot()), 
expression.getName());
             }
+            newOutput.add(expression);
         }
         return agg.withGroupByAndOutput(newGroupExpression, newOutput);
     }

Review Comment:
   Returning `wrapWithAnyValue` here leaks the new direct-rule behavior into 
the old public API. `PushDownAggThroughJoinOnPkFk.eliminatePrimaryOutput()` 
still treats this method's result as safe to remove from the pushed-down 
aggregate GROUP BY and does not wrap retained outputs. For example, in a PK/FK 
join where the foreign side also has `foreign.id2 -> foreign.name`, a query 
grouped by `pri.id1, foreign.id2, foreign.name` and selecting `foreign.name` 
will mark `foreign.name` as wrap-only here. The pushdown caller then drops it 
from `minGroupBySlotList`, rewrites `pri.id1` to the foreign key, and keeps the 
`foreign.name` output as a bare slot because it is not a primary-side output. 
The new aggregate below the join can therefore output `foreign.name` without 
grouping by it or wrapping it in `any_value`. Please keep this public helper 
returning only expressions that are removable by existing external callers, or 
update the PK/FK pushdown path to consume the split result and add the 
 same `ANY_VALUE` output rewrite.



##########
fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/EliminateGroupByKeyTest.java:
##########
@@ -181,6 +181,24 @@ void testEliminateByEqual() {
                                 && 
agg.getGroupByExpressions().get(0).toSql().equals("name")));
     }
 
+    @Test
+    void testEliminateByPkWithOutputNeeded() throws Exception {
+        // Regression: when a group-by key (name) is FD-redundant (id -> name)
+        // but still appears in SELECT, it should be removed from group-by
+        // and wrapped with ANY_VALUE in the output.
+        addConstraint("alter table t1 add constraint pk2 primary key (id)");
+        PlanChecker.from(connectContext)
+                .analyze("select id, name, count(*) from t1 group by id, name")
+                .rewrite()
+                .printlnTree()
+                .matches(logicalAggregate().when(agg ->
+                        agg.getGroupByExpressions().size() == 1
+                                && 
agg.getGroupByExpressions().get(0).toSql().equals("id")
+                                && 
agg.getOutputExpressions().stream().anyMatch(
+                                        e -> e.child(0) instanceof 
org.apache.doris.nereids.trees.expressions.functions.agg.AnyValue)));

Review Comment:
   This assertion can fail before it proves the new rewrite. 
`getOutputExpressions()` also contains leaf `Slot` outputs such as the grouped 
`id`, and `Slot` has no children while `child(0)` is a direct list access. If 
the stream visits that slot before the `ANY_VALUE` alias, the matcher throws 
`IndexOutOfBoundsException` instead of checking the plan. Please guard the 
shape first, for example by importing `Alias`/`AnyValue` and checking `e 
instanceof Alias && e.child(0) instanceof AnyValue` (ideally also checking it 
is the rewritten `name` output).



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/EliminateGroupByKey.java:
##########
@@ -79,28 +81,59 @@ public List<Rule> buildRules() {
     }
 
     LogicalAggregate<Plan> eliminateGroupByKey(LogicalAggregate<? extends 
Plan> agg, Set<Slot> requireOutput) {
-        Set<Expression> removeExpression = findCanBeRemovedExpressions(agg, 
requireOutput,
+        FindResult result = findCanBeRemovedExpressionsInternal(agg, 
requireOutput,
                 agg.child().getLogicalProperties().getTrait());
+        Set<Expression> removeExpression = result.removeExpression;
+        Set<Expression> wrapWithAnyValue = result.wrapWithAnyValue;
+
         List<Expression> newGroupExpression = new ArrayList<>();
         for (Expression expression : agg.getGroupByExpressions()) {
-            if (!removeExpression.contains(expression)) {
+            if (!removeExpression.contains(expression)
+                    && !wrapWithAnyValue.contains(expression)) {
                 newGroupExpression.add(expression);

Review Comment:
   This creates the exact ExprId shape that the parallel uniform-key rewrite 
avoids: for a wrapped slot `name#x`, the aggregate output becomes 
`any_value(name#x) AS name#x`, so the same ExprId names both the child input 
slot and a new aggregate output expression. `EliminateGroupByKeyByUniform` has 
a local comment explaining that `any_value(a#0) AS a#0` violates ExprId 
uniqueness and can break modules such as MV rewrite, which is why that rule 
allocates a fresh alias ExprId and rewrites upper references. Please follow the 
same pattern here, or otherwise keep the original grouping key until the upper 
references can be rewritten safely.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to