github-actions[bot] commented on code in PR #63763:
URL: https://github.com/apache/doris/pull/63763#discussion_r3435864545
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AggScalarSubQueryToWindowFunction.java:
##########
@@ -139,7 +140,8 @@ && checkAggregate()
&& checkJoin()
&& checkProject()
&& checkRelation(apply.getCorrelationSlot())
- && checkFilter(outerFilter);
+ && checkFilter(outerFilter)
+ && checkUniqueCorrelatedTable(apply.getCorrelationSlot());
}
Review Comment:
This new uniqueness check proves that the outer-only table has at most one
row per correlated key, but the rewrite still accepts plans where the top outer
filter has extra predicates on a relation that is also scanned inside the
subquery. Reduced plan:
```text
Filter(f.v > 6, f.v * 2 > sum_alias)
  Apply(correlation: d.k)
    CrossJoin
      Scan fact f
      Scan dim d    -- d.k is unique, so this new check passes
    Aggregate(sum(f2.v) AS sum_alias)
      Filter(f2.k = d.k)
        Scan fact f2
```
`checkFilter` only proves that the inner conjunct `f2.k = d.k` is present in
the outer filter after slot replacement. It does not reject the unmatched outer
conjunct `f.v > 6`, and `rewrite` puts all `conjuncts.get(true)` predicates
into `newFilter` below `LogicalWindow`. The generated `SUM(v) OVER (PARTITION
BY d.k)` therefore sees only `fact` rows with `f.v > 6`, while the original
scalar subquery sums all `fact` rows for the same `d.k`.
For example, with one unique `dim` row `k=1` and three `fact` rows for that
key with `v` values `5, 6, 7`, the original subquery sum is `18`, so row `v=7`
fails `7 * 2 > 18`; after this rewrite only `v=7` survives `f.v > 6` below the
window, the window sum is `7`, and the row is wrongly returned. Please either reject
unmatched outer predicates that reference any shared/inner relation slot, or
split predicates so only outer-only predicates are applied below the window
while shared-side filters remain above it.
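
The mismatch can be reproduced outside the planner with a small standalone simulation. This is only a sketch under assumptions: the class and method names are hypothetical and not part of Doris, and it models a single unique `dim` key whose `fact` rows have `v` values `5, 6, 7`. It compares the original scalar-subquery semantics with a window computed after `f.v > 6` has been pushed below it, and with a window computed over all rows while `f.v > 6` stays above.

```java
import java.util.ArrayList;
import java.util.List;

public class WindowFilterPlacementDemo {
    // Hypothetical stand-in for the fact rows that share the single unique dim key k=1.
    static final List<Integer> FACT_V = List.of(5, 6, 7);

    static int sum(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    // Original semantics: the scalar subquery sums ALL fact rows for the key,
    // then the outer filter (v > 6 AND v * 2 > sum) is evaluated per outer row.
    static List<Integer> originalQuery(List<Integer> factV) {
        int subquerySum = sum(factV);
        List<Integer> out = new ArrayList<>();
        for (int v : factV) {
            if (v > 6 && v * 2 > subquerySum) {
                out.add(v);
            }
        }
        return out;
    }

    // Rewrite as described in the review: v > 6 is applied BELOW the window,
    // so SUM(v) OVER (PARTITION BY k) only sees the surviving rows.
    static List<Integer> filterBelowWindow(List<Integer> factV) {
        List<Integer> survivors = new ArrayList<>();
        for (int v : factV) {
            if (v > 6) {
                survivors.add(v);
            }
        }
        int windowSum = sum(survivors);
        List<Integer> out = new ArrayList<>();
        for (int v : survivors) {
            if (v * 2 > windowSum) {
                out.add(v);
            }
        }
        return out;
    }

    // Suggested alternative: compute the window over all rows of the partition
    // and keep the shared-side predicate v > 6 ABOVE the window.
    static List<Integer> filterAboveWindow(List<Integer> factV) {
        int windowSum = sum(factV); // window sees every row for the key
        List<Integer> out = new ArrayList<>();
        for (int v : factV) {
            if (v > 6 && v * 2 > windowSum) {
                out.add(v);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("original:            " + originalQuery(FACT_V));     // []
        System.out.println("filter below window: " + filterBelowWindow(FACT_V)); // [7]
        System.out.println("filter above window: " + filterAboveWindow(FACT_V)); // []
    }
}
```

Only the placement with the shared-side filter above the window agrees with the original query on this data, which matches the second option suggested above.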
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]