morrySnow commented on code in PR #62742:
URL: https://github.com/apache/doris/pull/62742#discussion_r3360854172


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AddProjectForVolatileExpression.java:
##########
@@ -269,6 +271,85 @@ public <T extends Expression> Optional<Pair<List<T>, 
LogicalProject<Plan>>> rewr
         return Optional.of(Pair.of(newTargetsBuilder.build(), new 
LogicalProject<>(projects, plan.child(0))));
     }
 
+    private Optional<JoinRewriteResult> 
rewriteJoinExpressions(LogicalJoin<Plan, Plan> join,
+            Collection<Expression> targets) {
+        Map<Expression, Integer> volatileExpressionCounter = 
Maps.newLinkedHashMap();
+        Map<Expression, Set<Slot>> volatileExpressionSlots = 
Maps.newLinkedHashMap();
+        for (Expression target : targets) {
+            target.foreach(e -> {
+                Expression expr = (Expression) e;
+                if (expr.isVolatile()) {
+                    volatileExpressionCounter.merge(expr, 1, Integer::sum);
+                    Set<Slot> volatileInputSlots = expr.getInputSlots();
+                    volatileExpressionSlots
+                            .computeIfAbsent(expr, ignored -> 
Sets.newLinkedHashSet())
+                            .addAll(volatileInputSlots.isEmpty() ? 
target.getInputSlots() : volatileInputSlots);
+                }
+            });
+        }
+
+        ImmutableList.Builder<NamedExpression> leftAliases = 
ImmutableList.builder();
+        ImmutableList.Builder<NamedExpression> rightAliases = 
ImmutableList.builder();
+        Map<Expression, Slot> replaceMap = Maps.newHashMap();
+        Set<Slot> leftOutputSet = join.left().getOutputSet();
+        Set<Slot> rightOutputSet = join.right().getOutputSet();
+        for (Entry<Expression, Integer> entry : 
volatileExpressionCounter.entrySet()) {
+            if (entry.getValue() <= 1) {
+                continue;
+            }
+            Set<Slot> inputSlots = volatileExpressionSlots.get(entry.getKey());
+            Set<Slot> volatileInputSlots = entry.getKey().getInputSlots();
+            if (!volatileInputSlots.isEmpty()
+                    && !leftOutputSet.containsAll(inputSlots)
+                    && !rightOutputSet.containsAll(inputSlots)) {
+                continue;
+            }
+            ExprId exprId = StatementScopeIdGenerator.newExprId();
+            String functionName = entry.getKey() instanceof Function
+                    ? ((Function) entry.getKey()).getName() : "volatile";
+            Alias alias = new Alias(exprId, entry.getKey(), "$_" + 
functionName + "_" + exprId.asInt() + "_$");
+            replaceMap.put(alias.child(0), alias.toSlot());
+            // Join can not add a project at join-pair scope, but repeated 
volatile expressions
+            // still need one materialized value. Slot-free volatile functions 
use the containing
+            // conjunct's slots to choose a side, so t2.k + rand() can project 
rand() on the right.
+            // Volatile functions with input slots use their own slots to 
avoid projecting
+            // volatile_udf(t2.k) on the left only because its containing 
conjunct also uses t1.
+            // Volatile functions whose own slots span both join children 
cannot be projected into
+            // either child, so they are not rewritten here.
+            // Put right-only expressions on the right child; otherwise keep 
the previous
+            // left-child behavior as the conservative default.
+            if (!inputSlots.isEmpty() && 
rightOutputSet.containsAll(inputSlots)) {
+                rightAliases.add(alias);
+            } else {

Review Comment:
   This right-side materialization is still unsafe for side-specific volatile 
ON predicates, not only for the mixed-slot case discussed above. For example, 
`t1 join t2 on t1.k = t2.k and t2.v + rand() between 0.1 and 0.5`: if two `t1` 
rows match the same `t2` row, the original ON predicate evaluates `rand()` once 
per joined pair, so one pair can pass while the other fails. After this 
rewrite, `rand()` is evaluated once in the right child project/filter and then 
reused for every matching left row, so the plan can only keep both pairs or 
drop both. That is the same granularity change that 
`PushDownJoinOtherCondition` now avoids for direct volatile ON conjuncts. If we 
cannot materialize at join-pair scope, this join rewrite should skip repeated 
volatile ON expressions instead of moving them into either child; the 
`join_other_unique_6` expected plan currently locks in the wrong behavior.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to