This is an automated email from the ASF dual-hosted git repository.
morrySnow pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git
The following commit(s) were added to refs/heads/master by this push:
new 8255f94bc5f [fix](nereids) Guard UniqueFunction in multiple
filter/topn pushdown rules (#62742)
8255f94bc5f is described below
commit 8255f94bc5fbee08c45f133a3d2fad87667e3e03
Author: yujun <[email protected]>
AuthorDate: Mon Jun 8 18:00:44 2026 +0800
[fix](nereids) Guard UniqueFunction in multiple filter/topn pushdown rules
(#62742)
### What problem does this PR solve?
Problem Summary:
Several Nereids rewrite rules still moved predicates
containing non-idempotent (unique) functions such as rand() / uuid() /
random_bytes() across operator boundaries in ways that changed query
semantics. The common root cause is that a predicate like `rand() > 0.5`
has an empty input-slot set, so the `containsAll(emptySet)` / `allMatch`
guards used by these rules silently returned true and allowed unsafe
push
down / elimination.
This PR adds `containsUniqueFunction()` guards to the following rules:
1. PushDownFilterThroughRepeat: skip conjuncts with unique functions.
Pushing `rand() > x` below Repeat changes which rows feed each
grouping set and alters aggregate results.
2. PushDownFilterThroughWindow: skip conjuncts with unique functions.
Pushing a unique predicate below a window operator re-samples the
base rows and changes every window-function value.
3. PushDownFilterThroughPartitionTopN: same as Window - skip unique
conjuncts in the split loop.
4. PushDownFilterThroughSetOperation: do not push volatile conjuncts
below `UNION DISTINCT`, `INTERSECT`, or `EXCEPT`; only `UNION ALL`
keeps the original row-to-row semantics.
5. PushDownJoinOtherCondition: keep volatile ON predicates in the join
when pushing them into a single child would change evaluation from
per joined pair to per input row.
6. AddProjectForVolatileExpression: when a volatile expression is
repeated after rewrites such as BETWEEN expansion, materialize it via
a child project so repeated references share one value instead of
being re-evaluated independently.
7. InferPredicates / JoinUtils.isHashJoinCondition: add the same unique
function guards to avoid deriving or classifying unsafe predicates.
### Release note
Fix wrong results when predicates containing rand(), uuid(),
random_bytes(), or uuid_numeric() are pushed across Repeat, Window,
PartitionTopN, SetOperation, or Join boundaries, and when repeated
volatile expressions are materialized for reuse.
---------
Co-authored-by: yujun777 <[email protected]>
Co-authored-by: Copilot <[email protected]>
---
.../doris/nereids/exceptions/CastException.java | 4 +-
.../rewrite/AddProjectForVolatileExpression.java | 105 ++++++++++++++-
.../nereids/rules/rewrite/InferPredicates.java | 13 ++
...ProjectOtherJoinConditionForNestedLoopJoin.java | 8 ++
.../PushDownFilterThroughPartitionTopN.java | 8 +-
.../rewrite/PushDownFilterThroughSetOperation.java | 82 +++++++++---
.../rules/rewrite/PushDownFilterThroughWindow.java | 13 +-
.../rules/rewrite/PushDownJoinOtherCondition.java | 8 +-
.../org/apache/doris/nereids/util/JoinUtils.java | 4 +
.../AddProjectForVolatileExpressionTest.java | 149 +++++++++++++++++++--
.../rewrite/FindHashConditionForJoinTest.java | 22 +++
.../rewrite/PushDownFilterThroughWindowTest.java | 40 ++++++
.../rewrite/PushDownJoinOtherConditionTest.java | 29 ++++
.../apache/doris/nereids/util/JoinUtilsTest.java | 19 +++
.../extend_infer_equal_predicate.out | 10 +-
.../add_project_for_unique_function.out | 13 +-
...nfer_predicates_set_op_with_unique_function.out | 47 +++++++
...join_condition_for_nlj_with_unique_function.out | 36 +++++
...wn_filter_through_join_with_unique_function.out | 9 ++
...through_partition_topn_with_unique_function.out | 17 +++
..._through_set_operation_with_unique_function.out | 74 ++++++++++
..._filter_through_window_with_unique_function.out | 23 ++++
...n_join_other_condition_with_unique_function.out | 53 ++++++++
.../add_project_for_unique_function.groovy | 1 +
...r_predicates_set_op_with_unique_function.groovy | 66 +++++++++
...n_condition_for_nlj_with_unique_function.groovy | 58 ++++++++
...filter_through_join_with_unique_function.groovy | 10 ++
...ough_partition_topn_with_unique_function.groovy | 47 +++++++
...rough_set_operation_with_unique_function.groovy | 75 +++++++++++
...lter_through_window_with_unique_function.groovy | 56 ++++++++
...oin_other_condition_with_unique_function.groovy | 73 ++++++++++
31 files changed, 1121 insertions(+), 51 deletions(-)
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/exceptions/CastException.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/exceptions/CastException.java
index ab8b191217d..522062fae6e 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/nereids/exceptions/CastException.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/nereids/exceptions/CastException.java
@@ -17,8 +17,6 @@
package org.apache.doris.nereids.exceptions;
-import java.util.Optional;
-
/**
* cast exception.
*/
@@ -27,7 +25,7 @@ public class CastException extends AnalysisException {
private final String message;
public CastException(String message) {
- super(ErrorCode.NONE, message, Optional.of(0), Optional.of(0));
+ super(ErrorCode.NONE, message);
this.message = message;
}
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AddProjectForVolatileExpression.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AddProjectForVolatileExpression.java
index 7f3f0cd7684..63965a4b0db 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AddProjectForVolatileExpression.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AddProjectForVolatileExpression.java
@@ -45,12 +45,14 @@ import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Lists;
import com.google.common.collect.Maps;
+import com.google.common.collect.Sets;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Optional;
+import java.util.Set;
/** extract volatile expression which exist multiple times, and add them to a
new project child.
* for example:
@@ -207,20 +209,20 @@ public class AddProjectForVolatileExpression implements
RewriteRuleFactory {
allConjuncts.addAll(join.getHashJoinConjuncts());
allConjuncts.addAll(join.getOtherJoinConjuncts());
allConjuncts.addAll(join.getMarkJoinConjuncts());
- Optional<Pair<List<Expression>, LogicalProject<Plan>>>
rewrittenOpt
- = rewriteExpressions(join, allConjuncts);
+ Optional<JoinRewriteResult> rewrittenOpt =
rewriteJoinExpressions(join, allConjuncts);
if (!rewrittenOpt.isPresent()) {
return join;
}
- LogicalProject<Plan> newLeftChild = rewrittenOpt.get().second;
- List<Expression> newAllConjuncts = rewrittenOpt.get().first;
+ Plan newLeftChild = rewrittenOpt.get().left;
+ Plan newRightChild = rewrittenOpt.get().right;
+ List<Expression> newAllConjuncts =
rewrittenOpt.get().newConjuncts;
List<Expression> newHashOtherConjuncts =
newAllConjuncts.subList(0, hashOtherConjunctsSize);
List<Expression> newMarkJoinConjuncts = ImmutableList.copyOf(
newAllConjuncts.subList(hashOtherConjunctsSize,
totalConjunctsSize));
// TODO: code from FindHashConditionForJoin
Pair<List<Expression>, List<Expression>> pair =
JoinUtils.extractExpressionForHashTable(
- newLeftChild.getOutput(), join.right().getOutput(),
newHashOtherConjuncts);
+ newLeftChild.getOutput(), newRightChild.getOutput(),
newHashOtherConjuncts);
List<Expression> newHashJoinConjuncts = pair.first;
List<Expression> newOtherJoinConjuncts = pair.second;
JoinType joinType = join.getJoinType();
@@ -233,7 +235,7 @@ public class AddProjectForVolatileExpression implements
RewriteRuleFactory {
newMarkJoinConjuncts,
join.getDistributeHint(),
join.getMarkJoinSlotReference(),
- ImmutableList.of(newLeftChild, join.right()),
+ ImmutableList.of(newLeftChild, newRightChild),
join.getJoinReorderContext());
}).toRule(RuleType.ADD_PROJECT_FOR_VOLATILE_EXPRESSION);
}
@@ -269,6 +271,85 @@ public class AddProjectForVolatileExpression implements
RewriteRuleFactory {
return Optional.of(Pair.of(newTargetsBuilder.build(), new
LogicalProject<>(projects, plan.child(0))));
}
+ private Optional<JoinRewriteResult>
rewriteJoinExpressions(LogicalJoin<Plan, Plan> join,
+ Collection<Expression> targets) {
+ Map<Expression, Integer> volatileExpressionCounter =
Maps.newLinkedHashMap();
+ Map<Expression, Set<Slot>> volatileExpressionSlots =
Maps.newLinkedHashMap();
+ for (Expression target : targets) {
+ target.foreach(e -> {
+ Expression expr = (Expression) e;
+ if (expr.isVolatile()) {
+ volatileExpressionCounter.merge(expr, 1, Integer::sum);
+ Set<Slot> volatileInputSlots = expr.getInputSlots();
+ volatileExpressionSlots
+ .computeIfAbsent(expr, ignored ->
Sets.newLinkedHashSet())
+ .addAll(volatileInputSlots.isEmpty() ?
target.getInputSlots() : volatileInputSlots);
+ }
+ });
+ }
+
+ ImmutableList.Builder<NamedExpression> leftAliases =
ImmutableList.builder();
+ ImmutableList.Builder<NamedExpression> rightAliases =
ImmutableList.builder();
+ Map<Expression, Slot> replaceMap = Maps.newHashMap();
+ Set<Slot> leftOutputSet = join.left().getOutputSet();
+ Set<Slot> rightOutputSet = join.right().getOutputSet();
+ for (Entry<Expression, Integer> entry :
volatileExpressionCounter.entrySet()) {
+ if (entry.getValue() <= 1) {
+ continue;
+ }
+ Set<Slot> inputSlots = volatileExpressionSlots.get(entry.getKey());
+ Set<Slot> volatileInputSlots = entry.getKey().getInputSlots();
+ if (!volatileInputSlots.isEmpty()
+ && !leftOutputSet.containsAll(inputSlots)
+ && !rightOutputSet.containsAll(inputSlots)) {
+ continue;
+ }
+ ExprId exprId = StatementScopeIdGenerator.newExprId();
+ String functionName = entry.getKey() instanceof Function
+ ? ((Function) entry.getKey()).getName() : "volatile";
+ Alias alias = new Alias(exprId, entry.getKey(), "$_" +
functionName + "_" + exprId.asInt() + "_$");
+ replaceMap.put(alias.child(0), alias.toSlot());
+ // Join can not add a project at join-pair scope, but repeated
volatile expressions
+ // still need one materialized value. Slot-free volatile functions
use the containing
+ // conjunct's slots to choose a side, so t2.k + rand() can project
rand() on the right.
+ // Volatile functions with input slots use their own slots to
avoid projecting
+ // volatile_udf(t2.k) on the left only because its containing
conjunct also uses t1.
+ // Volatile functions whose own slots span both join children
cannot be projected into
+ // either child, so they are not rewritten here.
+ // Put right-only expressions on the right child; otherwise keep
the previous
+ // left-child behavior as the conservative default.
+ if (!inputSlots.isEmpty() &&
rightOutputSet.containsAll(inputSlots)) {
+ rightAliases.add(alias);
+ } else {
+ leftAliases.add(alias);
+ }
+ }
+ if (replaceMap.isEmpty()) {
+ return Optional.empty();
+ }
+
+ List<NamedExpression> leftAliasList = leftAliases.build();
+ List<NamedExpression> rightAliasList = rightAliases.build();
+ Plan left = appendProjectIfNeeded(join.left(), leftAliasList);
+ Plan right = appendProjectIfNeeded(join.right(), rightAliasList);
+ ImmutableList.Builder<Expression> newTargetsBuilder =
ImmutableList.builderWithExpectedSize(targets.size());
+ for (Expression target : targets) {
+ newTargetsBuilder.add(ExpressionUtils.replace(target, replaceMap));
+ }
+ return Optional.of(new JoinRewriteResult(newTargetsBuilder.build(),
left, right));
+ }
+
+ private Plan appendProjectIfNeeded(Plan child, List<NamedExpression>
aliases) {
+ if (aliases.isEmpty()) {
+ return child;
+ }
+ List<NamedExpression> projects =
ImmutableList.<NamedExpression>builder()
+ .addAll(child.getOutput())
+ .addAll(aliases)
+ .build();
+ return new LogicalProject<>(projects, child);
+ }
+
/**
* if a volatile expression exists multiple times in the targets, then add
a project to alias it.
*/
@@ -298,4 +379,16 @@ public class AddProjectForVolatileExpression implements
RewriteRuleFactory {
return builder.build();
}
+
+ private static class JoinRewriteResult {
+ private final List<Expression> newConjuncts;
+ private final Plan left;
+ private final Plan right;
+
+ private JoinRewriteResult(List<Expression> newConjuncts, Plan left,
Plan right) {
+ this.newConjuncts = newConjuncts;
+ this.left = left;
+ this.right = right;
+ }
+ }
}
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/InferPredicates.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/InferPredicates.java
index 370af833f98..8688db6a7d1 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/InferPredicates.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/InferPredicates.java
@@ -218,6 +218,14 @@ public class InferPredicates extends
DefaultPlanRewriter<JobContext> implements
Set<Expression> predicates = new LinkedHashSet<>();
Set<Slot> planOutputs = plan.getOutputSet();
for (Expression expr : expressions) {
+ if (expr.containsVolatileExpression()) {
+ // Volatile expressions (e.g. rand(), uuid()) must not be
cloned into
+ // subtrees that did not already evaluate them. Otherwise,
callers that perform
+ // slot substitution (e.g. SetOp visitors below) would
introduce a fresh
+ // per-row evaluation of the volatile expression on a sibling
branch, changing
+ // query semantics (see EXCEPT/INTERSECT regression cases).
+ continue;
+ }
Set<Slot> slots = expr.getInputSlots();
if (!slots.isEmpty() && planOutputs.containsAll(slots)) {
predicates.add(expr);
@@ -242,6 +250,11 @@ public class InferPredicates extends
DefaultPlanRewriter<JobContext> implements
Set<Expression> predicates = new LinkedHashSet<>();
Set<Slot> planOutputs = plan.getOutputSet();
for (Expression expr : expressions) {
+ if (expr.containsVolatileExpression()) {
+ // See inferNewPredicate for rationale: never clone volatile
+ // predicates into a subtree that did not already evaluate
them.
+ continue;
+ }
Set<Slot> slots = expr.getInputSlots();
if (slots.isEmpty() || !planOutputs.containsAll(slots)) {
continue;
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/ProjectOtherJoinConditionForNestedLoopJoin.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/ProjectOtherJoinConditionForNestedLoopJoin.java
index 7d28269005c..ee3944eae05 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/ProjectOtherJoinConditionForNestedLoopJoin.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/ProjectOtherJoinConditionForNestedLoopJoin.java
@@ -114,6 +114,14 @@ public class ProjectOtherJoinConditionForNestedLoopJoin
extends OneRewriteRuleFa
if (input.isEmpty() || expression instanceof Slot) {
return expression;
}
+ // A mixed expression like `t1.a + rand() > t2.b` has
inputSlots={t1.a}; if we alias
+ // it into a child Project, rand()'s evaluation granularity
changes from "per join
+ // pair" to "per row of that child", which silently changes
results. Keep such
+ // expressions inline in otherJoinConjuncts, but still recurse to
extract deterministic
+ // child expressions.
+ if (expression.containsVolatileExpression()) {
+ return super.visit(expression, ctx);
+ }
if (ctx.leftSlots.containsAll(input)) {
Alias alias = ctx.aliasMap.computeIfAbsent(expression, o ->
new Alias(o));
ctx.leftAlias.add(alias);
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughPartitionTopN.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughPartitionTopN.java
index 312c2bdac32..5c5275730a1 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughPartitionTopN.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughPartitionTopN.java
@@ -71,7 +71,13 @@ public class PushDownFilterThroughPartitionTopN extends
OneRewriteRuleFactory {
}
for (Expression expr : filter.getConjuncts()) {
Set<Slot> exprInputSlots = expr.getInputSlots();
- if (partitionKeySlots.containsAll(exprInputSlots)) {
+ // A conjunct containing a volatile function such as
rand()/uuid()
+ // must NOT be pushed below the partition top-N. It would
filter base rows before
+ // top-N selection, replacing "top-N then random filter" with
"random filter then
+ // top-N", and the surviving rows of every partition would no
longer be the true
+ // top-N. Empty-input-slot predicates like `rand() > 0.5`
would also bypass the
+ // `containsAll` check otherwise.
+ if (!expr.containsVolatileExpression() &&
partitionKeySlots.containsAll(exprInputSlots)) {
bottomConjunctsBuilder.add(expr);
} else {
upperConjunctsBuilder.add(expr);
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughSetOperation.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughSetOperation.java
index ed45be777a4..85d78be1aef 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughSetOperation.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughSetOperation.java
@@ -27,6 +27,7 @@ import
org.apache.doris.nereids.trees.expressions.NamedExpression;
import org.apache.doris.nereids.trees.expressions.SlotReference;
import org.apache.doris.nereids.trees.plans.Plan;
import org.apache.doris.nereids.trees.plans.algebra.EmptyRelation;
+import org.apache.doris.nereids.trees.plans.algebra.SetOperation.Qualifier;
import org.apache.doris.nereids.trees.plans.logical.LogicalEmptyRelation;
import org.apache.doris.nereids.trees.plans.logical.LogicalFilter;
import org.apache.doris.nereids.trees.plans.logical.LogicalOneRowRelation;
@@ -44,6 +45,7 @@ import com.google.common.collect.Lists;
import java.util.ArrayList;
import java.util.HashMap;
+import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
@@ -61,11 +63,50 @@ public class PushDownFilterThroughSetOperation extends
OneRewriteRuleFactory {
.when(s -> s.arity() > 0
|| (s instanceof LogicalUnion && !((LogicalUnion)
s).getConstantExprsList().isEmpty())))
.thenApply(ctx -> {
- LogicalFilter<LogicalSetOperation> filter = ctx.root;
- LogicalSetOperation setOperation = filter.child();
+ LogicalFilter<LogicalSetOperation> origFilter = ctx.root;
+ LogicalSetOperation setOperation = origFilter.child();
+
+ // Pushing a conjunct that contains a volatile expression
(rand/uuid/random_bytes/...)
+ // into each branch changes semantics for every set-op except
UNION ALL.
+ // - UNION ALL: each branch row = exactly one output row
(1:1), so evaluating
+ // rand() once per branch row still matches the
per-output-row semantic.
+ // - UNION DISTINCT / INTERSECT / EXCEPT: the set-op semantics
depend on the
+ // full branch row sets before dedup/intersect/except.
Sampling rows in each
+ // branch independently changes which rows participate (e.g.
INTERSECT becomes
+ // "half of A intersect half of B" instead of "half of (A
intersect B)").
+ boolean canPushVolatileExpr = setOperation instanceof
LogicalUnion
+ && setOperation.getQualifier() == Qualifier.ALL;
+ Set<Expression> pushableConjuncts;
+ Set<Expression> keptAboveConjuncts;
+ boolean allConjunctsPushable;
+ if (canPushVolatileExpr) {
+ pushableConjuncts = origFilter.getConjuncts();
+ keptAboveConjuncts = ImmutableSet.of();
+ allConjunctsPushable = true;
+ } else {
+ pushableConjuncts = new LinkedHashSet<>();
+ Set<Expression> kept = new LinkedHashSet<>();
+ for (Expression c : origFilter.getConjuncts()) {
+ if (c.containsVolatileExpression()) {
+ kept.add(c);
+ } else {
+ pushableConjuncts.add(c);
+ }
+ }
+ keptAboveConjuncts = kept;
+ if (pushableConjuncts.isEmpty()) {
+ return null;
+ }
+ allConjunctsPushable = false;
+ }
+ LogicalFilter<LogicalSetOperation> filter =
allConjunctsPushable
+ ? origFilter
+ : new
LogicalFilter<>(ImmutableSet.copyOf(pushableConjuncts), setOperation);
+
List<Plan> newChildren = new ArrayList<>();
List<List<SlotReference>> newRegularChildrenOutputs =
Lists.newArrayList();
CascadesContext cascadesContext = ctx.cascadesContext;
+ Plan rewritten;
if (setOperation instanceof LogicalUnion) {
List<List<NamedExpression>> constantExprs =
((LogicalUnion) setOperation).getConstantExprsList();
StatementContext statementContext = ctx.statementContext;
@@ -85,7 +126,7 @@ public class PushDownFilterThroughSetOperation extends
OneRewriteRuleFactory {
List<NamedExpression> setOutputs =
setOperation.getOutputs();
if (newChildren.isEmpty() && newConstantExprs.isEmpty()) {
- return new LogicalEmptyRelation(
+ rewritten = new LogicalEmptyRelation(
statementContext.getNextRelationId(),
setOutputs
);
} else if (newChildren.isEmpty() &&
newConstantExprs.size() == 1) {
@@ -104,27 +145,32 @@ public class PushDownFilterThroughSetOperation extends
OneRewriteRuleFactory {
}
newOneRowRelationOutput.add(oneRowRelationOutput);
}
- return new LogicalOneRowRelation(
+ rewritten = new LogicalOneRowRelation(
ctx.statementContext.getNextRelationId(),
newOneRowRelationOutput.build()
);
- }
+ } else {
+ Builder<List<SlotReference>> newChildrenOutput
+ =
ImmutableList.builderWithExpectedSize(newChildren.size());
+ for (Plan newChild : newChildren) {
+ newChildrenOutput.add((List) newChild.getOutput());
+ }
- Builder<List<SlotReference>> newChildrenOutput
- =
ImmutableList.builderWithExpectedSize(newChildren.size());
- for (Plan newChild : newChildren) {
- newChildrenOutput.add((List) newChild.getOutput());
+ rewritten = ((LogicalUnion)
setOperation).withChildrenAndConstExprsList(
+ newChildren, newRegularChildrenOutputs,
newConstantExprs);
}
-
- return ((LogicalUnion)
setOperation).withChildrenAndConstExprsList(
- newChildren, newRegularChildrenOutputs,
newConstantExprs);
+ } else {
+ addFiltersToNewChildren(setOperation, filter,
setOperation.children(),
+ setOperation.getRegularChildrenOutputs(),
+ cascadesContext, newChildren,
newRegularChildrenOutputs, null,
+ (rowIndex, columnIndex) ->
setOperation.getRegularChildOutput(rowIndex).get(columnIndex),
+ Function.identity());
+ rewritten = setOperation.withChildren(newChildren);
}
- addFiltersToNewChildren(setOperation, filter,
setOperation.children(),
- setOperation.getRegularChildrenOutputs(),
- cascadesContext, newChildren,
newRegularChildrenOutputs, null,
- (rowIndex, columnIndex) ->
setOperation.getRegularChildOutput(rowIndex).get(columnIndex),
- Function.identity());
- return setOperation.withChildren(newChildren);
+ if (keptAboveConjuncts.isEmpty()) {
+ return rewritten;
+ }
+ return new
LogicalFilter<>(ImmutableSet.copyOf(keptAboveConjuncts), rewritten);
}).toRule(RuleType.PUSH_DOWN_FILTER_THROUGH_SET_OPERATION);
}
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughWindow.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughWindow.java
index 678e19da169..3fc7c0b8dfa 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughWindow.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughWindow.java
@@ -86,7 +86,18 @@ public class PushDownFilterThroughWindow extends
OneRewriteRuleFactory {
}).toRule(RuleType.PUSH_DOWN_FILTER_THROUGH_WINDOW);
}
+ /**
+ * Returns whether {@code conjunct} can be pushed below a window operator
with the given
+ * common partition keys.
+ */
public static boolean canPushDown(Expression conjunct, Set<SlotReference>
commonPartitionKeys) {
- return commonPartitionKeys.containsAll(conjunct.getInputSlots());
+ // A conjunct that contains a volatile function such as rand()/uuid()
+ // must NOT be pushed below the window node. Pushing it down filters
base rows before
+ // window evaluation, which changes which rows belong to each
partition and therefore
+ // changes the value of every window function (row_number, rank, sum,
...). In addition,
+ // a predicate like `rand() > 0.5` has empty input slots, so
`containsAll(emptySet)`
+ // would otherwise wrongly return true.
+ return !conjunct.containsVolatileExpression()
+ && commonPartitionKeys.containsAll(conjunct.getInputSlots());
}
}
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownJoinOtherCondition.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownJoinOtherCondition.java
index 75c5e3085d3..994060da011 100644
---
a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownJoinOtherCondition.java
+++
b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownJoinOtherCondition.java
@@ -75,7 +75,13 @@ public class PushDownJoinOtherCondition extends
OneRewriteRuleFactory {
Set<Expression> rightConjuncts = Sets.newHashSet();
for (Expression otherConjunct : otherJoinConjuncts) {
- if
(PUSH_DOWN_LEFT_VALID_TYPE.contains(join.getJoinType())
+ // Keep volatile ON predicates in otherJoinConjuncts.
Pushing them into a
+ // child changes their evaluation granularity from per
joined row to per
+ // input row. Repeated volatile occurrences are
materialized later by
+ // AddProjectForVolatileExpression.
+ if (otherConjunct.containsVolatileExpression()) {
+ remainingOther.add(otherConjunct);
+ } else if
(PUSH_DOWN_LEFT_VALID_TYPE.contains(join.getJoinType())
&& allCoveredBy(otherConjunct,
join.left().getOutputSet())) {
leftConjuncts.add(otherConjunct);
} else if
(PUSH_DOWN_RIGHT_VALID_TYPE.contains(join.getJoinType())
diff --git
a/fe/fe-core/src/main/java/org/apache/doris/nereids/util/JoinUtils.java
b/fe/fe-core/src/main/java/org/apache/doris/nereids/util/JoinUtils.java
index 520faf7acca..27ca210ee81 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/util/JoinUtils.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/util/JoinUtils.java
@@ -122,6 +122,10 @@ public class JoinUtils {
* @return true if the equal can be used as hash join condition
*/
public boolean isHashJoinCondition(EqualPredicate equal) {
+ if (equal.containsVolatileExpression()) {
+ return false;
+ }
+
Set<ExprId> equalLeftExprIds = equal.left().getInputSlotExprIds();
if (equalLeftExprIds.isEmpty()) {
return false;
diff --git
a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/AddProjectForVolatileExpressionTest.java
b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/AddProjectForVolatileExpressionTest.java
index 48b5bc8ad2b..ba8df372028 100644
---
a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/AddProjectForVolatileExpressionTest.java
+++
b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/AddProjectForVolatileExpressionTest.java
@@ -26,6 +26,8 @@ import org.apache.doris.nereids.trees.expressions.Add;
import org.apache.doris.nereids.trees.expressions.Alias;
import org.apache.doris.nereids.trees.expressions.EqualTo;
import org.apache.doris.nereids.trees.expressions.Expression;
+import org.apache.doris.nereids.trees.expressions.GreaterThanEqual;
+import org.apache.doris.nereids.trees.expressions.LessThanEqual;
import org.apache.doris.nereids.trees.expressions.NamedExpression;
import org.apache.doris.nereids.trees.expressions.SlotReference;
import org.apache.doris.nereids.trees.expressions.StatementScopeIdGenerator;
@@ -50,6 +52,7 @@ import com.google.common.collect.ImmutableList;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;
+import java.util.Collections;
import java.util.List;
import java.util.Optional;
@@ -148,27 +151,157 @@ public class AddProjectForVolatileExpressionTest
implements MemoPatternMatchSupp
ImmutableList.of(studentOlapScan, scoreOlapScan),
null);
+ Plan root = PlanChecker.from(MemoTestUtils.createConnectContext(),
join)
+ .applyTopDown(new AddProjectForVolatileExpression())
+ .getPlan();
+ Assertions.assertInstanceOf(LogicalJoin.class, root);
+ LogicalJoin<?, ?> newJoin = (LogicalJoin<?, ?>) root;
+ Assertions.assertEquals(studentOlapScan, newJoin.left());
+ Assertions.assertInstanceOf(LogicalProject.class, newJoin.right());
+ LogicalProject<?> rightProject = (LogicalProject<?>) newJoin.right();
+ Assertions.assertEquals(scoreOlapScan, rightProject.child());
+ Alias alias = (Alias)
rightProject.getProjects().get(rightProject.getProjects().size() - 1);
+ Assertions.assertEquals(alias.child(), random);
+ Assertions.assertEquals(ImmutableList.of(),
newJoin.getHashJoinConjuncts());
+ Assertions.assertEquals(ImmutableList.of(new EqualTo(alias.toSlot(),
sid)), newJoin.getOtherJoinConjuncts());
+ Assertions.assertEquals(ImmutableList.of(new EqualTo(alias.toSlot(),
new DoubleLiteral(1.0))), newJoin.getMarkJoinConjuncts());
+ Assertions.assertEquals(JoinType.CROSS_JOIN, newJoin.getJoinType());
+ }
+
+ @Test
+ void testRewriteJoinProjectRepeatedVolatileToRightSide() {
+ LogicalOlapScan scoreOlapScan
+ = new
LogicalOlapScan(StatementScopeIdGenerator.newRelationId(),
PlanConstructor.score);
+ SlotReference score = (SlotReference) scoreOlapScan.getOutput().get(2);
+ Random random = new Random();
+ Add repeated = new Add(score, random);
+ LogicalJoin<?, ?> join = new LogicalJoin<Plan,
Plan>(JoinType.INNER_JOIN,
+ ImmutableList.of(),
+ ImmutableList.of(
+ new GreaterThanEqual(repeated, new DoubleLiteral(0.1)),
+ new LessThanEqual(repeated, new DoubleLiteral(0.5))),
+ new DistributeHint(DistributeType.NONE),
+ Optional.empty(),
+ studentOlapScan,
+ scoreOlapScan,
+ null);
+
+ Plan root = PlanChecker.from(MemoTestUtils.createConnectContext(),
join)
+ .applyTopDown(new AddProjectForVolatileExpression())
+ .getPlan();
+ Assertions.assertInstanceOf(LogicalJoin.class, root);
+ LogicalJoin<?, ?> newJoin = (LogicalJoin<?, ?>) root;
+ Assertions.assertEquals(studentOlapScan, newJoin.left());
+ Assertions.assertInstanceOf(LogicalProject.class, newJoin.right());
+ LogicalProject<?> rightProject = (LogicalProject<?>) newJoin.right();
+ Assertions.assertEquals(scoreOlapScan, rightProject.child());
+ Alias alias = (Alias)
rightProject.getProjects().get(rightProject.getProjects().size() - 1);
+ Assertions.assertEquals(random, alias.child());
+ Assertions.assertTrue(newJoin.getOtherJoinConjuncts().stream()
+ .allMatch(conjunct ->
conjunct.anyMatch(alias.toSlot()::equals)));
+ }
+
+ @Test
+ void testRewriteJoinProjectRepeatedVolatileToLeftSideByDefault() {
+ LogicalOlapScan scoreOlapScan
+ = new
LogicalOlapScan(StatementScopeIdGenerator.newRelationId(),
PlanConstructor.score);
+ Random random = new Random();
+ LogicalJoin<?, ?> join = new LogicalJoin<Plan,
Plan>(JoinType.INNER_JOIN,
+ ImmutableList.of(),
+ ImmutableList.of(
+ new GreaterThanEqual(random, new DoubleLiteral(0.1)),
+ new LessThanEqual(random, new DoubleLiteral(0.5))),
+ new DistributeHint(DistributeType.NONE),
+ Optional.empty(),
+ studentOlapScan,
+ scoreOlapScan,
+ null);
+
Plan root = PlanChecker.from(MemoTestUtils.createConnectContext(),
join)
.applyTopDown(new AddProjectForVolatileExpression())
.getPlan();
Assertions.assertInstanceOf(LogicalJoin.class, root);
LogicalJoin<?, ?> newJoin = (LogicalJoin<?, ?>) root;
Assertions.assertInstanceOf(LogicalProject.class, newJoin.left());
+ Assertions.assertEquals(scoreOlapScan, newJoin.right());
LogicalProject<?> leftProject = (LogicalProject<?>) newJoin.left();
Assertions.assertEquals(studentOlapScan, leftProject.child());
- Assertions.assertEquals(scoreOlapScan, newJoin.right());
Alias alias = (Alias)
leftProject.getProjects().get(leftProject.getProjects().size() - 1);
- Assertions.assertEquals(alias.child(), random);
- Assertions.assertEquals(ImmutableList.of(new EqualTo(alias.toSlot(),
sid)), newJoin.getHashJoinConjuncts());
- Assertions.assertEquals(ImmutableList.of(),
newJoin.getOtherJoinConjuncts());
- Assertions.assertEquals(ImmutableList.of(new EqualTo(alias.toSlot(),
new DoubleLiteral(1.0))), newJoin.getMarkJoinConjuncts());
- Assertions.assertEquals(JoinType.INNER_JOIN, newJoin.getJoinType());
+ Assertions.assertEquals(random, alias.child());
+ Assertions.assertTrue(newJoin.getOtherJoinConjuncts().stream()
+ .allMatch(conjunct ->
conjunct.anyMatch(alias.toSlot()::equals)));
+ }
+
+ @Test
+ void
testRewriteJoinProjectRepeatedVolatileFunctionWithRightInputToRightSide() {
+ LogicalOlapScan scoreOlapScan
+ = new
LogicalOlapScan(StatementScopeIdGenerator.newRelationId(),
PlanConstructor.score);
+ SlotReference studentId = (SlotReference)
studentOlapScan.getOutput().get(0);
+ SlotReference scoreId = (SlotReference)
scoreOlapScan.getOutput().get(0);
+ JavaUdf volatileUdf = javaUdf(FunctionVolatility.VOLATILE,
+ VolatileIdentity.newVolatileIdentity(), scoreId);
+ Add repeated = new Add(studentId, volatileUdf);
+ LogicalJoin<?, ?> join = new LogicalJoin<Plan,
Plan>(JoinType.INNER_JOIN,
+ ImmutableList.of(),
+ ImmutableList.of(
+ new GreaterThanEqual(repeated, new DoubleLiteral(0.1)),
+ new LessThanEqual(repeated, new DoubleLiteral(0.5))),
+ new DistributeHint(DistributeType.NONE),
+ Optional.empty(),
+ studentOlapScan,
+ scoreOlapScan,
+ null);
+
+ Plan root = PlanChecker.from(MemoTestUtils.createConnectContext(),
join)
+ .applyTopDown(new AddProjectForVolatileExpression())
+ .getPlan();
+ Assertions.assertInstanceOf(LogicalJoin.class, root);
+ LogicalJoin<?, ?> newJoin = (LogicalJoin<?, ?>) root;
+ Assertions.assertEquals(studentOlapScan, newJoin.left());
+ Assertions.assertInstanceOf(LogicalProject.class, newJoin.right());
+ LogicalProject<?> rightProject = (LogicalProject<?>) newJoin.right();
+ Assertions.assertEquals(scoreOlapScan, rightProject.child());
+ Alias alias = (Alias)
rightProject.getProjects().get(rightProject.getProjects().size() - 1);
+ Assertions.assertEquals(volatileUdf, alias.child());
+ Assertions.assertTrue(newJoin.getOtherJoinConjuncts().stream()
+ .allMatch(conjunct ->
conjunct.anyMatch(alias.toSlot()::equals)));
+ }
+
+ @Test
+ void testRewriteJoinSkipRepeatedVolatileFunctionWithBothSideInputs() {
+ LogicalOlapScan scoreOlapScan
+ = new
LogicalOlapScan(StatementScopeIdGenerator.newRelationId(),
PlanConstructor.score);
+ SlotReference studentId = (SlotReference)
studentOlapScan.getOutput().get(0);
+ SlotReference scoreId = (SlotReference)
scoreOlapScan.getOutput().get(0);
+ JavaUdf volatileUdf = javaUdf(FunctionVolatility.VOLATILE,
+ VolatileIdentity.newVolatileIdentity(), studentId, scoreId);
+ LogicalJoin<?, ?> join = new LogicalJoin<Plan,
Plan>(JoinType.INNER_JOIN,
+ ImmutableList.of(),
+ ImmutableList.of(
+ new GreaterThanEqual(volatileUdf, new
DoubleLiteral(0.1)),
+ new LessThanEqual(volatileUdf, new
DoubleLiteral(0.5))),
+ new DistributeHint(DistributeType.NONE),
+ Optional.empty(),
+ studentOlapScan,
+ scoreOlapScan,
+ null);
+
+ Plan root = PlanChecker.from(MemoTestUtils.createConnectContext(),
join)
+ .applyTopDown(new AddProjectForVolatileExpression())
+ .getPlan();
+ Assertions.assertEquals(join, root);
}
private JavaUdf javaUdf(FunctionVolatility volatility, VolatileIdentity
volatileIdentity) {
+ return javaUdf(volatility, volatileIdentity, new IntegerLiteral(1));
+ }
+
+ private JavaUdf javaUdf(FunctionVolatility volatility, VolatileIdentity
volatileIdentity,
+ Expression... arguments) {
return new JavaUdf("java_fn", 1, "db1",
org.apache.doris.catalog.Function.BinaryType.JAVA_UDF,
-
FunctionSignature.ret(IntegerType.INSTANCE).args(IntegerType.INSTANCE),
+ FunctionSignature.ret(IntegerType.INSTANCE).args(
+ Collections.nCopies(arguments.length,
IntegerType.INSTANCE).toArray(new IntegerType[0])),
NullableMode.ALWAYS_NULLABLE, volatility, volatileIdentity,
- null, "evaluate", null, null, "", false, 360, new
IntegerLiteral(1));
+ null, "evaluate", null, null, "", false, 360, arguments);
}
}
diff --git
a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/FindHashConditionForJoinTest.java
b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/FindHashConditionForJoinTest.java
index 15af3122189..8f0c997a9ca 100644
---
a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/FindHashConditionForJoinTest.java
+++
b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/FindHashConditionForJoinTest.java
@@ -24,6 +24,7 @@ import org.apache.doris.nereids.trees.expressions.Expression;
import org.apache.doris.nereids.trees.expressions.LessThan;
import org.apache.doris.nereids.trees.expressions.Or;
import org.apache.doris.nereids.trees.expressions.Slot;
+import org.apache.doris.nereids.trees.expressions.functions.scalar.Random;
import org.apache.doris.nereids.trees.expressions.literal.IntegerLiteral;
import org.apache.doris.nereids.trees.plans.DistributeType;
import org.apache.doris.nereids.trees.plans.JoinType;
@@ -116,4 +117,25 @@ class FindHashConditionForJoinTest implements
MemoPatternMatchSupported {
.when(j -> j.getJoinType().isInnerJoin())
);
}
+
+ @Test
+ void testDoNotExtractVolatileEqualPredicateAsHashCondition() {
+ Slot studentId = studentScan.getOutput().get(0);
+ Slot sid = scoreScan.getOutput().get(0);
+
+ Expression deterministicEq = new EqualTo(studentId, sid);
+ Expression volatileEq = new EqualTo(studentId, new Add(sid, new
Random()));
+
+ LogicalJoin join = new LogicalJoin<>(JoinType.INNER_JOIN, new
ArrayList<>(),
+ ImmutableList.of(deterministicEq, volatileEq),
+ new DistributeHint(DistributeType.NONE), Optional.empty(),
studentScan, scoreScan, null);
+
+ PlanChecker.from(MemoTestUtils.createConnectContext(), join)
+ .applyTopDown(new FindHashConditionForJoin())
+ .matches(
+ logicalJoin()
+ .when(j ->
j.getHashJoinConjuncts().equals(ImmutableList.of(deterministicEq)))
+ .when(j ->
j.getOtherJoinConjuncts().equals(ImmutableList.of(volatileEq)))
+ );
+ }
}
diff --git
a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughWindowTest.java
b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughWindowTest.java
index 3c367550cfe..af594ff3e38 100644
---
a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughWindowTest.java
+++
b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughWindowTest.java
@@ -21,10 +21,12 @@ import org.apache.doris.nereids.rules.RuleType;
import org.apache.doris.nereids.trees.expressions.Alias;
import org.apache.doris.nereids.trees.expressions.EqualTo;
import org.apache.doris.nereids.trees.expressions.Expression;
+import org.apache.doris.nereids.trees.expressions.GreaterThan;
import org.apache.doris.nereids.trees.expressions.NamedExpression;
import org.apache.doris.nereids.trees.expressions.StatementScopeIdGenerator;
import org.apache.doris.nereids.trees.expressions.WindowExpression;
import org.apache.doris.nereids.trees.expressions.WindowFrame;
+import org.apache.doris.nereids.trees.expressions.functions.scalar.Random;
import org.apache.doris.nereids.trees.expressions.functions.window.RowNumber;
import org.apache.doris.nereids.trees.expressions.literal.Literal;
import org.apache.doris.nereids.trees.plans.logical.LogicalOlapScan;
@@ -92,6 +94,44 @@ class PushDownFilterThroughWindowTest extends
TestWithFeService implements MemoP
);
}
+ /**
+ * A filter conjunct containing a unique (non-idempotent) function such as
rand() / uuid()
+ * must NOT be pushed below a window node, even when its input slots are a
subset of the
+ * common partition keys (note: rand()>0.5 has empty input slots, so
the legacy
+ * {@code containsAll(emptySet)} check would otherwise wrongly let it
through). Pushing
+ * such a predicate down would filter base rows before window evaluation
and therefore
+ * change which rows belong to each partition, altering every window
function's result.
+ */
+ @Test
+ void testDoNotPushUniqueFunctionThroughWindow() {
+ ConnectContext context = MemoTestUtils.createConnectContext();
+ NamedExpression age = scan.getOutput().get(3).toSlot();
+ List<Expression> partitionKeyList = ImmutableList.of(age);
+ WindowFrame windowFrame = new
WindowFrame(WindowFrame.FrameUnitsType.ROWS,
+ WindowFrame.FrameBoundary.newPrecedingBoundary(),
+ WindowFrame.FrameBoundary.newCurrentRowBoundary());
+ WindowExpression window1 = new WindowExpression(new RowNumber(),
partitionKeyList,
+ Lists.newArrayList(), windowFrame);
+ Alias windowAlias1 = new Alias(window1, window1.toSql());
+ List<NamedExpression> expressions = Lists.newArrayList(windowAlias1);
+ LogicalWindow<LogicalOlapScan> window = new
LogicalWindow<>(expressions, scan);
+ // rand() > 0 — empty input slots; would have satisfied
containsAll(empty) check.
+ Expression filterPredicate = new GreaterThan(new Random(),
Literal.of(0));
+
+ LogicalPlan plan = new LogicalPlanBuilder(window)
+ .filter(filterPredicate)
+ .build();
+ PlanChecker.from(context, plan)
+ .applyTopDown(new PushDownFilterThroughWindow())
+ .matchesFromRoot(
+ logicalFilter(
+ logicalWindow(
+ logicalOlapScan()
+ )
+ ).when(f -> f.getConjuncts().contains(filterPredicate))
+ );
+ }
+
@Test
public void testPushDownFilter() throws Exception {
String db = "test";
diff --git
a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PushDownJoinOtherConditionTest.java
b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PushDownJoinOtherConditionTest.java
index 20b20f180d0..ae7e2969926 100644
---
a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PushDownJoinOtherConditionTest.java
+++
b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PushDownJoinOtherConditionTest.java
@@ -17,9 +17,11 @@
package org.apache.doris.nereids.rules.rewrite;
+import org.apache.doris.nereids.trees.expressions.Add;
import org.apache.doris.nereids.trees.expressions.Expression;
import org.apache.doris.nereids.trees.expressions.GreaterThan;
import org.apache.doris.nereids.trees.expressions.Slot;
+import org.apache.doris.nereids.trees.expressions.functions.scalar.Random;
import org.apache.doris.nereids.trees.expressions.literal.Literal;
import org.apache.doris.nereids.trees.plans.JoinType;
import org.apache.doris.nereids.trees.plans.logical.LogicalOlapScan;
@@ -250,4 +252,31 @@ class PushDownJoinOtherConditionTest implements
MemoPatternMatchSupported {
logicalOlapScan(),
logicalOlapScan()));
}
+
+ @Test
+ void doNotPushUniqueFunctionConjunct() {
+ // Volatile predicates must not be pushed to children, because that
changes
+ // their evaluation granularity from per joined row to per input row.
+ Expression deterministic = new GreaterThan(rStudentSlots.get(1),
Literal.of(18));
+ Expression uniqueFn = new GreaterThan(new Random(), Literal.of(0.5));
+ Expression sideSpecificUniqueFn = new GreaterThan(
+ new Add(rScoreSlots.get(2), new Random()), Literal.of(0.5));
+ List<Expression> condition = ImmutableList.of(deterministic, uniqueFn,
sideSpecificUniqueFn);
+
+ LogicalPlan root = new LogicalPlanBuilder(rStudent)
+ .join(rScore, JoinType.INNER_JOIN,
ExpressionUtils.EMPTY_CONDITION, condition)
+ .project(Lists.newArrayList())
+ .build();
+
+ // Deterministic conjunct is pushed to left; unique-function conjuncts
stay in join.
+ PlanChecker.from(MemoTestUtils.createConnectContext(), root)
+ .applyTopDown(new PushDownJoinOtherCondition())
+ .matches(
+ logicalJoin(
+ logicalFilter().when(
+ filter ->
filter.getConjuncts().equals(ImmutableSet.of(deterministic))),
+ logicalOlapScan())
+ .when(join ->
join.getOtherJoinConjuncts().equals(
+ ImmutableList.of(uniqueFn,
sideSpecificUniqueFn))));
+ }
}
diff --git
a/fe/fe-core/src/test/java/org/apache/doris/nereids/util/JoinUtilsTest.java
b/fe/fe-core/src/test/java/org/apache/doris/nereids/util/JoinUtilsTest.java
index c1e9c77252a..9d634d58ab9 100644
--- a/fe/fe-core/src/test/java/org/apache/doris/nereids/util/JoinUtilsTest.java
+++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/util/JoinUtilsTest.java
@@ -22,10 +22,12 @@ import org.apache.doris.catalog.ColocateTableIndex.GroupId;
import org.apache.doris.catalog.Env;
import org.apache.doris.nereids.properties.DistributionSpecHash;
import org.apache.doris.nereids.properties.DistributionSpecHash.ShuffleType;
+import org.apache.doris.nereids.trees.expressions.Add;
import org.apache.doris.nereids.trees.expressions.EqualTo;
import org.apache.doris.nereids.trees.expressions.ExprId;
import org.apache.doris.nereids.trees.expressions.Expression;
import org.apache.doris.nereids.trees.expressions.SlotReference;
+import org.apache.doris.nereids.trees.expressions.functions.scalar.Random;
import org.apache.doris.nereids.types.TinyIntType;
import org.apache.doris.qe.ConnectContext;
@@ -40,6 +42,23 @@ import java.util.List;
public class JoinUtilsTest {
+ @Test
+ public void testVolatileEqualPredicateIsNotHashCondition() {
+ SlotReference leftKey = new SlotReference(new ExprId(1), "c1",
+ TinyIntType.INSTANCE, false, Lists.newArrayList());
+ SlotReference rightKey = new SlotReference(new ExprId(2), "c2",
+ TinyIntType.INSTANCE, false, Lists.newArrayList());
+ EqualTo equalTo = new EqualTo(leftKey, new Add(rightKey, new
Random()));
+
+ JoinUtils.JoinSlotCoverageChecker checker = new
JoinUtils.JoinSlotCoverageChecker(
+ Lists.newArrayList(leftKey), Lists.newArrayList(rightKey));
+
+ Assertions.assertTrue(equalTo.containsVolatileExpression());
+ Assertions.assertFalse(checker.isHashJoinCondition(equalTo));
+ Assertions.assertTrue(JoinUtils.extractExpressionForHashTable(
+ Lists.newArrayList(leftKey), Lists.newArrayList(rightKey),
Lists.newArrayList(equalTo)).first.isEmpty());
+ }
+
@Test
public void testCouldColocateJoinForSameTable() {
ConnectContext ctx = new ConnectContext();
diff --git
a/regression-test/data/nereids_rules_p0/infer_predicate/extend_infer_equal_predicate.out
b/regression-test/data/nereids_rules_p0/infer_predicate/extend_infer_equal_predicate.out
index c646bf2e485..450cda423d8 100644
---
a/regression-test/data/nereids_rules_p0/infer_predicate/extend_infer_equal_predicate.out
+++
b/regression-test/data/nereids_rules_p0/infer_predicate/extend_infer_equal_predicate.out
@@ -206,16 +206,14 @@ PhysicalResultSink
-- !test_random_nest_predicate --
PhysicalResultSink
---hashJoin[INNER_JOIN] hashCondition=((t1.d_datetimev2 = t2.d_datetimev2))
otherCondition=()
-----filter((dayofmonth(hours_add(convert_tz(d_datetimev2, 'Asia/Shanghai',
'Europe/Paris'), cast(random(1, 10) as INT))) > 10))
-------PhysicalOlapScan[extend_infer_t1(t1)]
+--hashJoin[INNER_JOIN] hashCondition=((t1.d_datetimev2 = t2.d_datetimev2))
otherCondition=((dayofmonth(hours_add(convert_tz(d_datetimev2, 'Asia/Shanghai',
'Europe/Paris'), cast(random(1, 10) as INT))) > 10))
+----PhysicalOlapScan[extend_infer_t1(t1)]
----PhysicalOlapScan[extend_infer_t2(t2)]
-- !test_random_predicate --
PhysicalResultSink
---hashJoin[INNER_JOIN] hashCondition=((t1.a = t2.a)) otherCondition=()
-----filter((cast(a as DOUBLE) > random(10)))
-------PhysicalOlapScan[extend_infer_t3(t1)]
+--hashJoin[INNER_JOIN] hashCondition=((t1.a = t2.a)) otherCondition=((cast(a
as DOUBLE) > random(10)))
+----PhysicalOlapScan[extend_infer_t3(t1)]
----PhysicalOlapScan[extend_infer_t4(t2)]
-- !test_predicate_map --
diff --git
a/regression-test/data/nereids_rules_p0/unique_function/add_project_for_unique_function.out
b/regression-test/data/nereids_rules_p0/unique_function/add_project_for_unique_function.out
index f003acf1caa..946af4a2f66 100644
---
a/regression-test/data/nereids_rules_p0/unique_function/add_project_for_unique_function.out
+++
b/regression-test/data/nereids_rules_p0/unique_function/add_project_for_unique_function.out
@@ -180,12 +180,11 @@ PhysicalResultSink
PhysicalResultSink
--PhysicalProject[t1.id, t1.msg, t2.id, t2.msg]
----NestedLoopJoin[INNER_JOIN](((cast(id as BIGINT) + cast(id as BIGINT)) +
$_random_9_$) >= 10)(((cast(id as BIGINT) + cast(id as BIGINT)) + $_random_9_$)
<= 20)
-------PhysicalProject[cast(id as BIGINT) AS `cast(id as BIGINT)`, random(1,
100) AS `$_random_9_$`, t1.id, t1.msg]
---------filter(($_random_10_$ <= 10) and ($_random_10_$ >= 1))
-----------PhysicalProject[random(1, 100) AS `$_random_10_$`, t1.id, t1.msg]
+------PhysicalProject[$_random_9_$, cast(id as BIGINT), t1.id, t1.msg]
+--------filter(($_random_11_$ <= 10) and ($_random_11_$ >= 1))
+----------PhysicalProject[cast(id as BIGINT) AS `cast(id as BIGINT)`,
random(1, 100) AS `$_random_11_$`, random(1, 100) AS `$_random_9_$`, t1.id,
t1.msg]
------------PhysicalOlapScan[t1]
-------PhysicalProject[cast(id as BIGINT) AS `cast(id as BIGINT)`, t2.id,
t2.msg]
---------filter(((cast(id as BIGINT) * $_random_11_$) <= 200) and ((cast(id as
BIGINT) * $_random_11_$) >= 100))
-----------PhysicalProject[random(1, 100) AS `$_random_11_$`, t2.id, t2.msg]
+------PhysicalProject[cast(id as BIGINT), t2.id, t2.msg]
+--------filter(((cast(id as BIGINT) * $_random_10_$) <= 200) and ((cast(id as
BIGINT) * $_random_10_$) >= 100))
+----------PhysicalProject[cast(id as BIGINT) AS `cast(id as BIGINT)`,
random(1, 100) AS `$_random_10_$`, t2.id, t2.msg]
------------PhysicalOlapScan[t2]
-
diff --git
a/regression-test/data/nereids_rules_p0/unique_function/infer_predicates_set_op_with_unique_function.out
b/regression-test/data/nereids_rules_p0/unique_function/infer_predicates_set_op_with_unique_function.out
new file mode 100644
index 00000000000..2b20e4a761a
--- /dev/null
+++
b/regression-test/data/nereids_rules_p0/unique_function/infer_predicates_set_op_with_unique_function.out
@@ -0,0 +1,47 @@
+-- This file is automatically generated. You should know what you did if you
want to edit this
+-- !except_with_rand --
+PhysicalResultSink
+--PhysicalExcept
+----PhysicalProject[t1.id]
+------filter(((cast(id as DOUBLE) + random()) > 5.0))
+--------PhysicalOlapScan[t1]
+----PhysicalProject[t2.id]
+------PhysicalOlapScan[t2]
+
+-- !intersect_with_rand --
+PhysicalResultSink
+--PhysicalIntersect
+----PhysicalProject[t1.id]
+------filter(((cast(id as DOUBLE) + random()) > 5.0))
+--------PhysicalOlapScan[t1]
+----PhysicalProject[t2.id]
+------PhysicalOlapScan[t2]
+
+-- !except_with_uuid --
+PhysicalResultSink
+--PhysicalExcept
+----PhysicalProject[t1.id]
+------filter(((uuid_to_int(uuid()) + cast(id as LARGEINT)) > 5))
+--------PhysicalOlapScan[t1]
+----PhysicalProject[t2.id]
+------PhysicalOlapScan[t2]
+
+-- !intersect_with_uuid --
+PhysicalResultSink
+--PhysicalIntersect
+----PhysicalProject[t1.id]
+------filter(((uuid_to_int(uuid()) + cast(id as LARGEINT)) > 5))
+--------PhysicalOlapScan[t1]
+----PhysicalProject[t2.id]
+------PhysicalOlapScan[t2]
+
+-- !except_deterministic --
+PhysicalResultSink
+--PhysicalExcept
+----PhysicalProject[t1.id]
+------filter((t1.id > 4))
+--------PhysicalOlapScan[t1]
+----PhysicalProject[t2.id]
+------filter((t2.id > 4))
+--------PhysicalOlapScan[t2]
+
diff --git
a/regression-test/data/nereids_rules_p0/unique_function/project_other_join_condition_for_nlj_with_unique_function.out
b/regression-test/data/nereids_rules_p0/unique_function/project_other_join_condition_for_nlj_with_unique_function.out
new file mode 100644
index 00000000000..66c185735b7
--- /dev/null
+++
b/regression-test/data/nereids_rules_p0/unique_function/project_other_join_condition_for_nlj_with_unique_function.out
@@ -0,0 +1,36 @@
+-- This file is automatically generated. You should know what you did if you
want to edit this
+-- !left_join_rand_one_side --
+PhysicalResultSink
+--PhysicalProject[t1.id, t2.id]
+----NestedLoopJoin[LEFT_OUTER_JOIN]((cast(id as DOUBLE) + random()) < cast(id
as DOUBLE))
+------PhysicalProject[cast(id as DOUBLE) AS `cast(id as DOUBLE)`, t1.id]
+--------PhysicalOlapScan[t1]
+------PhysicalProject[cast(id as DOUBLE) AS `cast(id as DOUBLE)`, t2.id]
+--------PhysicalOlapScan[t2]
+
+-- !cross_rand_one_side --
+PhysicalResultSink
+--PhysicalProject[t1.id, t2.id]
+----NestedLoopJoin[INNER_JOIN]((cast(id as DOUBLE) + random()) > cast(id as
DOUBLE))
+------PhysicalProject[cast(id as DOUBLE) AS `cast(id as DOUBLE)`, t1.id]
+--------PhysicalOlapScan[t1]
+------PhysicalProject[cast(id as DOUBLE) AS `cast(id as DOUBLE)`, t2.id]
+--------PhysicalOlapScan[t2]
+
+-- !cross_rand_both_sides --
+PhysicalResultSink
+--PhysicalProject[t1.id, t2.id]
+----NestedLoopJoin[INNER_JOIN]((cast(id as DOUBLE) + random()) > (cast(id as
DOUBLE) + random()))
+------PhysicalProject[cast(id as DOUBLE) AS `cast(id as DOUBLE)`, t1.id]
+--------PhysicalOlapScan[t1]
+------PhysicalProject[cast(id as DOUBLE) AS `cast(id as DOUBLE)`, t2.id]
+--------PhysicalOlapScan[t2]
+
+-- !hash_and_other_with_volatile_equal --
+PhysicalResultSink
+--hashJoin[INNER_JOIN] hashCondition=((t1.id = t2.id))
otherCondition=((cast(id as BIGINT) = (cast(id as BIGINT) + random(1, 10))))
+----PhysicalProject[t1.id]
+------PhysicalOlapScan[t1]
+----PhysicalProject[t2.id]
+------PhysicalOlapScan[t2]
+
diff --git
a/regression-test/data/nereids_rules_p0/unique_function/push_down_filter_through_join_with_unique_function.out
b/regression-test/data/nereids_rules_p0/unique_function/push_down_filter_through_join_with_unique_function.out
index efd13676d65..c3902875ecb 100644
---
a/regression-test/data/nereids_rules_p0/unique_function/push_down_filter_through_join_with_unique_function.out
+++
b/regression-test/data/nereids_rules_p0/unique_function/push_down_filter_through_join_with_unique_function.out
@@ -65,3 +65,12 @@ PhysicalResultSink
--------PhysicalProject[(cast(id as BIGINT) * 5) AS `expr_(cast(id as BIGINT)
* 5)`, t3.id]
----------PhysicalOlapScan[t2(t3)]
+-- !push_down_filter_through_join_4 --
+PhysicalResultSink
+--filter(((cast(id as DOUBLE) + random()) > 0.2) and (random() > 0.1))
+----NestedLoopJoin[CROSS_JOIN]
+------PhysicalProject[t1.id]
+--------PhysicalOlapScan[t1]
+------PhysicalProject[t2.id]
+--------PhysicalOlapScan[t2]
+
diff --git
a/regression-test/data/nereids_rules_p0/unique_function/push_down_filter_through_partition_topn_with_unique_function.out
b/regression-test/data/nereids_rules_p0/unique_function/push_down_filter_through_partition_topn_with_unique_function.out
new file mode 100644
index 00000000000..e7b584cc57d
--- /dev/null
+++
b/regression-test/data/nereids_rules_p0/unique_function/push_down_filter_through_partition_topn_with_unique_function.out
@@ -0,0 +1,17 @@
+-- This file is automatically generated. You should know what you did if you
want to edit this
+-- !filter_through_partition_topn_unique_1 --
+PhysicalResultSink
+--filter((random() > 0.5) and (v.rn <= 3))
+----PhysicalWindow
+------PhysicalQuickSort[LOCAL_SORT]
+--------PhysicalPartitionTopN
+----------PhysicalOlapScan[t1]
+
+-- !filter_through_partition_topn_unique_2 --
+PhysicalResultSink
+--filter(((cast(id as BIGINT) + random(1, 100)) > 5) and (v.rn <= 3))
+----PhysicalWindow
+------PhysicalQuickSort[LOCAL_SORT]
+--------PhysicalPartitionTopN
+----------PhysicalOlapScan[t1]
+
diff --git
a/regression-test/data/nereids_rules_p0/unique_function/push_down_filter_through_set_operation_with_unique_function.out
b/regression-test/data/nereids_rules_p0/unique_function/push_down_filter_through_set_operation_with_unique_function.out
new file mode 100644
index 00000000000..4ad588ad439
--- /dev/null
+++
b/regression-test/data/nereids_rules_p0/unique_function/push_down_filter_through_set_operation_with_unique_function.out
@@ -0,0 +1,74 @@
+-- This file is automatically generated. You should know what you did if you
want to edit this
+-- !union_distinct_keep_rand --
+PhysicalResultSink
+--filter((random() > 0.1))
+----hashAgg[GLOBAL]
+------hashAgg[LOCAL]
+--------PhysicalUnion
+----------PhysicalProject[t1.id]
+------------PhysicalOlapScan[t1]
+----------PhysicalProject[t2.id]
+------------PhysicalOlapScan[t2]
+
+-- !intersect_keep_rand --
+PhysicalResultSink
+--filter((random() > 0.1))
+----PhysicalIntersect
+------PhysicalProject[t1.id]
+--------PhysicalOlapScan[t1]
+------PhysicalProject[t2.id]
+--------PhysicalOlapScan[t2]
+
+-- !except_keep_rand --
+PhysicalResultSink
+--filter((random() > 0.1))
+----PhysicalExcept
+------PhysicalProject[t1.id]
+--------PhysicalOlapScan[t1]
+------PhysicalProject[t2.id]
+--------PhysicalOlapScan[t2]
+
+-- !union_all_push_rand --
+PhysicalResultSink
+--PhysicalUnion
+----PhysicalProject[t1.id]
+------filter((random() > 0.1))
+--------PhysicalOlapScan[t1]
+----PhysicalProject[t2.id]
+------filter((random() > 0.1))
+--------PhysicalOlapScan[t2]
+
+-- !union_distinct_split --
+PhysicalResultSink
+--filter((random() > 0.1))
+----PhysicalLimit[GLOBAL]
+------PhysicalUnion
+--------PhysicalProject[t1.id]
+----------filter((t1.id = 1))
+------------PhysicalOlapScan[t1]
+--------PhysicalProject[t2.id]
+----------filter((t2.id = 1))
+------------PhysicalOlapScan[t2]
+
+-- !intersect_split --
+PhysicalResultSink
+--filter((random() > 0.1))
+----PhysicalIntersect
+------PhysicalProject[t1.id]
+--------filter((t1.id = 1))
+----------PhysicalOlapScan[t1]
+------PhysicalProject[t2.id]
+--------filter((t2.id = 1))
+----------PhysicalOlapScan[t2]
+
+-- !except_split --
+PhysicalResultSink
+--filter((random() > 0.1))
+----PhysicalExcept
+------PhysicalProject[t1.id]
+--------filter((t1.id = 1))
+----------PhysicalOlapScan[t1]
+------PhysicalProject[t2.id]
+--------filter((t2.id = 1))
+----------PhysicalOlapScan[t2]
+
diff --git
a/regression-test/data/nereids_rules_p0/unique_function/push_down_filter_through_window_with_unique_function.out
b/regression-test/data/nereids_rules_p0/unique_function/push_down_filter_through_window_with_unique_function.out
new file mode 100644
index 00000000000..a7083b712d7
--- /dev/null
+++
b/regression-test/data/nereids_rules_p0/unique_function/push_down_filter_through_window_with_unique_function.out
@@ -0,0 +1,23 @@
+-- This file is automatically generated. You should know what you did if you
want to edit this
+-- !filter_through_window_unique_1 --
+PhysicalResultSink
+--filter((random() > 0.5))
+----PhysicalWindow
+------PhysicalQuickSort[LOCAL_SORT]
+--------PhysicalOlapScan[t1]
+
+-- !filter_through_window_unique_2 --
+PhysicalResultSink
+--filter((random() > 0.5))
+----PhysicalWindow
+------PhysicalQuickSort[LOCAL_SORT]
+--------filter((v.id > 5))
+----------PhysicalOlapScan[t1]
+
+-- !filter_through_window_unique_3 --
+PhysicalResultSink
+--filter(((cast(id as BIGINT) + random(1, 100)) > 5))
+----PhysicalWindow
+------PhysicalQuickSort[LOCAL_SORT]
+--------PhysicalOlapScan[t1]
+
diff --git
a/regression-test/data/nereids_rules_p0/unique_function/push_down_join_other_condition_with_unique_function.out
b/regression-test/data/nereids_rules_p0/unique_function/push_down_join_other_condition_with_unique_function.out
new file mode 100644
index 00000000000..79b10152c13
--- /dev/null
+++
b/regression-test/data/nereids_rules_p0/unique_function/push_down_join_other_condition_with_unique_function.out
@@ -0,0 +1,53 @@
+-- This file is automatically generated. You should know what you did if you
want to edit this
+-- !join_other_unique_1 --
+PhysicalResultSink
+--hashJoin[INNER_JOIN] hashCondition=((t1.id = t2.id))
otherCondition=((random() > 0.5))
+----PhysicalOlapScan[t1]
+----PhysicalOlapScan[t2]
+
+-- !join_other_unique_2 --
+PhysicalResultSink
+--hashJoin[INNER_JOIN] hashCondition=((t1.id = t2.id))
otherCondition=((random() > 0.5))
+----filter((t1.id > 3))
+------PhysicalOlapScan[t1]
+----filter((t2.id > 3))
+------PhysicalOlapScan[t2]
+
+-- !join_other_unique_3 --
+PhysicalResultSink
+--hashJoin[LEFT_OUTER_JOIN] hashCondition=((t1.id = t2.id)) otherCondition=((
not uuid() IS NULL))
+----PhysicalOlapScan[t1]
+----PhysicalOlapScan[t2]
+
+-- !join_other_unique_4 --
+PhysicalResultSink
+--hashJoin[INNER_JOIN] hashCondition=((t1.id = t2.id))
otherCondition=(((cast(id as DOUBLE) + random()) > 0.5))
+----PhysicalOlapScan[t1]
+----PhysicalOlapScan[t2]
+
+-- !join_other_unique_5 --
+PhysicalResultSink
+--hashJoin[INNER_JOIN] hashCondition=((t1.id = t2.id)) otherCondition=()
+----PhysicalProject[t1.id, t1.msg]
+------filter(((cast(id as DOUBLE) + $_random_5_$) <= 0.5) and ((cast(id as
DOUBLE) + $_random_5_$) >= 0.1))
+--------PhysicalProject[random() AS `$_random_5_$`, t1.id, t1.msg]
+----------PhysicalOlapScan[t1]
+----PhysicalOlapScan[t2]
+
+-- !join_other_unique_6 --
+PhysicalResultSink
+--hashJoin[INNER_JOIN] hashCondition=((t1.id = t2.id)) otherCondition=()
+----PhysicalOlapScan[t1]
+----PhysicalProject[t2.id, t2.msg]
+------filter(((cast(id as DOUBLE) + $_random_5_$) <= 0.5) and ((cast(id as
DOUBLE) + $_random_5_$) >= 0.1))
+--------PhysicalProject[random() AS `$_random_5_$`, t2.id, t2.msg]
+----------PhysicalOlapScan[t2]
+
+-- !join_other_unique_7 --
+PhysicalResultSink
+--hashJoin[INNER_JOIN] hashCondition=((t1.id = t2.id)) otherCondition=()
+----PhysicalProject[t1.id, t1.msg]
+------filter(($_random_5_$ <= 0.5) and ($_random_5_$ >= 0.1))
+--------PhysicalProject[random() AS `$_random_5_$`, t1.id, t1.msg]
+----------PhysicalOlapScan[t1]
+----PhysicalOlapScan[t2]
diff --git
a/regression-test/suites/nereids_rules_p0/unique_function/add_project_for_unique_function.groovy
b/regression-test/suites/nereids_rules_p0/unique_function/add_project_for_unique_function.groovy
index 78544a29608..24351dd4bdb 100644
---
a/regression-test/suites/nereids_rules_p0/unique_function/add_project_for_unique_function.groovy
+++
b/regression-test/suites/nereids_rules_p0/unique_function/add_project_for_unique_function.groovy
@@ -19,6 +19,7 @@ suite('add_project_for_unique_function') {
sql "set parallel_pipeline_task_num=2"
sql 'SET enable_nereids_planner=true'
sql 'SET runtime_filter_mode=OFF'
+ sql 'SET disable_join_reorder=true'
sql 'SET enable_fallback_to_original_planner=false'
sql "SET ignore_shape_nodes='PhysicalDistribute'"
sql "SET
detail_shape_nodes='PhysicalProject,PhysicalOneRowRelation,PhysicalUnion,PhysicalQuickSort,PhysicalHashAggregate'"
diff --git
a/regression-test/suites/nereids_rules_p0/unique_function/infer_predicates_set_op_with_unique_function.groovy
b/regression-test/suites/nereids_rules_p0/unique_function/infer_predicates_set_op_with_unique_function.groovy
new file mode 100644
index 00000000000..9fdda2d0a82
--- /dev/null
+++
b/regression-test/suites/nereids_rules_p0/unique_function/infer_predicates_set_op_with_unique_function.groovy
@@ -0,0 +1,66 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+suite('infer_predicates_set_op_with_unique_function') {
+ sql 'SET enable_nereids_planner=true'
+ sql 'SET runtime_filter_mode=OFF'
+ sql 'SET enable_fallback_to_original_planner=false'
+ sql "SET ignore_shape_nodes='PhysicalDistribute'"
+ sql "SET detail_shape_nodes='PhysicalProject'"
+ sql 'SET disable_nereids_rules=PRUNE_EMPTY_PARTITION'
+
+ // InferPredicates must NOT clone a predicate containing rand()/uuid()
+ // from the left branch of EXCEPT/INTERSECT onto the right branch through
+ // slot substitution, because it would re-evaluate the unique function on
+ // a different set of rows and change semantics.
+
+ qt_except_with_rand '''
+ explain shape plan
+ (select id from t1 where t1.id + random() > 5)
+ except
+ (select id from t2)
+ '''
+
+ qt_intersect_with_rand '''
+ explain shape plan
+ (select id from t1 where t1.id + random() > 5)
+ intersect
+ (select id from t2)
+ '''
+
+ qt_except_with_uuid '''
+ explain shape plan
+ (select id from t1 where uuid_to_int(uuid()) + t1.id > 5)
+ except
+ (select id from t2)
+ '''
+
+ qt_intersect_with_uuid '''
+ explain shape plan
+ (select id from t1 where uuid_to_int(uuid()) + t1.id > 5)
+ intersect
+ (select id from t2)
+ '''
+
+ // Deterministic predicates must still be inferred across set-op branches.
+ qt_except_deterministic '''
+ explain shape plan
+ (select id from t1 where t1.id + 1 > 5)
+ except
+ (select id from t2)
+ '''
+}
diff --git
a/regression-test/suites/nereids_rules_p0/unique_function/project_other_join_condition_for_nlj_with_unique_function.groovy
b/regression-test/suites/nereids_rules_p0/unique_function/project_other_join_condition_for_nlj_with_unique_function.groovy
new file mode 100644
index 00000000000..12d38936d56
--- /dev/null
+++
b/regression-test/suites/nereids_rules_p0/unique_function/project_other_join_condition_for_nlj_with_unique_function.groovy
@@ -0,0 +1,58 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+suite('project_other_join_condition_for_nlj_with_unique_function') {
+ sql 'SET enable_nereids_planner=true'
+ sql 'SET runtime_filter_mode=OFF'
+ sql 'SET disable_join_reorder=true'
+ sql 'SET enable_fallback_to_original_planner=false'
+ sql "SET ignore_shape_nodes='PhysicalDistribute'"
+ sql "SET detail_shape_nodes='PhysicalProject'"
+ sql 'SET disable_nereids_rules=PRUNE_EMPTY_PARTITION'
+
+ // LEFT JOIN with a mixed-slots + rand() non-equal other-condition —
+ // the `t1.id + rand()` expression must NOT be pre-aliased into a Project
+ // below NLJ, otherwise rand() is evaluated per-left-row instead of
per-pair.
+ qt_left_join_rand_one_side '''
+ explain shape plan
+ select t1.id, t2.id
+ from t1 left join t2 on t1.id + rand() < t2.id
+ '''
+
+ // INNER CROSS-style NLJ with rand() on one side expression.
+ qt_cross_rand_one_side '''
+ explain shape plan
+ select t1.id, t2.id
+ from t1 join t2 on t1.id + rand() > t2.id
+ '''
+
+ // Both sides contain rand() — both sub-expressions must stay inline in the
+ // other condition, neither should be pre-aliased into a child Project.
+ qt_cross_rand_both_sides '''
+ explain shape plan
+ select t1.id, t2.id
+ from t1 join t2 on t1.id + rand() > t2.id + rand()
+ '''
+
+ qt_hash_and_other_with_volatile_equal '''
+ explain shape plan
+ select t1.id, t2.id
+ from t1 join t2
+ on t1.id = t2.id
+ and t1.id = t2.id + random(1, 10)
+ '''
+}
diff --git
a/regression-test/suites/nereids_rules_p0/unique_function/push_down_filter_through_join_with_unique_function.groovy
b/regression-test/suites/nereids_rules_p0/unique_function/push_down_filter_through_join_with_unique_function.groovy
index 1afbef840d2..9a9da6fcb7d 100644
---
a/regression-test/suites/nereids_rules_p0/unique_function/push_down_filter_through_join_with_unique_function.groovy
+++
b/regression-test/suites/nereids_rules_p0/unique_function/push_down_filter_through_join_with_unique_function.groovy
@@ -81,4 +81,14 @@ suite('push_down_filter_through_join_with_unique_function') {
from t1, t2, t2 as t3
where random() > 10 and t1.id * 2 = t3.id * 5
'''
+
+ // Volatile predicates stay above the join even when they reference only
one
+ // side. Pushing them into a child would evaluate the volatile expression
once
+ // per input row instead of once per joined row.
+ qt_push_down_filter_through_join_4 '''
+ explain shape plan
+ select t1.id, t2.id
+ from t1 join t2
+ where rand() > 0.1 and t2.id + rand() > 0.2
+ '''
}
diff --git
a/regression-test/suites/nereids_rules_p0/unique_function/push_down_filter_through_partition_topn_with_unique_function.groovy
b/regression-test/suites/nereids_rules_p0/unique_function/push_down_filter_through_partition_topn_with_unique_function.groovy
new file mode 100644
index 00000000000..58c6e82759f
--- /dev/null
+++
b/regression-test/suites/nereids_rules_p0/unique_function/push_down_filter_through_partition_topn_with_unique_function.groovy
@@ -0,0 +1,47 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+suite('push_down_filter_through_partition_topn_with_unique_function') {
+ sql 'SET enable_nereids_planner=true'
+ sql 'SET runtime_filter_mode=OFF'
+ sql 'SET enable_fallback_to_original_planner=false'
+ sql "SET ignore_shape_nodes='PhysicalDistribute'"
+ sql "SET detail_shape_nodes='PhysicalProject'"
+ sql 'SET disable_nereids_rules=PRUNE_EMPTY_PARTITION'
+ // partition topn is generated by a window-then-filter (rn <= K) pattern;
enable the rule.
+ sql 'SET enable_partition_topn=true'
+
+ // rand() > 0.5 has empty input slots; old code would push it below
partition top-N
+ // because `partitionKeySlots.containsAll(emptySet) = true`. That replaces
+ // "topN then random filter" with "random filter then topN", changing topN
membership.
+ qt_filter_through_partition_topn_unique_1 '''
+ explain shape plan
+ select * from (
+ select id, msg, row_number() over (partition by id order by msg)
rn from t1
+ ) v
+ where rn <= 3 and rand() > 0.5
+ '''
+
+ // Predicate uses the partition key together with a unique function;
legacy check matched.
+ qt_filter_through_partition_topn_unique_2 '''
+ explain shape plan
+ select * from (
+ select id, msg, row_number() over (partition by id order by msg)
rn from t1
+ ) v
+ where rn <= 3 and id + rand(1, 100) > 5
+ '''
+}
diff --git
a/regression-test/suites/nereids_rules_p0/unique_function/push_down_filter_through_set_operation_with_unique_function.groovy
b/regression-test/suites/nereids_rules_p0/unique_function/push_down_filter_through_set_operation_with_unique_function.groovy
new file mode 100644
index 00000000000..c58c87e7e63
--- /dev/null
+++
b/regression-test/suites/nereids_rules_p0/unique_function/push_down_filter_through_set_operation_with_unique_function.groovy
@@ -0,0 +1,75 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+suite('push_down_filter_through_set_operation_with_unique_function') {
+ sql 'SET enable_nereids_planner=true'
+ sql 'SET runtime_filter_mode=OFF'
+ sql 'SET enable_fallback_to_original_planner=false'
+ sql "SET ignore_shape_nodes='PhysicalDistribute'"
+ sql "SET detail_shape_nodes='PhysicalProject'"
+ sql 'SET disable_nereids_rules=PRUNE_EMPTY_PARTITION'
+
+ // UNION DISTINCT: rand() filter must NOT be pushed into branches,
+ // otherwise membership / dedup semantics change.
+ qt_union_distinct_keep_rand '''
+ explain shape plan
+ select id from ((select id from t1) union distinct (select id from
t2)) u
+ where rand() > 0.1
+ '''
+
+ // INTERSECT: rand() filter must NOT be pushed — intersect depends on
+ // full branch membership before dedup.
+ qt_intersect_keep_rand '''
+ explain shape plan
+ select id from ((select id from t1) intersect (select id from t2)) u
+ where rand() > 0.1
+ '''
+
+ // EXCEPT: rand() filter must NOT be pushed.
+ qt_except_keep_rand '''
+ explain shape plan
+ select id from ((select id from t1) except (select id from t2)) u
+ where rand() > 0.1
+ '''
+
+ // UNION ALL: per-output-row == per-branch-input-row (1:1), push is safe.
+ // Expect rand() filter to be pushed into each branch as before.
+ qt_union_all_push_rand '''
+ explain shape plan
+ select id from ((select id from t1) union all (select id from t2)) u
+ where rand() > 0.1
+ '''
+
+ // Mixed: non-unique predicate goes down, unique-fn stays up.
+ qt_union_distinct_split '''
+ explain shape plan
+ select id from ((select id from t1) union distinct (select id from
t2)) u
+ where id = 1 and rand() > 0.1
+ '''
+
+ qt_intersect_split '''
+ explain shape plan
+ select id from ((select id from t1) intersect (select id from t2)) u
+ where id = 1 and rand() > 0.1
+ '''
+
+ qt_except_split '''
+ explain shape plan
+ select id from ((select id from t1) except (select id from t2)) u
+ where id = 1 and rand() > 0.1
+ '''
+}
diff --git
a/regression-test/suites/nereids_rules_p0/unique_function/push_down_filter_through_window_with_unique_function.groovy
b/regression-test/suites/nereids_rules_p0/unique_function/push_down_filter_through_window_with_unique_function.groovy
new file mode 100644
index 00000000000..8bd87287d05
--- /dev/null
+++
b/regression-test/suites/nereids_rules_p0/unique_function/push_down_filter_through_window_with_unique_function.groovy
@@ -0,0 +1,56 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+suite('push_down_filter_through_window_with_unique_function') {
+ sql 'SET enable_nereids_planner=true'
+ sql 'SET runtime_filter_mode=OFF'
+ sql 'SET enable_fallback_to_original_planner=false'
+ sql "SET ignore_shape_nodes='PhysicalDistribute'"
+ sql "SET detail_shape_nodes='PhysicalProject'"
+ sql 'SET disable_nereids_rules=PRUNE_EMPTY_PARTITION'
+
+ // A `rand() > 0.5` predicate has empty input slots, so the legacy
+ // `commonPartitionKeys.containsAll(emptySet)` check would push it below
the window node.
+ // That changes which rows belong to each partition and breaks every
window function.
+ qt_filter_through_window_unique_1 '''
+ explain shape plan
+ select id, msg, rn from (
+ select id, msg, row_number() over (partition by id order by msg)
rn from t1
+ ) v
+ where rand() > 0.5
+ '''
+
+ // Mixed: deterministic predicate on the partition key remains pushable,
+ // unique-fn conjunct must stay above the window.
+ qt_filter_through_window_unique_2 '''
+ explain shape plan
+ select * from (
+ select id, msg, row_number() over (partition by id order by msg)
rn from t1
+ ) v
+ where id > 5 and rand() > 0.5
+ '''
+
+ // Conjunct that references the partition key together with a unique
function
+ // (input slots are subset of partition keys so the legacy check matched).
+ qt_filter_through_window_unique_3 '''
+ explain shape plan
+ select * from (
+ select id, msg, row_number() over (partition by id order by msg)
rn from t1
+ ) v
+ where id + rand(1, 100) > 5
+ '''
+}
diff --git
a/regression-test/suites/nereids_rules_p0/unique_function/push_down_join_other_condition_with_unique_function.groovy
b/regression-test/suites/nereids_rules_p0/unique_function/push_down_join_other_condition_with_unique_function.groovy
new file mode 100644
index 00000000000..22697a3c5cd
--- /dev/null
+++
b/regression-test/suites/nereids_rules_p0/unique_function/push_down_join_other_condition_with_unique_function.groovy
@@ -0,0 +1,73 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+suite('push_down_join_other_condition_with_unique_function') {
+ sql 'SET enable_nereids_planner=true'
+ sql 'SET runtime_filter_mode=OFF'
+ sql 'SET disable_join_reorder=true'
+ sql 'SET enable_fallback_to_original_planner=false'
+ sql "SET ignore_shape_nodes='PhysicalDistribute'"
+ sql "SET detail_shape_nodes='PhysicalProject'"
+ sql 'SET disable_nereids_rules=PRUNE_EMPTY_PARTITION'
+
+ // A slot-free volatile ON predicate must stay in the join; otherwise it
+ // would be arbitrarily pushed into one side.
+ qt_join_other_unique_1 '''
+ explain shape plan
+ select * from t1 join t2 on t1.id = t2.id and rand() > 0.5
+ '''
+
+ // Mixed: a deterministic ON-condition referencing only the left side is
pushed;
+ // the empty-slot unique conjunct must stay in the join.
+ qt_join_other_unique_2 '''
+ explain shape plan
+ select * from t1 join t2 on t1.id = t2.id and t1.id > 3 and rand() >
0.5
+ '''
+
+ // uuid() with empty slots in ON condition should not be pushed either.
+ qt_join_other_unique_3 '''
+ explain shape plan
+ select * from t1 left join t2 on t1.id = t2.id and uuid() is not null
+ '''
+
+ // Side-specific volatile ON predicates stay in the join because pushing
them
+ // into a child changes their evaluation granularity.
+ qt_join_other_unique_4 '''
+ explain shape plan
+ select * from t1 join t2 on t1.id = t2.id and t1.id + rand() > 0.5
+ '''
+
+ // Repeated volatile expressions from BETWEEN expansion are materialized
once
+ // on the left side when they reference only left slots.
+ qt_join_other_unique_5 '''
+ explain shape plan
+ select * from t1 join t2 on t1.id = t2.id and t1.id + rand() between
0.1 and 0.5
+ '''
+
+ // Repeated volatile expressions that reference only right slots are
materialized
+ // on the right side.
+ qt_join_other_unique_6 '''
+ explain shape plan
+ select * from t1 join t2 on t1.id = t2.id and t2.id + rand() between
0.1 and 0.5
+ '''
+
+ // Slot-free repeated volatile expressions default to the left side.
+ qt_join_other_unique_7 '''
+ explain shape plan
+ select * from t1 join t2 on t1.id = t2.id and rand() between 0.1 and
0.5
+ '''
+}
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]