mrhhsg commented on code in PR #64293:
URL: https://github.com/apache/doris/pull/64293#discussion_r3385676003
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/executor/Rewriter.java:
##########
@@ -501,6 +502,7 @@ public class Rewriter extends AbstractBatchJobExecutor {
bottomUp(
// The later rule CHECK_PRIVILEGES which inherent from
ColumnPruning only works
// if the aggregation node is normalized, so we need
call NormalizeAggregate here
+ new CountLiteralRewrite(),
Review Comment:
Good point. I removed the extra pre-`NormalizeAggregate` placements from
`Rewriter` and kept the minimal placement only in `Analyzer` before the initial
`NormalizeAggregate`. `Rewriter` now keeps its original post-normalize
`CountLiteralRewrite` path.
The only pre-normalize run needed for this PR is before the aggregate
children are normalized into slots; the additional `Rewriter` placements were
unnecessary for the minimal scope.
Validated with:
- `./build.sh --fe`
- `./run-fe-ut.sh --run
org.apache.doris.nereids.rules.rewrite.CountLiteralRewriteTest`
- `./run-regression-test.sh --conf
output/local-regression/regression-conf-46001.groovy --run -d
nereids_rules_p0/count_constant_rewrite -s count_constant_rewrite`
- `./run-regression-test.sh --conf
output/local-regression/regression-conf-46001.groovy --run -d statistics`
- `git diff --check`
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/CountLiteralRewrite.java:
##########
@@ -107,10 +114,29 @@ private boolean isCountLiteral(AggregateFunction aggFunc)
{
&& aggFunc.child(0).isLiteral();
}
+ private boolean isCountConstantExpression(AggregateFunction aggFunc) {
+ if (aggFunc.isDistinct()
+ || !(aggFunc instanceof Count)
+ || aggFunc.children().size() != 1) {
+ return false;
+ }
Review Comment:
Addressed by excluding `Sleep` from the row-independent count rewrite
(`!arg.containsType(Sleep.class)`). I also added FE unit and regression
coverage to ensure `count(sleep(1))` is not rewritten to the new upper-project
shape.
Validated with:
- `./build.sh --fe`
- `./run-fe-ut.sh --run
org.apache.doris.nereids.rules.rewrite.CountLiteralRewriteTest`
- `./run-regression-test.sh --conf
output/local-regression/regression-conf-46001.groovy --run -d
nereids_rules_p0/count_constant_rewrite -s count_constant_rewrite`
- `git diff --check`
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/EliminateEmptyRelation.java:
##########
@@ -73,6 +81,9 @@ public List<Rule> buildRules() {
ConnectContext.get().getStatementContext().getNextRelationId(),
agg.getOutput())
).toRule(RuleType.ELIMINATE_AGG_ON_EMPTYRELATION),
Review Comment:
This obsolete compensation path was removed. The PR no longer relies on
`EliminateEmptyRelation` to hide the count argument for empty input.
The current implementation documents the accepted trade-off: a deterministic
row-independent count argument may be evaluated once above aggregation.
Avoiding that by injecting an aggregate dependency into scalar subtrees broke
functions with required constant arguments such as `date_trunc`, so this PR
keeps the minimal rewrite and adds regression coverage for empty/runtime-empty
value semantics with valid expressions.
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/EliminateEmptyRelation.java:
##########
@@ -230,6 +241,48 @@ private boolean bothChildrenEmpty(LogicalJoin<?, ?> join) {
return join.left() instanceof EmptyRelation && join.right() instanceof
EmptyRelation;
}
+ private Plan
rewriteCountLiteralOnEmptyAggregate(LogicalProject<LogicalAggregate<LogicalEmptyRelation>>
project) {
+ LogicalAggregate<LogicalEmptyRelation> aggregate = project.child();
+ if (!aggregate.getGroupByExpressions().isEmpty()) {
+ return null;
+ }
+
+ Set<ExprId> countStarSlots = new HashSet<>();
+ for (NamedExpression output : aggregate.getOutputExpressions()) {
+ if (output.anyMatch(expr -> expr instanceof Count && ((Count)
expr).isCountStar())) {
+ countStarSlots.add(output.toSlot().getExprId());
+ }
+ }
+ if (countStarSlots.isEmpty()) {
+ return null;
+ }
+
+ boolean changed = false;
+ List<NamedExpression> newProjects = new
ArrayList<>(project.getProjects().size());
Review Comment:
Addressed by removing the `EliminateEmptyRelation` compensation entirely.
There is no longer a broad pattern that rewrites user-authored
`IF(json_extract(...) IS NULL, 0, count(*))` expressions, and nested
expressions are handled by the normal projected rewrite shape.
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/CountLiteralRewrite.java:
##########
@@ -107,10 +118,71 @@ private boolean isCountLiteral(AggregateFunction aggFunc)
{
&& aggFunc.child(0).isLiteral();
}
+ private boolean isCountConstantExpression(AggregateFunction aggFunc) {
+ if (aggFunc.isDistinct()
+ || !(aggFunc instanceof Count)
+ || aggFunc.children().size() != 1) {
+ return false;
+ }
+ Expression arg = aggFunc.child(0);
+ return !arg.isLiteral()
+ && arg.foldable()
+ && !arg.containsNondeterministic()
+ && !arg.containsVolatileExpression()
+ && arg.getInputSlots().isEmpty()
+ && !arg.containsType(Sleep.class)
+ && !arg.containsType(AggregateFunction.class,
SubqueryExpr.class,
+ TableGeneratingFunction.class, WindowExpression.class);
+ }
+
private Expression rewrite(Count count) {
if (count.child(0).isNullLiteral()) {
return new BigIntLiteral(0);
}
+ if (!count.child(0).isLiteral()) {
+ Expression countStar = new Count();
+ Expression constExpr = deferConstantEvaluation(count.child(0));
+ Expression ifConstantExprIsNull = new If(new IsNull(constExpr),
new BigIntLiteral(0), countStar);
+ return new If(new EqualTo(countStar, new BigIntLiteral(0)), new
BigIntLiteral(0), ifConstantExprIsNull);
+ }
return new Count();
}
+
+ /*
+ * count(const_expr) should not evaluate const_expr when the aggregate
input is empty.
+ * The outer count(*) = 0 guard handles that at execution time, but BE
opens constant
+ * expression trees eagerly. Add an unreachable count(*) dependency inside
each constant
+ * subtree so that it is no longer treated as a fragment-local constant
before the guard
+ * can short-circuit it.
+ */
+ private Expression deferConstantEvaluation(Expression expression) {
+ if (expression.isLiteral()) {
+ return expression;
+ }
+ List<Expression> children = expression.children();
+ List<Expression> newChildren =
Lists.newArrayListWithCapacity(children.size());
+ boolean changed = false;
+ for (Expression child : children) {
+ Expression newChild = deferConstantEvaluation(child);
+ newChildren.add(newChild);
+ changed |= newChild != child;
+ }
+
+ Expression rewritten = changed ? expression.withChildren(newChildren)
: expression;
Review Comment:
Addressed by removing the anti-folding wrapper. The current rewrite keeps
the original scalar subtree unchanged above aggregation, so required constant
arguments such as the `date_trunc` time unit remain constant. Regression
coverage for `count(date_trunc('day', date '2024-01-02'))` was added.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]