mrhhsg opened a new pull request, #64293:
URL: https://github.com/apache/doris/pull/64293
### What problem does this PR solve?
Issue Number: None
Problem Summary:
`count(expr)` currently evaluates a row-independent expression once per input
row whenever that expression is not represented as a literal, for example
`count(json_extract('<constant json>', '$.path'))`. For a row-independent
expression, the aggregate result depends only on whether the expression is
NULL: it is `0` if the expression is NULL and equal to `count(*)` otherwise.
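
The equivalence described above can be illustrated with a small model. This is only a sketch of the semantics, not Doris code: `count_expr`, `count_rewritten`, `extract_b`, and `extract_missing` are hypothetical helpers, rows are plain Python dicts, and `None` stands in for SQL NULL.

```python
import json

# Hypothetical constant JSON document, standing in for '<constant json>'.
DOC = '{"a": {"b": 1}}'


def count_expr(rows, expr):
    """Baseline count(expr): evaluate expr for every row, count non-NULL results."""
    return sum(1 for row in rows if expr(row) is not None)


def count_rewritten(rows, constant_expr):
    """Rewritten form: evaluate the row-independent expression once,
    then return 0 if it is NULL, otherwise the row count (count(*))."""
    value = constant_expr()
    return 0 if value is None else len(rows)


def extract_b(_row=None):
    """Row-independent expression that yields a non-NULL value."""
    return json.loads(DOC).get("a", {}).get("b")


def extract_missing(_row=None):
    """Row-independent expression that yields NULL (the path does not exist)."""
    return json.loads(DOC).get("missing")


rows = [{"id": i} for i in range(5)]
non_null_pair = (count_expr(rows, extract_b), count_rewritten(rows, extract_b))
null_pair = (count_expr(rows, extract_missing), count_rewritten(rows, extract_missing))
```

In this model both forms agree for a non-NULL constant, for a NULL constant, and for an empty input, while the rewritten form parses the JSON once instead of once per row.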
This patch adds a Nereids rewrite for non-distinct
`count(<row-independent expression>)`. The rule rewrites it to an
`if(is_null(expr), 0, count(*))`
style expression when the input is foldable, deterministic, slot-free,
non-literal, and does not contain nested aggregate, subquery, table-generating,
or window expressions. This allows the planner to use row-count aggregation
paths instead of evaluating the JSON expression for every row.
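
The eligibility conditions listed above might be checked along the following lines. This is a rough sketch under stated assumptions, not the actual Nereids rule: the `Expr` class, its `kind` labels, and `is_rewritable_count_arg` are all hypothetical, and "foldable" is approximated here as "deterministic and slot-free".

```python
from dataclasses import dataclass, field


@dataclass
class Expr:
    """Hypothetical expression node; not a Nereids class."""
    kind: str  # "literal", "slot", "function", "aggregate", "subquery", "generator", "window"
    name: str = ""
    deterministic: bool = True
    children: list = field(default_factory=list)


# Node kinds that make the count() argument ineligible anywhere in its tree.
BLOCKED_KINDS = {"slot", "aggregate", "subquery", "generator", "window"}


def walk(expr):
    """Yield expr and all of its descendants."""
    yield expr
    for child in expr.children:
        yield from walk(child)


def is_rewritable_count_arg(expr):
    """True if count(expr) could become if(is_null(expr), 0, count(*)) in this model."""
    if expr.kind == "literal":
        # Literal arguments are excluded: the rule targets non-literal expressions.
        return False
    return all(
        node.kind not in BLOCKED_KINDS and node.deterministic
        for node in walk(expr)
    )


constant_json = Expr("function", "json_extract",
                     children=[Expr("literal", "'{...}'"), Expr("literal", "'$.path'")])
column_json = Expr("function", "json_extract",
                   children=[Expr("slot", "col"), Expr("literal", "'$.path'")])
random_call = Expr("function", "random", deterministic=False)
```

Here `constant_json` qualifies, while `column_json` (references a column), `random_call` (non-deterministic), and a bare literal do not.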
The patch also adds FE unit coverage and a regression test covering non-NULL
constants, NULL constants, grouped aggregation, the motivating multi-path JSON
expression, and a column-dependent negative case.
### Release note
None
### Check List (For Author)
- Test:
    - `env -u HTTP_PROXY -u HTTPS_PROXY -u ALL_PROXY -u http_proxy -u https_proxy -u all_proxy ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.CountLiteralRewriteTest`
    - `env -u HTTP_PROXY -u HTTPS_PROXY -u ALL_PROXY -u http_proxy -u https_proxy -u all_proxy /mnt/disk7/hushenggang/.codex/skills/doris-local-regression/scripts/doris-local-regression.sh --network 10.26.20.3/24 all -d nereids_rules_p0/count_constant_rewrite -s count_constant_rewrite`
    - `env -u HTTP_PROXY -u HTTPS_PROXY -u ALL_PROXY -u http_proxy -u https_proxy -u all_proxy /mnt/disk7/hushenggang/.codex/skills/doris-local-regression/scripts/doris-local-regression.sh --network 10.26.20.3/24 start && env -u HTTP_PROXY -u HTTPS_PROXY -u ALL_PROXY -u http_proxy -u https_proxy -u all_proxy /mnt/disk7/hushenggang/.codex/skills/doris-local-regression/scripts/doris-local-regression.sh --network 10.26.20.3/24 run -d nereids_rules_p0/count_constant_rewrite -s count_constant_rewrite`
    - `git diff --check`
    - `git diff --check origin/master...HEAD`
- Behavior changed: Yes. Nereids can rewrite a non-distinct `count()` over a
deterministic, row-independent, non-literal expression into a row-count based
plan guarded by a NULL check on that expression.
- Does this need documentation: No