BiteTheDDDDt opened a new pull request, #63529:
URL: https://github.com/apache/doris/pull/63529
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: When `experimental_use_serial_exchange` is enabled and
`enable_local_exchange_before_agg` is disabled, a serial exchange source can be
followed by a passthrough local exchange before a non-finalizing merge
aggregation. For DISTINCT aggregation, the passthrough exchange breaks the
hash distribution that the merge aggregation relies on to deduplicate
distinct keys. Rows with the same distinct key can then be processed by
different local tasks, so duplicates survive the merge step and the later
partial sums can produce incorrect results.
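
The failure mode described above can be illustrated with a toy model. This is only a sketch, not Doris code: the function names are hypothetical, passthrough is modeled as round-robin distribution (an assumption), and each "task" is a plain Python list. Each task deduplicates its own keys, and the per-task counts are then summed, as a later aggregation phase would do.

```python
def passthrough(rows, n_tasks):
    """Hypothetical model: spread rows round-robin, ignoring key values."""
    tasks = [[] for _ in range(n_tasks)]
    for i, row in enumerate(rows):
        tasks[i % n_tasks].append(row)
    return tasks


def hash_shuffle(rows, n_tasks):
    """Hypothetical model: route each row by the hash of its key."""
    tasks = [[] for _ in range(n_tasks)]
    for row in rows:
        tasks[hash(row) % n_tasks].append(row)
    return tasks


def count_distinct(tasks):
    """Each task deduplicates locally, then the partial counts are summed."""
    return sum(len(set(task)) for task in tasks)


rows = [1, 1, 2, 2, 3, 3]  # three distinct keys, each appearing twice

# Every key lands in both tasks, so each key is counted twice.
passthrough_result = count_distinct(passthrough(rows, 2))

# Every key lands in exactly one task, so the sum of partial counts is exact.
hash_result = count_distinct(hash_shuffle(rows, 2))
```

With this input the passthrough path counts each key once per task, while the hash path matches `len(set(rows))`. The sum of per-task distinct counts is only correct when equal keys are guaranteed to meet in the same task.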
This PR preserves hash local shuffle for merge aggregation with a serial
child when the aggregation has partition expressions. It also computes the
merge flag during `AggSinkOperatorX::init()` because local exchange planning
runs before `prepare()`.
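
The ordering issue can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual `AggSinkOperatorX` implementation: the class, its fields, and the string labels are all hypothetical. It models only the serial-child case, and shows why a flag computed after local exchange planning cannot influence the planning decision.

```python
class AggSinkSketch:
    """Toy stand-in for an aggregation sink operator (all names hypothetical)."""

    def __init__(self, partition_exprs, merges_partial_states):
        self.partition_exprs = list(partition_exprs)
        self._merges_partial_states = merges_partial_states
        self.is_merge = False  # default until a lifecycle hook computes it

    def init(self):
        # Early hook: in this model it runs before local exchange planning.
        self.is_merge = self._merges_partial_states

    def local_exchange_for_serial_child(self):
        # Keep hash shuffle only for a merge aggregation with partition keys.
        if self.is_merge and self.partition_exprs:
            return "HASH_SHUFFLE"
        return "PASSTHROUGH"


def plan(sink, compute_flag_in_init):
    """Plan the local exchange, computing the merge flag early or too late."""
    if compute_flag_in_init:
        sink.init()
    choice = sink.local_exchange_for_serial_child()
    if not compute_flag_in_init:
        # Models a flag set in a later hook: planning has already happened.
        sink.is_merge = sink._merges_partial_states
    return choice


early = plan(AggSinkSketch(["k"], True), compute_flag_in_init=True)
late = plan(AggSinkSketch(["k"], True), compute_flag_in_init=False)
no_keys = plan(AggSinkSketch([], True), compute_flag_in_init=True)
```

In this model `early` selects the hash shuffle, `late` falls back to passthrough because the flag is still at its default when planning runs, and `no_keys` stays on passthrough because there are no partition expressions to hash on.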
### Release note
Fix occasional incorrect DISTINCT aggregate results when serial exchange is
enabled.
### Check List (For Author)
- Test: Manual test
    - `build-support/clang-format.sh`
    - `build-support/check-format.sh`
    - `./run-regression-test.sh --run -d nereids_syntax_p0 -s agg_4_phase`
      (reproduced the failure on the old BE before updating to the patched binary)
- Behavior changed: Yes. Serial-child merge aggregation now preserves hash
local shuffle when required for correctness.
- Does this need documentation: No