BiteTheDDDDt opened a new pull request, #63529:
URL: https://github.com/apache/doris/pull/63529

   ### What problem does this PR solve?
   
   Issue Number: None
   
   Related PR: None
   
   Problem Summary: When `experimental_use_serial_exchange` is enabled and 
`enable_local_exchange_before_agg` is disabled, a serial exchange source can be 
followed by a passthrough local exchange before a non-finalizing merge 
aggregation. For DISTINCT aggregation, this breaks the hash distribution 
that the merge aggregation relies on to deduplicate distinct keys. Duplicate 
keys can then reach different local tasks, each of which keeps its own copy, 
and the partial sums computed afterwards count those keys more than once, 
producing incorrect results.
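
The failure mode can be illustrated with a toy model. This is a sketch and not Doris code: `LocalExchange` and `count_distinct` are hypothetical names, and passthrough is modeled as round-robin by arrival order, which is an assumption made only to keep the example small. Each local task deduplicates the keys it receives, and the per-task counts are then summed the way a later partial aggregation would sum them.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <vector>

// Toy model only (hypothetical names, not Doris classes).
enum class LocalExchange { kPassthrough, kHashShuffle };

// Routes each key to one of `num_tasks` local tasks, deduplicates per task,
// then sums the per-task distinct counts.
std::size_t count_distinct(const std::vector<int>& keys, std::size_t num_tasks,
                           LocalExchange mode) {
    assert(num_tasks > 0);
    std::vector<std::unordered_set<int>> per_task(num_tasks);
    for (std::size_t i = 0; i < keys.size(); ++i) {
        // Passthrough ignores the key (assumed round-robin here);
        // hash shuffle always sends equal keys to the same task.
        const std::size_t task = (mode == LocalExchange::kPassthrough)
                                         ? i % num_tasks
                                         : std::hash<int>{}(keys[i]) % num_tasks;
        per_task[task].insert(keys[i]);
    }
    std::size_t total = 0;
    for (const auto& seen : per_task) {
        total += seen.size();  // each task reports only what it saw locally
    }
    return total;
}
```

With keys `{1, 1, 2, 2}` and two tasks, the passthrough variant lets both tasks see both keys and reports 4, while the hash variant reports 2, because equal keys always land on the same task.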
   
   This PR preserves hash local shuffle for merge aggregation with a serial 
child when the aggregation has partition expressions. It also computes the 
merge flag during `AggSinkOperatorX::init()` because local exchange planning 
runs before `prepare()`.
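
A minimal sketch of why the flag has to be set in `init()`, under stated assumptions: `ToyAggSink`, `plan_says_merge`, and `plan_for_serial_child` are hypothetical names invented for illustration, and the rule below covers only the serial-child case described in this PR. It is not the actual `AggSinkOperatorX` code.

```cpp
// Hypothetical stand-ins; field and function names are illustrative only.
enum class ExchangeType { kPassthrough, kHashShuffle };

struct ToyAggSink {
    bool plan_says_merge = false;      // assumed input taken from the plan
    bool has_partition_exprs = false;  // aggregation has partition expressions
    bool is_merge = false;             // cached flag that planning reads

    // Runs before local exchange planning, so the flag is ready in time.
    void init() { is_merge = plan_says_merge; }

    // Runs after planning; a flag first computed here would be seen as false.
    void prepare() {}
};

// Serial-child case only: keep the hash shuffle when a merge aggregation
// has partition expressions, otherwise fall back to passthrough.
ExchangeType plan_for_serial_child(const ToyAggSink& sink) {
    if (sink.is_merge && sink.has_partition_exprs) {
        return ExchangeType::kHashShuffle;
    }
    return ExchangeType::kPassthrough;
}
```

In this model, calling `plan_for_serial_child` before `init()` still yields passthrough even for a merge aggregation, which mirrors the ordering problem described above.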
   
   ### Release note
   
   Fix occasional incorrect DISTINCT aggregate results when serial exchange is 
enabled.
   
   ### Check List (For Author)
   
   - Test: Manual test
       - `build-support/clang-format.sh`
       - `build-support/check-format.sh`
       - `./run-regression-test.sh --run -d nereids_syntax_p0 -s agg_4_phase` 
(reproduced the failure on the old BE before updating to the patched binary)
   - Behavior changed: Yes. Serial-child merge aggregation now preserves hash 
local shuffle when required for correctness.
   - Does this need documentation: No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

