foxtail463 opened a new pull request, #64892: URL: https://github.com/apache/doris/pull/64892
Problem Summary: PhysicalHashAggregate could still enumerate a parent hash key that is only a strict subset of the group-by keys when child statistics were missing or unknown. That allowed the CBO to choose a narrower shuffle distribution without evidence that the parent key had enough NDV, which can concentrate data and lead to OOM.

Solution: Require agg_shuffle_use_parent_key to pass a real stats gate before adding the parent subset distribution: child stats and parent key stats must be known, and the estimated parent-key group count must be greater than LOW_NDV_THRESHOLD. Keep the full group key distribution as the conservative fallback.
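For illustration only, the stats gate described above might look roughly like the sketch below. It is not the code in this PR. The class name, method names, the map-based stats representation, the threshold value, and the "product of NDVs capped by row count" estimate are all assumptions made for the example; only the three conditions (child stats known, parent key stats known, estimated group count greater than LOW_NDV_THRESHOLD) come from the description.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the gate; none of these names are taken from the Doris code base.
public class ParentKeyShuffleGate {
    // Assumed value for illustration. The real LOW_NDV_THRESHOLD is not stated in the PR text.
    static final double LOW_NDV_THRESHOLD = 1024;

    /**
     * Returns true only if the parent hash key may be added as an extra shuffle
     * distribution. Returning false means "fall back to the full group-by keys".
     *
     * childNdv: per-column NDV estimates of the child; null models missing child stats.
     * childRowCount: estimated child row count; NaN models an unknown row count.
     */
    static boolean mayUseParentKey(Map<String, Double> childNdv, double childRowCount,
                                   List<String> parentKeys, List<String> groupByKeys) {
        // Gate 1: child statistics must be present and known.
        if (childNdv == null || Double.isNaN(childRowCount) || childRowCount < 0) {
            return false;
        }
        // The gate only concerns a non-empty parent key that is a strict subset
        // of the group-by keys (key lists are assumed to hold distinct names).
        if (parentKeys.isEmpty() || !groupByKeys.containsAll(parentKeys)
                || parentKeys.size() >= groupByKeys.size()) {
            return false;
        }
        // Gate 2: every parent key column must have a known NDV.
        double groupCount = 1.0;
        for (String key : parentKeys) {
            Double ndv = childNdv.get(key);
            if (ndv == null || ndv.isNaN() || ndv <= 0) {
                return false;
            }
            groupCount *= ndv;
        }
        // Assumed estimate: product of NDVs, capped by the child row count.
        groupCount = Math.min(groupCount, childRowCount);
        // Gate 3: the parent key must produce enough groups to spread the data.
        return groupCount > LOW_NDV_THRESHOLD;
    }
}
```

With this shape, any missing or unknown input returns false, so the planner would only ever lose the narrower distribution rather than pick it without evidence.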
