LiaCastaneda commented on issue #18595: URL: https://github.com/apache/datafusion/issues/18595#issuecomment-3528334586
Yeah, this looks like a tricky problem 😿. TL;DR: the issue is that the Aggregate node here has `FinalPartitioned` mode, which requires a `HashPartitioned` input distribution, which in turn always forces a `RepartitionExec`, right? So there are two options:

- If you do it in the optimizer rule, you will have already committed to using `FinalPartitioned` on `AggregateExec`, and unless you traverse the plan tree you won't be able to change it. You can still skip repartitioning, but it might break correctness assumptions elsewhere, as Nga says. You will need to check that it's backwards compatible.
- Doing it in the physical planner means you can decide the mode upfront, before committing to the plan, and since the physical plan is created bottom-up, child stats are available at decision time. The downside, I guess, is that you will need to fetch the stats in the planner, which adds complexity, and I'm not sure stats are used during physical planning as of now.
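To make the second option more concrete, here is a rough sketch of what "decide the mode upfront from child stats" could look like. It is a toy model, not DataFusion code: `ToyAggregateMode`, `ToyChildInfo`, `choose_mode`, and the `SMALL_INPUT_ROWS` threshold are all hypothetical names made up for illustration, and the "small input skips the hash repartition" heuristic is only an assumption about what the decision might be based on.

```rust
// Toy model only: none of these types are DataFusion's. They mirror the idea
// of picking an aggregate mode from child properties at planning time.

/// Simplified stand-in for the aggregate modes discussed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToyAggregateMode {
    /// One-stage aggregation; no hash repartition is required.
    Single,
    /// Final stage over hash-partitioned input; implies a repartition.
    FinalPartitioned,
}

/// Hypothetical summary of what the planner knows about the child plan,
/// which already exists because the plan is built bottom-up.
#[derive(Debug, Clone, Copy)]
struct ToyChildInfo {
    /// Number of output partitions of the child.
    partitions: usize,
    /// Estimated row count, if statistics are available.
    estimated_rows: Option<usize>,
}

/// Assumed cutoff below which repartitioning is not considered worthwhile.
const SMALL_INPUT_ROWS: usize = 10_000;

/// Decide the mode before the aggregate node is created.
fn choose_mode(child: ToyChildInfo) -> ToyAggregateMode {
    // A single-partition child never needs a hash repartition.
    if child.partitions <= 1 {
        return ToyAggregateMode::Single;
    }
    match child.estimated_rows {
        // Known-small input: avoid the repartition.
        Some(rows) if rows <= SMALL_INPUT_ROWS => ToyAggregateMode::Single,
        // Large or unknown input: fall back to the partitioned plan.
        _ => ToyAggregateMode::FinalPartitioned,
    }
}
```

Missing stats fall back to the partitioned mode in this sketch, so the plan would stay as it is today whenever the planner cannot estimate the input size.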
