alamb commented on issue #1359: URL: https://github.com/apache/datafusion-ballista/issues/1359#issuecomment-3725511072
> Implement adaptive query planner which would address some of the issues with the current approach. In the core of the design are pluggable physical optimizer rules which would re-optimize plan after each stage.

This is my understanding of current best practice with distributed engines -- that is, being able to adaptively re-organize after the plan has started. Given that Ballista already breaks the execution into stages where the intermediate results are persisted, I think this is

> Rename ExecutionGraph to StaticExecutionGraph, introduce dyn ExecutionGraph which would be implemented by StaticExecutionGraph and new AdaptiveExecutionGraph. Change implementation in TaskManager based on predefined configuration parameter. This approach looks like easier route to take. It would enable us to have two implementations running in parallel until AQE implementation matures.

> Adaptive planner will need some time to get it right. It will work with current distributed planner for foreseeable future, will be disabled by default with configuration option to turn it on.

This also makes sense to me, as it will allow a phased rollout of the new adaptive execution rather than having to get it all right at once. Eventually you might be able to use the dynamic planner by default, detect patterns in the plan that it can't handle, and fall back to the static planner.

Maybe @mattcuento or @killzoner have some additional thoughts.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
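
For illustration only, here is a rough Rust sketch of the shape discussed above: a trait that both graph types implement, with the choice driven by a configuration value and a fallback to the static graph when the adaptive one cannot handle a plan. Apart from the names ExecutionGraph, StaticExecutionGraph and AdaptiveExecutionGraph, which come from the quoted proposal, everything here is an assumption: PlannerMode, new_execution_graph, the adaptive_can_handle_plan flag and all trait methods are hypothetical and are not Ballista's actual API.

```rust
/// Hypothetical scheduler setting; the thread does not name the real option.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlannerMode {
    Static,
    Adaptive,
}

/// Hypothetical, heavily simplified stand-in for the proposed `dyn ExecutionGraph`.
pub trait ExecutionGraph {
    /// Short label for the implementation behind the trait object.
    fn kind(&self) -> &'static str;
    /// Record a finished stage; returns true if the remaining plan was re-optimized.
    fn on_stage_completed(&mut self, stage_id: usize) -> bool;
    /// Number of stages recorded as finished so far.
    fn completed_stages(&self) -> usize;
    /// Number of re-optimization passes run so far.
    fn reoptimizations(&self) -> usize;
}

/// The plan is fixed up front and never changes after submission.
#[derive(Debug, Default)]
pub struct StaticExecutionGraph {
    completed: Vec<usize>,
}

impl ExecutionGraph for StaticExecutionGraph {
    fn kind(&self) -> &'static str {
        "static"
    }
    fn on_stage_completed(&mut self, stage_id: usize) -> bool {
        self.completed.push(stage_id);
        false // a static graph never re-plans
    }
    fn completed_stages(&self) -> usize {
        self.completed.len()
    }
    fn reoptimizations(&self) -> usize {
        0
    }
}

/// The remaining plan is revisited after every completed stage.
#[derive(Debug, Default)]
pub struct AdaptiveExecutionGraph {
    completed: Vec<usize>,
    reoptimizations: usize,
}

impl ExecutionGraph for AdaptiveExecutionGraph {
    fn kind(&self) -> &'static str {
        "adaptive"
    }
    fn on_stage_completed(&mut self, stage_id: usize) -> bool {
        self.completed.push(stage_id);
        // Placeholder: a real implementation would run the pluggable
        // physical optimizer rules over the not-yet-executed stages here.
        self.reoptimizations += 1;
        true
    }
    fn completed_stages(&self) -> usize {
        self.completed.len()
    }
    fn reoptimizations(&self) -> usize {
        self.reoptimizations
    }
}

/// Pick an implementation from configuration, falling back to the static
/// graph when the adaptive planner is known not to handle the plan.
pub fn new_execution_graph(
    mode: PlannerMode,
    adaptive_can_handle_plan: bool,
) -> Box<dyn ExecutionGraph> {
    match mode {
        PlannerMode::Adaptive if adaptive_can_handle_plan => {
            Box::new(AdaptiveExecutionGraph::default())
        }
        _ => Box::new(StaticExecutionGraph::default()),
    }
}
```

In this sketch the caller only ever holds a Box<dyn ExecutionGraph>, so the two implementations can live side by side and the default can be switched later without touching call sites. How the fallback condition would actually be detected is left open, as it is in the thread.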
