EeshanBembi commented on code in PR #16905:
URL: https://github.com/apache/datafusion/pull/16905#discussion_r2233575833
##########
datafusion/core/src/physical_planner.rs:
##########
@@ -886,6 +887,26 @@ impl DefaultPhysicalPlanner {
"SortExec requires at least one sort expression"
);
};
+
+ // Check if we can use PartialSortExec instead of SortExec
Review Comment:
EnforceSorting can't handle this optimization effectively due to
architectural constraints.
The existing EnforceSorting rule is designed for local optimizations, it
sees each SortExec node and its immediate children in isolation. This prefix
optimization requires global coordination across multiple branches,
particularly when dealing with UnionExec scenarios where we need consistent
PartialSortExec behavior across siblings.
To implement this in EnforceSorting, we'd need to:
- Add multi-pass analysis (collect context, then transform)
- Extend PhysicalOptimizerRule to carry parent/sibling metadata
- Ensure our rule runs after all other sort transformations (brittle
ordering)
This would essentially recreate planner logic inside the optimizer
framework, breaking the clean separation between "build the tree with global
context" (planner) and "apply local rewrites" (optimizer).
The planner already has the full subtree context needed to make coordinated
decisions across branches, making it the natural place for this optimization.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]