gene-bordegaray commented on PR #23184:
URL: https://github.com/apache/datafusion/pull/23184#issuecomment-4831832706
> If we can instead make this an explicit requirement at the `ExecutionPlan`
level, the implementation is likely to become simpler and easier to understand.
>
> ```rust
> // Before
> fn required_input_distribution(&self) -> Vec<Distribution> {
> vec![Distribution::UnspecifiedDistribution; self.children().len()]
> }
> ```
>
> ```rust
> // After
> pub struct RequiredInputDistributions {
> pub per_child: Vec<Distribution>,
> pub cross_child: Vec<CrossChildDistribution>,
> }
>
> trait ExecutionPlan {
> fn required_input_distributions(&self) -> RequiredInputDistributions {
> ...
> }
> }
> ```
>
> The tradeoff is that this would require a large refactor upfront. The
should not be a hard requirement for now, and I'm also unsure how to carry this
out incrementally, but I would still love to see us move towards this direction
sooner.
@2010YOUY01
This is an intersting idea, thank you. Let me cherry pick my aggregation
commit to be stacked on main and can open something up for a unary operator
first. I think something like this would be worth exploring 👍
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]