Re: [PR] Support co-partitioned range inner equi joins [datafusion]

via GitHub Mon, 29 Jun 2026 04:12:44 -0700


gene-bordegaray commented on PR #23184:
URL: https://github.com/apache/datafusion/pull/23184#issuecomment-4831832706


   > If we can instead make this an explicit requirement at the `ExecutionPlan` 
level, the implementation is likely to become simpler and easier to understand.
   > 
   > ```rust
   > // Before
   >     fn required_input_distribution(&self) -> Vec<Distribution> {
   >         vec![Distribution::UnspecifiedDistribution; self.children().len()]
   >     }
   > ```
   > 
   > ```rust
   > // After
   > pub struct RequiredInputDistributions {
   >     pub per_child: Vec<Distribution>,
   >     pub cross_child: Vec<CrossChildDistribution>,
   > }
   > 
   > trait ExecutionPlan {
   >     fn required_input_distributions(&self) -> RequiredInputDistributions {
   >         ...
   >     }
   > }
   > ```
   > 
   > The tradeoff is that this would require a large refactor upfront. This 
should not be a hard requirement for now, and I'm also unsure how to carry this 
out incrementally, but I would still love to see us move in this direction 
sooner.
   
   @2010YOUY01 
   
   This is an interesting idea, thank you. Let me cherry-pick my aggregation 
commit so that it is stacked on main, and then I can open something up for a 
unary operator first. I think something like this would be worth exploring 👍 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


