gene-bordegaray commented on code in PR #19304:
URL: https://github.com/apache/datafusion/pull/19304#discussion_r2619440154
##########
datafusion/physical-optimizer/src/enforce_distribution.rs:
##########
@@ -889,32 +889,41 @@ fn add_roundrobin_on_top(
/// * `hash_exprs`: Stores Physical Exprs that are used during hashing.
/// * `n_target`: desired target partition number, if partition number of the
/// current executor is less than this value. Partition number will be increased.
+/// * `allow_subset`: Whether to allow subset partitioning logic in satisfaction checks.
+/// Set to `false` for partitioned hash joins to ensure exact hash matching.
///
/// # Returns
///
/// A [`Result`] object that contains new execution plan where the desired
/// distribution is satisfied by adding a Hash repartition.
fn add_hash_on_top(
input: DistributionContext,
- hash_exprs: Vec<Arc<dyn PhysicalExpr>>,
+ hash_exprs: &[Arc<dyn PhysicalExpr>],
n_target: usize,
+ allow_subset: bool,
Review Comment:
this makes sense. thanks for all the design tips, they are helping me learn 😄
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]