Copilot commented on code in PR #610:
URL: https://github.com/apache/sedona-db/pull/610#discussion_r2804374055
##########
rust/sedona-spatial-join/src/planner/physical_planner.rs:
##########
@@ -264,3 +283,47 @@ fn logical_join_filter_to_physical(
let join_filter = JoinFilter::new(filter_expr, column_indices,
Arc::new(filter_schema));
Ok(join_filter)
}
+
+/// Repartition the probe side of a spatial join using `RoundRobinBatch`
partitioning.
+///
+/// The purpose is to break spatial locality in sorted or skewed datasets,
which can cause
+/// imbalanced partitions when running out-of-core spatial join. The number of
partitions is
+/// preserved; only the distribution of rows across partitions is shuffled.
+///
+/// The `should_swap` parameter indicates whether `swap_inputs()` will be
called after
+/// `SpatialJoinExec` is constructed. This affects which pre-swap input will
become the
+/// probe side:
+/// - For non-KNN predicates: probe is always `Right` after any swap. If
`should_swap` is true,
+/// the current `left` will become `right` (probe) after swap, so we
repartition `left`.
+/// - For KNN predicates: `should_swap` is always false, and the probe side is
determined by
+/// `KNNPredicate::probe_side`.
+fn repartition_probe_side(
+ mut physical_left: Arc<dyn ExecutionPlan>,
+ mut physical_right: Arc<dyn ExecutionPlan>,
+ spatial_predicate: &SpatialPredicate,
+ should_swap: bool,
+) -> Result<(Arc<dyn ExecutionPlan>, Arc<dyn ExecutionPlan>)> {
+ let probe_plan = match spatial_predicate {
+ SpatialPredicate::KNearestNeighbors(knn) => match knn.probe_side {
+ JoinSide::Left => &mut physical_left,
+ _ => &mut physical_right,
Review Comment:
The wildcard pattern `_` in the match arm matches both `JoinSide::Right` and
`JoinSide::None`. However, according to the `KNNPredicate` struct documentation
and the assertion in its constructor (line 312 in spatial_predicate.rs),
`probe_side` should never be `JoinSide::None` for KNN predicates. Consider
using an explicit `JoinSide::Right` pattern here instead of `_` to make the
expected values clearer and to catch potential invalid states earlier.
Alternatively, add a comment explaining why the wildcard is acceptable.
```suggestion
JoinSide::Right => &mut physical_right,
JoinSide::None => {
// KNNPredicate::probe_side is asserted not to be None in
its constructor;
// treat this as a debug-only invariant violation and
default to right.
debug_assert!(
false,
"KNNPredicate::probe_side must not be JoinSide::None"
);
&mut physical_right
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]