asubiotto commented on code in PR #7131:
URL: https://github.com/apache/arrow-rs/pull/7131#discussion_r1958191061
##########
arrow-ord/src/partition.rs:
##########
@@ -156,7 +157,14 @@ fn find_boundaries(v: &dyn Array) -> Result<BooleanBuffer, ArrowError> {
let slice_len = v.len() - 1;
let v1 = v.slice(0, slice_len);
let v2 = v.slice(1, slice_len);
- Ok(distinct(&v1, &v2)?.values().clone())
+
+ if !v.data_type().is_nested() {
+ return Ok(distinct(&v1, &v2)?.values().clone());
+ }
+ // Given that we're only comparing values, null ordering in the input or
Review Comment:
Do you mean for non-nested types? Like `distinct`, `eq` doesn't support
nested types. Since both shell out to `compare_op`, I don't think there
would be much of a perf difference between `distinct` and `eq` plus
mapping nulls to booleans (which would be necessary).
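
To illustrate why the null mapping would be necessary, here is a minimal sketch that models the two kernels on plain `Option<i32>` values rather than Arrow arrays. All function names below are hypothetical stand-ins, not the arrow-rs API. The assumption is that `distinct` follows "IS DISTINCT FROM" semantics (two nulls are not distinct, and the result is never null), while `eq` yields null whenever either operand is null.

```rust
/// Hypothetical model of `distinct`: nulls compare equal to each other,
/// and the result is always a plain boolean.
fn is_distinct(a: Option<i32>, b: Option<i32>) -> bool {
    a != b
}

/// Hypothetical model of `eq`: the result is null (None) if either side is null.
fn eq_nullable(a: Option<i32>, b: Option<i32>) -> Option<bool> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x == y),
        _ => None,
    }
}

/// Rebuilds the `distinct` result from `eq`. The extra step maps a null
/// comparison result back to a boolean by checking whether exactly one
/// side was null.
fn is_distinct_via_eq(a: Option<i32>, b: Option<i32>) -> bool {
    match eq_nullable(a, b) {
        Some(equal) => !equal,
        None => a.is_some() != b.is_some(),
    }
}

/// Mirrors the shape of `find_boundaries` for a non-nested column:
/// compare each element with its successor and mark where the value changes.
fn find_boundaries(v: &[Option<i32>]) -> Vec<bool> {
    v.windows(2).map(|w| is_distinct(w[0], w[1])).collect()
}
```

For the input `[Some(1), Some(1), None, None, Some(2)]`, `find_boundaries` marks a boundary between positions 1 and 2 and between positions 3 and 4. In this model `is_distinct_via_eq` produces the same answers as `is_distinct`, but only after the additional null-handling branch.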
##########
arrow-ord/src/partition.rs:
##########
@@ -298,4 +306,23 @@ mod tests {
vec![(0..1), (1..2), (2..4), (4..5), (5..7), (7..8), (8..9)],
);
}
+
+ #[test]
+ fn test_partition_nested() {
Review Comment:
Done.