alamb commented on code in PR #15348:
URL: https://github.com/apache/datafusion/pull/15348#discussion_r2008785345
##########
datafusion/physical-plan/src/sorts/cursor.rs:
##########
@@ -281,6 +281,33 @@ impl<T: ByteArrayType> CursorArray for GenericByteArray<T>
{
}
}
+impl CursorArray for StringViewArray {
+ type Values = StringViewArray;
+ fn values(&self) -> Self {
+ self.clone()
+ }
+}
+
+impl CursorValues for StringViewArray {
+ fn len(&self) -> usize {
+ self.views().len()
+ }
+
+ fn eq(l: &Self, l_idx: usize, r: &Self, r_idx: usize) -> bool {
+ unsafe { GenericByteViewArray::compare_unchecked(l, l_idx, r,
r_idx).is_eq() }
Review Comment:
I agree it would be good to justify the use of unchecked (which I think is
ok here)
The docs say
https://docs.rs/arrow/latest/arrow/array/struct.GenericByteViewArray.html#method.compare_unchecked
SO maybe the safety argument is mostly "The left/right_idx must within range
of each array"
It also seems like we need to be comparing the Null masks too 🤔 like
checking if the values are null before comparing
Given that this comparison is typically *the* hottest part of a merge
operation maybe we should try using unchecked comparisions elswhere
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]