alamb commented on code in PR #6671:
URL: https://github.com/apache/arrow-rs/pull/6671#discussion_r1826542917
##########
arrow-string/src/length.rs:
##########
@@ -137,6 +137,26 @@ pub fn bit_length(array: &dyn Array) -> Result<ArrayRef, ArrowError> {
let list = array.as_string::<i64>();
Ok(bit_length_impl::<Int64Type>(list.offsets(), list.nulls()))
}
+ DataType::Utf8View => {
+ let string_view_array = array
+ .as_any()
+ .downcast_ref::<StringViewArray>()
+ .ok_or_else(|| ArrowError::ComputeError("Expected Utf8View array".to_string()))?;
+ let mut bit_lengths = Vec::with_capacity(array.len());
+ for i in 0..array.len() {
+ let bit_length = if string_view_array.is_valid(i) {
+ (string_view_array.value(i).len() * 8) as i32
Review Comment:
This code could be made significantly faster by reading the lengths directly
from the views rather than materializing each string value just to take its
length.
You can get the views via `views()`:
https://docs.rs/arrow/latest/arrow/array/type.StringViewArray.html#method.views
The layout is described here:
https://docs.rs/arrow/latest/arrow/array/struct.GenericByteViewArray.html#layout-views-and-buffers
Something like
```rust
// Each view is a u128 whose low 32 bits hold the string length in bytes.
let lengths = string_view_array.views()
    .iter()
    .map(|view| *view as u32);
```
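To make the layout point concrete, here is a minimal standalone sketch that does not depend on the arrow crate. It assumes the documented view layout (length in the low 32 bits, with strings of 12 bytes or fewer stored inline after the length). The helpers `view_len`, `make_inline_view`, and `bit_lengths` are hypothetical names used only for illustration, and a plain `&[bool]` stands in for the null buffer.

```rust
/// Length in bytes of the string a view describes: the low 32 bits of the u128.
fn view_len(view: u128) -> u32 {
    view as u32
}

/// Hypothetical helper: build an inline view for a string of at most 12 bytes.
/// Bytes 0..4 hold the little-endian length; the string data follows inline.
fn make_inline_view(s: &str) -> u128 {
    let bytes = s.as_bytes();
    assert!(bytes.len() <= 12, "only short strings are stored inline");
    let mut raw = [0u8; 16];
    raw[..4].copy_from_slice(&(bytes.len() as u32).to_le_bytes());
    raw[4..4 + bytes.len()].copy_from_slice(bytes);
    u128::from_le_bytes(raw)
}

/// Bit length per slot, read from the views alone; `validity` is an
/// assumed stand-in for the array's null buffer (false means null).
fn bit_lengths(views: &[u128], validity: &[bool]) -> Vec<Option<i32>> {
    views
        .iter()
        .zip(validity)
        .map(|(view, &valid)| valid.then(|| view_len(*view) as i32 * 8))
        .collect()
}
```

Because the length is read from the view itself, no string data is touched, which is where the speedup would come from.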
##########
arrow-string/src/length.rs:
##########
@@ -137,6 +137,26 @@ pub fn bit_length(array: &dyn Array) -> Result<ArrayRef, ArrowError> {
let list = array.as_string::<i64>();
Ok(bit_length_impl::<Int64Type>(list.offsets(), list.nulls()))
}
+ DataType::Utf8View => {
+ let string_view_array = array
+ .as_any()
+ .downcast_ref::<StringViewArray>()
+ .ok_or_else(|| ArrowError::ComputeError("Expected Utf8View array".to_string()))?;
Review Comment:
I think it would be more consistent with the rest of the codebase to use
`array.as_string_view()` here rather than `downcast_ref`.
I think the current code is correct, but I would recommend changing it for
consistency.