Jefffrey commented on code in PR #18500:
URL: https://github.com/apache/datafusion/pull/18500#discussion_r2498560850
##########
datafusion/functions-nested/src/reverse.rs:
##########
@@ -261,41 +273,42 @@ fn fixed_size_array_reverse(
field: &FieldRef,
) -> Result<ArrayRef> {
let values = array.values();
- let original_data = values.to_data();
- let capacity = Capacities::Array(original_data.len());
- let mut mutable =
- MutableArrayData::with_capacities(vec![&original_data], false,
capacity);
let value_length = array.value_length() as usize;
+ let mut indices: Vec<u64> = Vec::with_capacity(values.len());
Review Comment:
I do wonder for FSL's if we can take advantage of knowing its a fixed size
list to do it in a more efficient way? For example we know a FSL like so:
```
FSL with fixed length 3:
[[a, b, c], [d, e, f]]
```
Would always have reverse offsets like so:
```
[2, 1, 0, 5, 4, 3]
```
Without even needing to iterate the FSL array itself 🤔
##########
datafusion/functions-nested/src/reverse.rs:
##########
@@ -171,18 +168,34 @@ fn general_array_reverse<O: OffsetSizeTrait>(
let mut index = end - O::one();
while index >= start {
- mutable.extend(0, index.to_usize().unwrap(),
index.to_usize().unwrap() + 1);
+ indices.push(index);
index = index - O::one();
}
let size = end - start;
offsets.push(offsets[row_index] + size);
}
- let data = mutable.freeze();
+ // Materialize values from underlying array with take
+ let indices_array: ArrayRef = if O::IS_LARGE {
+ Arc::new(UInt64Array::from(
+ indices
+ .iter()
+ .map(|i| i.as_usize() as u64)
+ .collect::<Vec<_>>(),
+ ))
+ } else {
+ Arc::new(UInt32Array::from(
+ indices
+ .iter()
+ .map(|i| i.as_usize() as u32)
+ .collect::<Vec<_>>(),
+ ))
+ };
Review Comment:
It would be nice if we figured out a nicer way of doing this but I still
can't figure it out 😅
Went a bit crazy with generics in attempting once but didn't pan out; but it
works and the if branch isn't a big deal, though I wonder if it's more
efficient to just have Int32/Int64 arrays instead of their unsigned variants to
avoid needing a map -> collect 🤔
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]