alamb commented on code in PR #5798:
URL: https://github.com/apache/arrow-rs/pull/5798#discussion_r1613556901
##########
arrow-array/src/array/primitive_array.rs:
##########
@@ -480,6 +480,19 @@ pub use crate::types::ArrowPrimitiveType;
/// assert_eq!(array.values(), &[1, 0, 2]);
/// assert!(array.is_null(1));
/// ```
+///
+/// # Example: Get a `PrimitiveArray` from an [`ArrayRef`]
+/// ```
+/// # use std::sync::Arc;
+/// # use arrow_array::{Array, ArrayRef, Float32Array, PrimitiveArray};
+/// # use arrow_array::types::{Float32Type};
+/// # use arrow_schema::DataType;
+/// # let array: ArrayRef = Arc::new(Float32Array::from(vec![1.2, 2.3]));
+/// // will panic if the array is not a Float32Array
+/// assert_eq!(&DataType::Float32, array.data_type());
+/// let f32_array = PrimitiveArray::<Float32Type>::from(array.into_data());
Review Comment:
I have no idea to be honest -- if downcasting / cloning / dropping is faster
I will update the docs to do that.
##########
arrow-array/src/array/primitive_array.rs:
##########
@@ -766,23 +787,43 @@ impl<T: ArrowPrimitiveType> PrimitiveArray<T> {
PrimitiveArray::new(buffer.into(), nulls)
}
- /// Applies an unary and infallible function to a mutable primitive array.
- /// Mutable primitive array means that the buffer is not shared with other
arrays.
- /// As a result, this mutates the buffer directly without allocating new
buffer.
+ /// Applies a unary and infallible function to the array in place if
possible.
+ ///
+ /// # Buffer Reuse
+ ///
+ /// If the underlying buffers are not shared with other arrays, mutates
the
+ /// underlying buffer in place, without allocating a new buffer.
+ ///
+ /// # Null Handling
///
- /// # Implementation
+ /// See [`Self::unary`] for more information on null handling.
///
- /// This will apply the function for all values, including those on null
slots.
- /// This implies that the operation must be infallible for any value of
the corresponding type
- /// or this function may panic.
/// # Example
+ ///
/// ```rust
/// # use arrow_array::{Int32Array, types::Int32Type};
- /// # fn main() {
/// let array = Int32Array::from(vec![Some(5), Some(7), None]);
+ /// // Apply x*2+1 to the data in place, no allocations
+ /// let c = array.unary_mut(|x| x * 2 + 1).unwrap();
+ /// assert_eq!(c, Int32Array::from(vec![Some(11), Some(15), None]));
+ /// ```
+ ///
+ /// # Example: modify [`ArrayRef`] in place, if not shared
Review Comment:
This is the example I imagine someone might be looking for
##########
arrow-arith/src/arity.rs:
##########
@@ -207,23 +227,46 @@ where
Ok(PrimitiveArray::new(buffer.into(), nulls))
}
-/// Given two arrays of length `len`, calls `op(a[i], b[i])` for `i` in
`0..len`, mutating
-/// the mutable [`PrimitiveArray`] `a`. If any index is null in either `a` or
`b`, the
-/// corresponding index in the result will also be null.
+/// Applies a binary and infallible function to values in two arrays, replacing
+/// the values in the first array in place.
///
-/// Mutable primitive array means that the buffer is not shared with other
arrays.
-/// As a result, this mutates the buffer directly without allocating new
buffer.
+/// # Details
+///
+/// Given two arrays of length `len`, calls `op(a[i], b[i])` for `i` in
+/// `0..len`, modifying the [`PrimitiveArray`] `a` in place, if possible.
+///
+/// If any index is null in either `a` or `b`, the corresponding index in the
+/// result will also be null.
+///
+/// # Buffer Reuse
+///
+/// If the underlying buffers in `a` are not shared with other arrays, mutates
+/// the underlying buffer in place, without allocating.
///
/// Like [`unary`] the provided function is evaluated for every index,
ignoring validity. This
/// is beneficial when the cost of the operation is low compared to the cost
of branching, and
/// especially when the operation can be vectorised, however, requires `op` to
be infallible
/// for all possible values of its inputs
///
-/// # Error
+/// # Errors
///
-/// This function gives error if the arrays have different lengths.
-/// This function gives error of original [`PrimitiveArray`] `a` if it is not
a mutable
-/// primitive array.
+/// * if the arrays have different lengths.
+///
+/// # See Also
+///
+/// * Documentation on [`PrimitiveArray::unary_mut`] for operating on
[`ArrayRef`].
+///
+/// # Example
+/// ```
+/// # use arrow_arith::arity::binary_mut;
+/// # use arrow_array::Float32Array;
+/// # use arrow_array::types::Int32Type;
+/// let a = Float32Array::from(vec![Some(5.1f32), None, Some(6.8)]);
+/// let b = Float32Array::from(vec![Some(1.0f32), None, Some(2.0)]);
+/// // compute a + b, updating the value in a in place if possible
+/// let a = binary_mut(a, &b, |a, b| a + b).unwrap().unwrap();
Review Comment:
I found this API inconsistent for two reasons:
1. It requires both `a` and `b` to be the same type of array (where `binary`
can take different types)
2. It returns a wrapped `Result` (`Result<Result<..>>`).
@viirya I think you added this API in
https://github.com/apache/arrow-rs/pull/3144
Was it intentional to have these features?
##########
arrow-array/src/array/primitive_array.rs:
##########
@@ -732,22 +745,30 @@ impl<T: ArrowPrimitiveType> PrimitiveArray<T> {
PrimitiveArray::from(unsafe { d.build_unchecked() })
}
- /// Applies an unary and infallible function to a primitive array.
- /// This is the fastest way to perform an operation on a primitive array
when
- /// the benefits of a vectorized operation outweigh the cost of branching
nulls and non-nulls.
+ /// Applies a unary infallible function to a primitive array, producing a
+ /// new array of potentially different type.
+ ///
+ /// This is the fastest way to perform an operation on a primitive array
+ /// when the benefits of a vectorized operation outweigh the cost of
+ /// branching nulls and non-nulls.
///
- /// # Implementation
+ /// See [`Self::unary_mut`] for in place modification.
+ ///
+ /// # Null Handling
+ ///
+ /// Applies the function for all values, including those on null slots.
This
+ /// will often allow the compiler to generate faster vectorized code, but
+ /// requires that the operation must be infallible (not error/panic) for
any
+ /// value of the corresponding type or this function may panic.
///
- /// This will apply the function for all values, including those on null
slots.
- /// This implies that the operation must be infallible for any value of
the corresponding type
- /// or this function may panic.
/// # Example
/// ```rust
- /// # use arrow_array::{Int32Array, types::Int32Type};
+ /// # use arrow_array::{Int32Array, Float32Array, types::Int32Type};
/// # fn main() {
/// let array = Int32Array::from(vec![Some(5), Some(7), None]);
- /// let c = array.unary(|x| x * 2 + 1);
- /// assert_eq!(c, Int32Array::from(vec![Some(11), Some(15), None]));
+ /// // Create a new array with the value of applying sqrt
Review Comment:
changed this to show you can make a different type
##########
arrow-array/src/array/primitive_array.rs:
##########
@@ -480,6 +480,19 @@ pub use crate::types::ArrowPrimitiveType;
/// assert_eq!(array.values(), &[1, 0, 2]);
/// assert!(array.is_null(1));
/// ```
+///
+/// # Example: Get a `PrimitiveArray` from an [`ArrayRef`]
Review Comment:
I don't think it is obvious how to go from `Arc<dyn Array>` back to
`PrimitiveArray` so I documented that too
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]