jorgecarleitao commented on a change in pull request #9215:
URL: https://github.com/apache/arrow/pull/9215#discussion_r561618199



##########
File path: rust/arrow/src/array/array_primitive.rs
##########
@@ -86,13 +86,9 @@ impl<T: ArrowPrimitiveType> PrimitiveArray<T> {
     }
 
     /// Returns the primitive value at index `i`.
-    ///
-    /// Note this doesn't do any bound checking, for performance reason.
-    /// # Safety
-    /// caller must ensure that the passed in offset is less than the array len()
+    #[inline]
     pub fn value(&self, i: usize) -> T::Native {
-        let offset = i + self.offset();
-        unsafe { *self.raw_values.as_ptr().add(offset) }
+        self.values()[i]

Review comment:
       Thinking about it, I am not sure we want this: it will have severe 
performance implications for almost all uses of `value()`. Could you run the 
benches before and after this PR?
   
   My suggestion is that we mark this method as `unsafe`, so that it is clear 
that callers need to be careful with it. Since this will affect a large part 
of our code base (and likely others'), I suggest that we make this change in a 
separate PR.
   
   I think that we would like to instruct users of this API to switch to 
`values()`. This PR will not prompt that switch (existing code keeps compiling 
and running), but it will have a large performance impact.
   
   Consider:
   
   ```rust
   for i in 0..array.len() {
       if array.value(i) > 2 {
           // do x
       };
   }
   ```
   This loop will suffer a lot from this change, because every call to 
`value(i)` now performs a bounds check.
   
   We would like users to change it to:
   
   ```rust
   // `values()` returns a slice, so iterate over it and dereference each item.
   array.values().iter().for_each(|value| {
       if *value > 2 {
           // do x
       }
   });
   ```
   
   For that, we need to mark `array.value(i)` as `unsafe` to indicate that yes, 
you can use that method, yes, it is fast, _but_ you need to be careful about 
`i`.
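
A minimal sketch of the three access styles discussed in this comment, assuming a hypothetical `MiniArray` stand-in rather than the real `PrimitiveArray<T>`. The names `MiniArray`, `value_unchecked`, and the `count_gt_*` helpers are illustrative only and are not part of the arrow crate.

```rust
/// Hypothetical stand-in for `PrimitiveArray<T>`; not the arrow crate's type.
struct MiniArray {
    data: Vec<i32>,
    offset: usize,
    len: usize,
}

impl MiniArray {
    /// Slice over the logical values, analogous to `values()` in the diff.
    fn values(&self) -> &[i32] {
        &self.data[self.offset..self.offset + self.len]
    }

    /// Bounds-checked access: panics if `i >= self.len`.
    fn value(&self, i: usize) -> i32 {
        self.values()[i]
    }

    /// Unchecked access, mirroring the pointer arithmetic removed by the diff.
    ///
    /// # Safety
    /// The caller must ensure that `i < self.len`.
    unsafe fn value_unchecked(&self, i: usize) -> i32 {
        unsafe { *self.data.as_ptr().add(self.offset + i) }
    }
}

/// Index loop over the checked accessor: one bounds check per element.
fn count_gt_indexed(array: &MiniArray, threshold: i32) -> usize {
    let mut n = 0;
    for i in 0..array.len {
        if array.value(i) > threshold {
            n += 1;
        }
    }
    n
}

/// Slice iteration: the iterator never indexes, so it needs no per-element check.
fn count_gt_iter(array: &MiniArray, threshold: i32) -> usize {
    array.values().iter().filter(|v| **v > threshold).count()
}

/// Index loop over the unchecked accessor: the caller upholds the contract.
fn count_gt_unchecked(array: &MiniArray, threshold: i32) -> usize {
    let mut n = 0;
    for i in 0..array.len {
        // SAFETY: `i` comes from `0..array.len`, so it is in bounds.
        let v = unsafe { array.value_unchecked(i) };
        if v > threshold {
            n += 1;
        }
    }
    n
}
```

All three helpers return the same count for the same input. They differ only in where the responsibility for the bounds sits: in the accessor, in the iterator, or with the caller.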

##########
File path: rust/arrow/src/array/array_primitive.rs
##########
@@ -86,13 +86,9 @@ impl<T: ArrowPrimitiveType> PrimitiveArray<T> {
     }
 
     /// Returns the primitive value at index `i`.
-    ///
-    /// Note this doesn't do any bound checking, for performance reason.
-    /// # Safety
-    /// caller must ensure that the passed in offset is less than the array len()
+    #[inline]
     pub fn value(&self, i: usize) -> T::Native {
-        let offset = i + self.offset();
-        unsafe { *self.raw_values.as_ptr().add(offset) }
+        self.values()[i]

Review comment:
       I think that we may need to park this change until we migrate our code 
base to use `values()` wherever possible. 
   
   If we merge this one as is, we get a major performance degradation. If we 
add `unsafe`, we need to add `unsafe` in a lot of places. Neither is a great 
option :)
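
One rough way to put numbers on this trade-off before deciding, sketched with only the standard library. This is not the project's benchmark suite; `sum_checked`, `sum_unchecked`, and `time_it` are hypothetical helpers, and timings from a loop like this are indicative at best.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Gathers `values[i]` with the normal bounds-checked index operator.
fn sum_checked(values: &[i64], indices: &[usize]) -> i64 {
    indices.iter().map(|&i| values[i]).sum()
}

/// Gathers `values[i]` without bounds checks.
///
/// # Safety
/// Every element of `indices` must be less than `values.len()`.
unsafe fn sum_unchecked(values: &[i64], indices: &[usize]) -> i64 {
    indices
        .iter()
        .map(|&i| unsafe { *values.get_unchecked(i) })
        .sum()
}

/// Runs `f` `repeats` times and returns the last result and the elapsed time.
/// `black_box` keeps the optimizer from discarding the computation.
fn time_it<F: FnMut() -> i64>(mut f: F, repeats: u32) -> (i64, Duration) {
    let start = Instant::now();
    let mut last = 0;
    for _ in 0..repeats {
        last = black_box(f());
    }
    (last, start.elapsed())
}
```

Calling `time_it(|| sum_checked(&values, &indices), n)` and `time_it(|| unsafe { sum_unchecked(&values, &indices) }, n)` on the same inputs gives two durations to compare. Indices that are not a simple ascending range make it harder for the compiler to elide the checks, which is closer to the worst case raised here.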




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

