liukun4515 commented on code in PR #2360:
URL: https://github.com/apache/arrow-rs/pull/2360#discussion_r944071411


##########
arrow/src/array/array_decimal.rs:
##########
@@ -332,15 +331,30 @@ impl Decimal128Array {
 impl Decimal256Array {
     /// Validates decimal values in this array can be properly interpreted
     /// with the specified precision.
-    pub fn validate_decimal_precision(&self, precision: usize) -> Result<()> {
-        if precision < self.precision {
-            for v in self.iter().flatten() {
-                validate_decimal256_precision(&v.to_big_int(), precision)?;
+    fn validate_decimal_precision(&self, precision: usize) -> Result<()> {
+        let current_end = self.data.len();
+        let mut current: usize = 0;
+        let data = &self.data;
+
+        while current != current_end {
+            if self.is_null(current) {
+                current += 1;
+                continue;
+            } else {
+                let offset = current + data.offset();
+                current += 1;
+                let raw_val = unsafe {
+                    let pos = self.value_offset_at(offset);
+                    std::slice::from_raw_parts(
+                        self.raw_value_data_ptr().offset(pos as isize),
+                        Self::VALUE_LENGTH as usize,
+                    )
+                };
+                validate_decimal256_precision_with_lt_bytes(raw_val, precision)?;
             }
         }
         Ok(())

Review Comment:
   I did not apply the suggestion because it causes a performance regression.
   Benchmark result for your version:
   ```
   validate_decimal256_array 20000
                           time:   [393.73 us 402.59 us 412.64 us]
                           change: [+0.0864% +1.9579% +3.8640%] (p = 0.05 < 0.05)
                           Change within noise threshold.
   ```
   Benchmark result for my version:
   ```
   validate_decimal256_array 20000
                           time:   [282.54 us 289.68 us 297.22 us]
                           change: [-29.976% -27.993% -25.897%] (p = 0.00 < 0.05)
                           Performance has improved.
   ```
   My guess is that the suggested version is slower because it loops twice and creates an intermediate iterator.
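
   To make the two loop shapes concrete, here is a minimal, self-contained sketch. It is not the arrow-rs code: `ToyDecimalArray`, `check`, and both `validate_*` methods are hypothetical names, and it uses 16-byte `i128` values in place of 256-bit decimals. It only illustrates the structural difference between going through an `Option` iterator with `flatten()` and walking the slots by index; it makes no claim about which is faster outside the benchmark above.

   ```rust
   // Hypothetical stand-in for a decimal array (NOT the arrow-rs type):
   // fixed-width little-endian values plus a validity mask.
   const VALUE_LENGTH: usize = 16;

   struct ToyDecimalArray {
       bytes: Vec<u8>,   // len * VALUE_LENGTH bytes, little-endian
       valid: Vec<bool>, // false marks a null slot
   }

   impl ToyDecimalArray {
       fn from_options(values: &[Option<i128>]) -> Self {
           let mut bytes = Vec::with_capacity(values.len() * VALUE_LENGTH);
           let mut valid = Vec::with_capacity(values.len());
           for v in values {
               // Null slots still occupy VALUE_LENGTH bytes (zero-filled here).
               bytes.extend_from_slice(&v.unwrap_or(0).to_le_bytes());
               valid.push(v.is_some());
           }
           Self { bytes, valid }
       }

       fn len(&self) -> usize {
           self.valid.len()
       }

       // Decode slot `i` from its raw little-endian bytes.
       fn value(&self, i: usize) -> i128 {
           let mut buf = [0u8; VALUE_LENGTH];
           buf.copy_from_slice(&self.bytes[i * VALUE_LENGTH..(i + 1) * VALUE_LENGTH]);
           i128::from_le_bytes(buf)
       }

       // Iterator shape: build an Option<i128> per slot, then flatten away the nulls.
       fn validate_with_iter(&self, max_abs: i128) -> Result<(), String> {
           let iter = (0..self.len())
               .map(|i| if self.valid[i] { Some(self.value(i)) } else { None });
           for v in iter.flatten() {
               check(v, max_abs)?;
           }
           Ok(())
       }

       // Index shape: one explicit pass that skips nulls and checks in place.
       fn validate_with_index(&self, max_abs: i128) -> Result<(), String> {
           for i in 0..self.len() {
               if !self.valid[i] {
                   continue;
               }
               check(self.value(i), max_abs)?;
           }
           Ok(())
       }
   }

   // Hypothetical precision check: the value must lie within [-max_abs, max_abs].
   fn check(v: i128, max_abs: i128) -> Result<(), String> {
       if v > max_abs || v < -max_abs {
           Err(format!("{} does not fit within +/-{}", v, max_abs))
       } else {
           Ok(())
       }
   }
   ```

   Both methods return the same result for the same input, for example on `ToyDecimalArray::from_options(&[Some(5), None, Some(-99)])` with `max_abs` set to `99` or `10`. Any performance difference has to be measured, as in the benchmark above.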
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.