Dandandan commented on code in PR #16087:
URL: https://github.com/apache/datafusion/pull/16087#discussion_r2095061480


##########
datafusion/functions/src/string/ascii.rs:
##########
@@ -103,19 +106,29 @@ impl ScalarUDFImpl for AsciiFunc {
 
 fn calculate_ascii<'a, V>(array: V) -> Result<ArrayRef, ArrowError>
 where
-    V: ArrayAccessor<Item = &'a str>,
+    V: StringArrayType<'a, Item = &'a str>,
 {
-    let iter = ArrayIter::new(array);
-    let result = iter
-        .map(|string| {
-            string.map(|s| {
-                let mut chars = s.chars();
-                chars.next().map_or(0, |v| v as i32)
-            })
-        })
-        .collect::<Int32Array>();
-
-    Ok(Arc::new(result) as ArrayRef)
+    let mut values = Vec::with_capacity(array.len());

Review Comment:
   This should be even faster: use `collect` rather than `push`, and avoid a copy 
by converting the `Vec` with `into`.
   
   (I haven't compiled this, but it should look something like this.)
   
   ```
   let values: Vec<_> = (0..array.len())
       .map(|i| {
           if array.is_null(i) {
               // placeholder; the slot is masked out by the null buffer below
               0
           } else {
               let s = array.value(i);
               s.chars().next().map_or(0, |c| c as i32)
           }
       })
       .collect();
   
   let array = Int32Array::new(values.into(), array.nulls().cloned());
   Ok(Arc::new(array))
   ```
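   
   To make the idea concrete without pulling in arrow, here is a std-only sketch 
of the same shape. The `values`/`valid` slices are a hypothetical stand-in for 
the string array and its null buffer, not arrow's API; the point is only that 
every slot gets a value (0 for nulls) and the validity information travels 
separately:
   
   ```rust
   /// Code point of the first character, or 0 for an empty string.
   fn first_char_code(s: &str) -> i32 {
       s.chars().next().map_or(0, |c| c as i32)
   }

   /// Hypothetical stand-in for an Arrow string array: `values[i]` is the
   /// string in slot i and `valid[i] == false` marks that slot as null.
   /// Assumes both slices have the same length.
   fn ascii_values(values: &[&str], valid: &[bool]) -> Vec<i32> {
       (0..values.len())
           .map(|i| {
               if valid[i] {
                   first_char_code(values[i])
               } else {
                   // Placeholder; a real array would mask this slot as null.
                   0
               }
           })
           .collect()
   }
   ```
   
   In the real function the resulting `Vec` would be handed to `Int32Array::new` 
together with the cloned null buffer, as in the snippet above.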
   
   Furthermore, you can specialize for arrays that contain no nulls:
   
   ```
   let values: Vec<_> = match array.nulls().filter(|n| n.null_count() > 0) {
       Some(nulls) => {
           // existing code (with the null check)
       }
       None => {
           // no nulls: skip the null check
           (0..array.len())
               .map(|i| {
                   let s = array.value(i);
                   s.chars().next().map_or(0, |c| c as i32)
               })
               .collect()
       }
   };
   ```
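   
   A std-only sketch of that dispatch, again with a hypothetical 
`Option<&[bool]>` validity mask standing in for arrow's null buffer (so the 
names here are assumptions, not the real API):
   
   ```rust
   /// Code point of the first character, or 0 for an empty string.
   fn first_code(s: &str) -> i32 {
       s.chars().next().map_or(0, |c| c as i32)
   }

   /// `valid` is a hypothetical validity mask: `None` means the array has no
   /// null buffer at all; `Some(mask)` marks slot i as null when `!mask[i]`.
   fn ascii_specialized(values: &[&str], valid: Option<&[bool]>) -> Vec<i32> {
       // Treat a mask without any nulls like a missing mask. Arrow caches the
       // null count; this sketch has to scan the mask to find out.
       match valid.filter(|mask| mask.iter().any(|ok| !*ok)) {
           // Slow path: check validity per slot, writing 0 for null slots.
           Some(mask) => values
               .iter()
               .zip(mask)
               .map(|(&s, &ok)| if ok { first_code(s) } else { 0 })
               .collect(),
           // Fast path: no nulls, so no per-slot branch on validity.
           None => values.iter().map(|&s| first_code(s)).collect(),
       }
   }
   ```
   
   Both arms produce the same values for non-null slots; the `None` arm just 
drops the per-element validity branch.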



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

