Dandandan commented on code in PR #16087:
URL: https://github.com/apache/datafusion/pull/16087#discussion_r2095061480

##########
datafusion/functions/src/string/ascii.rs:
##########
@@ -103,19 +106,29 @@ impl ScalarUDFImpl for AsciiFunc {
 fn calculate_ascii<'a, V>(array: V) -> Result<ArrayRef, ArrowError>
 where
-    V: ArrayAccessor<Item = &'a str>,
+    V: StringArrayType<'a, Item = &'a str>,
 {
-    let iter = ArrayIter::new(array);
-    let result = iter
-        .map(|string| {
-            string.map(|s| {
-                let mut chars = s.chars();
-                chars.next().map_or(0, |v| v as i32)
-            })
-        })
-        .collect::<Int32Array>();
-
-    Ok(Arc::new(result) as ArrayRef)
+    let mut values = Vec::with_capacity(array.len());

Review Comment:
   This should be even faster: use `collect` rather than `push`, and avoid a copy by converting the `Vec` with `into`.
   (I didn't compile the code, but it should look roughly like this.)

   ```
   let values: Vec<_> = (0..array.len()).map(|i| {
       if array.is_null(i) {
           0
       } else {
           let s = array.value(i);
           s.chars().next().map_or(0, |c| c as i32)
       }
   }).collect();

   let array = Int32Array::new(values.into(), array.nulls().cloned());

   Ok(Arc::new(array))
   ```

   Furthermore, you can specialize for arrays that contain no nulls:

   ```
   let values: Vec<_> = match array.nulls().filter(|n| n.null_count() > 0) {
       Some(_) => {
           // existing null-aware code, e.g. the loop from the snippet above
           (0..array.len()).map(|i| {
               if array.is_null(i) {
                   0
               } else {
                   let s = array.value(i);
                   s.chars().next().map_or(0, |c| c as i32)
               }
           }).collect()
       },
       None => {
           // no nulls: skip the null check
           (0..array.len()).map(|i| {
               let s = array.value(i);
               s.chars().next().map_or(0, |c| c as i32)
           }).collect()
       }
   };
   ```
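
   For anyone who wants to try the two-path idea without pulling in arrow, here is a minimal std-only sketch. `Strings`, `first_code_point` and `ascii_values` are hypothetical names made up for illustration, and the `Option<Vec<bool>>` validity mask is only a stand-in for Arrow's null buffer, not the arrow-rs API. It shows the same shape as the suggestion above: write a placeholder `0` into null slots, and skip the per-element null check entirely when no slot is actually null.

   ```rust
   // Hypothetical stand-in for a string array: values plus an optional
   // validity mask (true = valid). This is NOT the arrow-rs API.
   struct Strings<'a> {
       values: Vec<&'a str>,
       validity: Option<Vec<bool>>,
   }

   // Code point of the first character, or 0 for an empty string.
   fn first_code_point(s: &str) -> i32 {
       s.chars().next().map_or(0, |c| c as i32)
   }

   fn ascii_values(array: &Strings) -> Vec<i32> {
       // Take the null-aware path only if at least one slot is actually null.
       match array
           .validity
           .as_ref()
           .filter(|v| v.iter().any(|&valid| !valid))
       {
           Some(validity) => array
               .values
               .iter()
               .zip(validity)
               // Null slots get a placeholder 0; the mask still marks them null.
               .map(|(s, &valid)| if valid { first_code_point(s) } else { 0 })
               .collect(),
           // No nulls: no per-element check needed.
           None => array.values.iter().map(|s| first_code_point(s)).collect(),
       }
   }
   ```

   In this sketch the validity mask is left untouched and would be handed to the output alongside the values, which mirrors passing `array.nulls().cloned()` to `Int32Array::new` in the suggestion.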