HaoYang670 commented on code in PR #1832: URL: https://github.com/apache/arrow-rs/pull/1832#discussion_r894282254
########## arrow/src/compute/kernels/substring.rs: ########## @@ -182,24 +182,66 @@ pub fn substring_by_char<OffsetSize: OffsetSizeTrait>( start: i64, length: Option<u64>, ) -> Result<GenericStringArray<OffsetSize>> { - Ok(array - .iter() - .map(|val| { - val.map(|val| { - let char_count = val.chars().count(); - let start = if start >= 0 { - start.to_usize().unwrap().min(char_count) - } else { - char_count - (-start).to_usize().unwrap().min(char_count) - }; - let length = length.map_or(char_count - start, |length| { - length.to_usize().unwrap().min(char_count - start) - }); + let mut vals = BufferBuilder::<u8>::new(array.value_data().len()); + let mut offsets = BufferBuilder::<OffsetSize>::new(array.len() + 1); + offsets.append(OffsetSize::zero()); + let length = length.map(|len| len.to_usize().unwrap()); + + array.iter().for_each(|val| { + if let Some(val) = val { + let char_count = val.chars().count(); + let start = if start >= 0 { + start.to_usize().unwrap() Review Comment: I am curious about why `to_usize()` returns `Option<usize>` but not `Result<usize>`. Unfortunately, the rust std library also chooses `Option`: ```rust /// Converts the value of `self` to a `usize`. If the value cannot be /// represented by a `usize`, then `None` is returned. #[inline] fn to_usize(&self) -> Option<usize> { self.to_u64().as_ref().and_then(ToPrimitive::to_usize) } ``` But `Result` is more reasonable for me. Because we could tell users the reason of casting failure : ```rust #[inline] fn to_usize(&self) -> Result<usize> { match num::ToPrimitive::to_usize(self) { Some(val) => Ok(val), None => Err(ArrowError::CastError(format!("Cannot cast {} to usize")) } } ``` And for unsupported types, we can have a default implementation: ```rust /// Convert native type to usize. #[inline] fn to_usize(&self) -> Result<usize> { Err(ArrowError::CastError(format!("Casting {} to usize is not supported.", type_name::<Self>()))) } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org