alamb commented on code in PR #20344:
URL: https://github.com/apache/datafusion/pull/20344#discussion_r2805822056


##########
datafusion/functions/src/string/replace.rs:
##########
@@ -193,7 +195,10 @@ fn replace<T: OffsetSizeTrait>(args: &[ArrayRef]) -> Result<ArrayRef> {
     let from_array = as_generic_string_array::<T>(&args[1])?;
     let to_array = as_generic_string_array::<T>(&args[2])?;
 
-    let mut builder = GenericStringBuilder::<T>::new();
+    let mut builder = GenericStringBuilder::<T>::with_capacity(
+        string_array.len(),
+        string_array.values().len(),

Review Comment:
   FWIW `values().len()` returns the length of the entire underlying buffer, so if 
the array was sliced (say it represents 100 values from a buffer of 1000), this is 
going to allocate space for the entire buffer.
   
   You could find the actual string length by looking at the offsets.
   
   Also, this allocation doesn't take into account the fact that the replacement 
string could expand or reduce the output text.
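
   To illustrate the offsets idea, here is a minimal sketch in plain Rust. It models the offsets buffer as a `&[i32]` slice instead of using the arrow types, and `referenced_bytes` is a hypothetical helper name, not something in arrow-rs or DataFusion. In the actual code the offsets would presumably come from the array's `value_offsets()`; treat that as an assumption to verify against the arrow version in use.

```rust
/// Number of value bytes actually referenced by a (possibly sliced) string
/// array, given its logical offsets (one more entry than the number of rows).
/// Hypothetical helper for illustration only.
fn referenced_bytes(offsets: &[i32]) -> usize {
    match (offsets.first(), offsets.last()) {
        // Offsets are non-decreasing, so last - first is the byte span in use.
        (Some(first), Some(last)) => (last - first) as usize,
        // No offsets at all: nothing is referenced.
        _ => 0,
    }
}

fn main() {
    // Backing values buffer holding four strings: "aa", "bbb", "c", "dddd".
    let values: &[u8] = b"aabbbcdddd";
    let all_offsets = [0i32, 2, 5, 6, 10];

    // A slice covering rows 1 and 2 ("bbb", "c") sees offsets [2, 5, 6],
    // but still shares the whole values buffer.
    let sliced_offsets = &all_offsets[1..4];

    println!("whole buffer: {} bytes", values.len());
    println!("referenced by slice: {} bytes", referenced_bytes(sliced_offsets));
}
```

   Even with the offsets-based length, the value is only a capacity hint: the output can be larger or smaller than the input depending on the `from` and `to` strings, so the builder may still need to grow.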



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

