Copilot commented on code in PR #19558:
URL: https://github.com/apache/datafusion/pull/19558#discussion_r2652619305


##########
datafusion/functions/src/unicode/rpad.rs:
##########
@@ -268,18 +272,20 @@ where
                                     );
                                 }
                                 let length = if length < 0 { 0 } else { length 
as usize };
-                                let graphemes =
-                                    
string.graphemes(true).collect::<Vec<&str>>();
+                                // Reuse buffer by clearing and refilling
+                                graphemes_buf.clear();
+                                graphemes_buf.extend(string.graphemes(true));
 
-                                if length < graphemes.len() {
-                                    
builder.append_value(graphemes[..length].concat());
+                                if length < graphemes_buf.len() {
+                                    builder
+                                        
.append_value(graphemes_buf[..length].concat());
                                 } else if fill.is_empty() {
                                     builder.append_value(string);
                                 } else {
                                     builder.write_str(string)?;
                                     fill.chars()
                                         .cycle()
-                                        .take(length - graphemes.len())
+                                        .take(length - graphemes_buf.len())
                                         .for_each(|ch| 
builder.write_char(ch).unwrap());

Review Comment:
   The fill characters should be pre-collected into a reusable buffer similar 
to how it's done in lpad.rs. Currently, fill.chars() is being called on every 
row which allocates a new iterator each time. This misses an optimization 
opportunity mentioned in the PR description: "For lpad with fill parameter: 
Eliminates 2 Vec allocations per row (graphemes + fill_chars)". The rpad 
function should also benefit from the same fill_chars_buf optimization by 
declaring a fill_chars_buf at line 219 and reusing it by clearing and extending 
with fill.chars() before this usage.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to