lriggs commented on code in PR #50187:
URL: https://github.com/apache/arrow/pull/50187#discussion_r3438187179


##########
cpp/src/gandiva/precompiled/string_ops.cc:
##########
@@ -1946,9 +1914,33 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, 
const char* text,
                                    gdv_int32 text_len, const char* from_str,
                                    gdv_int32 from_str_len, const char* to_str,
                                    gdv_int32 to_str_len, gdv_int32* out_len) {
+  // Count non-overlapping matches to size the output buffer exactly, so large
+  // results are not capped by an arbitrary limit.
+  gdv_int64 num_matches = 0;

Review Comment:
   old vs new is measured without a separate build: the old wrapper was 
literally replace_with_max_len_utf8_utf8_utf8(…, 65535, …), and that inner 
function is unchanged. So new = replace_utf8_utf8_utf8 (counting scan + exact 
alloc) and old = replace_with_max_len_utf8_utf8_utf8 given an adequately-sized 
buffer (the pre-change algorithm, no scan). The delta is purely the O(n) scan.
   
   Results 
   
   
   | case                |           text_len  | matches |  new ns/call | old 
ns/call  |   delta | 
   |----|------|-----|-----|----------|-----------|
   |small/dense  |expand a->ab        |   256    |   256     |  3013.5   |    
2250.3  |  +33.9%|
   |small/sparse |expand a->ab     |      256     |    4    |   1738.6    |    
922.1  |  +88.5%|
   |medium/dense | expand a->ab   |     65536  |   65536   |  753132.4 |    
573520.9  |  +31.3%|
   |medium/sparse |expand a->ab  |      65536   |   1024  |   436435.6   |  
224018.2  |  +94.8%|
   |large/dense | expand a->ab   |    4194304  | 4194304  | 52924267.0  | 
40737654.2  |  +29.9%|
   |large/sparse expand |a->ab   |    4194304   |  65536  | 28408644.3 |  
14630748.5  |  +94.2%|
   |large/dense  |shrink ab->a   |    4194304    |     0  | 12771361.1 |  
12659342.5   |  +0.9%|
   
   Interpretation:
   
   The scan's relative cost is consistent across sizes (~30% dense, ~90% 
sparse), as expected for an O(n) pass — it scales with input length, not match 
count.
   Sparse is the worst case (+90%): with few matches, the old replace pass is 
mostly cheap byte-copying, so adding a full second comparison pass nearly 
doubles the work. Dense (+30%): the replace pass already does 
expansion-copying, so the scan is a smaller fraction.
   Shrink path: +0.9% — confirms the refinement works; when to ≤ from the scan 
is skipped and there's no measurable overhead.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to