abtom87 commented on PR #49813:
URL: https://github.com/apache/arrow/pull/49813#issuecomment-4428355408

   > ## Pull request overview
   > 
   > Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
   > Comments suppressed due to low confidence (1)
   > 
   > **cpp/src/gandiva/gdv_string_function_stubs.cc:782**
   > 
   >     * In the multibyte branch, the loops advance by `len_char_in` /
   >       `len_char_from` computed via `gdv_fn_utf8_char_length()`, which
   >       returns 0 for invalid leading bytes. If an input contains an invalid
   >       byte >127 (e.g., a continuation byte 0x80), `len_char_in` can be 0
   >       and `in_for += len_char_in` will never progress, causing an infinite
   >       loop (DoS). Add checks for `len_char_* <= 0` (and for
   >       `start_compare` increments) to set an invalid-UTF8 error and abort
   >       safely.
   > 
   > 
   > ```
   >     for (int32_t in_for = 0; in_for < in_len; in_for += len_char_in) {
   >       // Updating len to char in this position
   >       len_char_in = gdv_fn_utf8_char_length(in[in_for]);
   >       // Making copy to std::string with length for this char position
   >       std::string insert_copy_key(in + in_for, len_char_in);
   > ```
   
   @kou I believe the issue raised here was not introduced by this change. It
was already present in the existing code.
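
For reference, here is a minimal standalone sketch of the kind of guard the suggestion describes. It is not the Gandiva code: `utf8_char_length` is a hypothetical stand-in for `gdv_fn_utf8_char_length()` that only assumes the "returns 0 for an invalid leading byte" behaviour mentioned above, and `split_utf8_chars` is a made-up helper that reports failure through its return value rather than through Gandiva's error context.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for gdv_fn_utf8_char_length(): byte length of the
// UTF-8 sequence that starts with `c`, or 0 if `c` is not a valid lead byte.
static int32_t utf8_char_length(char c) {
  const auto b = static_cast<unsigned char>(c);
  if (b < 0x80) return 1;            // 0xxxxxxx: ASCII
  if ((b & 0xE0) == 0xC0) return 2;  // 110xxxxx
  if ((b & 0xF0) == 0xE0) return 3;  // 1110xxxx
  if ((b & 0xF8) == 0xF0) return 4;  // 11110xxx
  return 0;                          // continuation byte or invalid lead byte
}

// Hypothetical helper: copies each character of `in` into `out`.
// Returns false on an invalid lead byte or a truncated sequence, so the
// loop can never stall on a zero-length step or read past the buffer.
static bool split_utf8_chars(const char* in, int32_t in_len,
                             std::vector<std::string>* out) {
  out->clear();
  int32_t in_for = 0;
  while (in_for < in_len) {
    const int32_t len_char_in = utf8_char_length(in[in_for]);
    // Guard 1: a zero length would leave in_for unchanged forever.
    // Guard 2: the sequence must fit in the remaining bytes.
    if (len_char_in <= 0 || len_char_in > in_len - in_for) {
      return false;
    }
    out->emplace_back(in + in_for, len_char_in);
    in_for += len_char_in;
  }
  return true;
}
```

With this shape, an input such as `{'a', 0x80, 'b'}` is rejected at the second byte instead of spinning. The sketch only checks lead bytes and buffer bounds; it does not validate continuation bytes or reject overlong encodings, so it is not a full UTF-8 validator.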


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
