AlenkaF commented on issue #36388:
URL: https://github.com/apache/arrow/issues/36388#issuecomment-1774746421

   > It appears that in order to avoid overflow in this issue, the following
   > two scenarios need to be considered:
   >
   > 1. Initially, is the `repetition count` (e.g., `2**31 -1`) greater than
   > `int32_t` or less than `0`?
   > 2. Does the product of the `repeated value` (e.g., `?`) and `repetition
   > count` trigger `MultiplyWithOverflow`?
   
   The main thing I would check is that the value size multiplied by the
number of repetitions does not overflow (checked with `MultiplyWithOverflow`)
when creating the offsets buffer for the array with repeated values.
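
To illustrate the check described above, here is a minimal Python sketch of a checked multiplication. It is not Arrow's implementation: `multiply_with_overflow` and `total_data_bytes` are hypothetical names, and the sketch assumes an `int64_t` range and a "returns true on overflow" convention, which is how I understand the C++ helper to behave.

```python
from typing import Optional, Tuple

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def multiply_with_overflow(a: int, b: int) -> Tuple[bool, Optional[int]]:
    """Hypothetical stand-in for a checked int64 multiply.

    Returns (overflowed, product). Python integers never overflow,
    so the int64 range is checked explicitly.
    """
    product = a * b
    if product < INT64_MIN or product > INT64_MAX:
        return True, None
    return False, product


def total_data_bytes(value_size: int, repetitions: int) -> int:
    """Bytes needed to repeat a value, or raise if the product overflows."""
    if value_size < 0 or repetitions < 0:
        raise ValueError("value_size and repetitions must be non-negative")
    overflowed, total = multiply_with_overflow(value_size, repetitions)
    if overflowed:
        raise OverflowError("value_size * repetitions does not fit in int64")
    return total
```

With this sketch, `total_data_bytes(3, 4)` returns `12`, while a value size of `2**40` repeated `2**31 - 1` times raises `OverflowError` instead of silently wrapping around.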
   
   > Moreover, it seems possible to allocate the `repetition count` within the
   > `int64_t` range, but when calling the c++ function from python code, is
   > there a type conversion happening, leading the user to discover an error
   > related to `int32_t`?
   
   I do not think any conversion is happening. The `int64_t` range is the
limit when creating the offsets buffer, not a limit on the number of
repetitions. Restricting only the number of repetitions will not work, _I
think_. For example, if the string value is longer, the limit on the number
of repetitions will be even lower.
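
A small sketch of why a fixed cap on the repetition count is not enough: the largest safe count depends on the length of the repeated value. `max_repetitions` is a hypothetical helper, and using `INT32_MAX` as the offset limit is an assumption made only for illustration; which limit actually applies depends on the array type.

```python
INT32_MAX = 2**31 - 1  # assumed offset limit, for illustration only


def max_repetitions(value_length: int, offset_limit: int) -> int:
    """Largest count such that value_length * count <= offset_limit."""
    if value_length <= 0:
        raise ValueError("value_length must be positive")
    return offset_limit // value_length


# The longer the repeated value, the fewer repetitions fit under the limit.
for length in (1, 10, 1000):
    print(length, max_repetitions(length, INT32_MAX))
```

Under these assumptions, a 1-byte value allows 2147483647 repetitions, a 10-byte value 214748364, and a 1000-byte value only 2147483, so the check has to be made on the product rather than on the count alone.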


