AlenkaF commented on issue #36388: URL: https://github.com/apache/arrow/issues/36388#issuecomment-1774746421
> It appears that in order to avoid overflow in this issue, the following two scenarios need to be considered:
>
> 1. Initially, is the `repetition count` (e.g., `2**31 - 1`) greater than `int32_t` or less than `0`?
> 2. Does the product of the `repeated value` (e.g., `?`) and `repetition count` trigger `MultiplyWithOverflow`?

The main thing I would check is that the value size multiplied by the number of repetitions passes the `MultiplyWithOverflow` check when creating the offsets buffer for the array of repeated values.

> Moreover, it seems possible to allocate the `repetition count` within the `int64_t` range, but when calling the c++ function from python code, is there a type conversion happening, leading the user to discover an error related to `int32_t`?

I do not think there are any conversions happening. The `int64_t` range is the limit when creating the offsets buffer, not a limit on the number of repetitions.

Restricting the number of repetitions alone will not work, _I think_. For example, if the string value is longer, the limit on the number of repetitions will be even lower.
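To illustrate why a fixed cap on the repetition count is not enough, here is a minimal Python sketch of the check described above. It is not Arrow's implementation: the helper names `repeated_data_size` and `max_repetitions` are hypothetical, and the `limit` argument stands in for whatever maximum the offset type allows (the `int32_t` and `int64_t` maxima are both defined only for illustration).

```python
# Hypothetical helpers, not part of Arrow or PyArrow.
INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1


def repeated_data_size(value_size: int, repetitions: int, limit: int) -> int:
    """Return value_size * repetitions, raising if the product exceeds limit.

    Python integers do not overflow, so the comparison against `limit`
    plays the role that an overflow-checked multiply plays in C++.
    """
    if value_size < 0 or repetitions < 0:
        raise ValueError("value_size and repetitions must be non-negative")
    total = value_size * repetitions
    if total > limit:
        raise OverflowError(
            f"{value_size} bytes x {repetitions} repetitions exceeds {limit}"
        )
    return total


def max_repetitions(value_size: int, limit: int) -> int:
    """Largest repetition count whose total size still fits within limit."""
    if value_size <= 0:
        raise ValueError("value_size must be positive")
    return limit // value_size


# The allowed repetition count shrinks as the value gets longer.
one_byte_cap = max_repetitions(1, INT32_MAX)
four_byte_cap = max_repetitions(4, INT32_MAX)
```

With a 1-byte value the cap equals the limit itself, while a 4-byte value already cuts it to roughly a quarter, so the check has to be made on the product rather than on the repetition count alone.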