tachyonwill opened a new pull request, #13456: URL: https://github.com/apache/arrow/pull/13456
The precision calculation overflowed to infinity when the length of the fixed_len_byte_array was greater than 128 bytes, which triggered an error when the code then tried to convert infinity to an int32. We can simplify the logic by noting that log_b(a^x) = x * log_b(a), which avoids the intermediate infinity. We also added a check for extremely large value sizes that imply a max precision that cannot fit in an int32. Even a 129-byte decimal seems extreme.

The formula Parquet C++ was using is technically incorrect with respect to the Parquet specification. The specification says that the max precision is floor(log_10(2^(B*8 - 1) - 1)), whereas the C++ implementation omitted the outer -1. However, this is okay: 2^(B*8 - 1) is never a power of 10, so dropping the -1 cannot move the logarithm across an integer boundary, and the two values are always the same (ignoring the realities of FP arithmetic). In practice, all three formulas (the specification's, the previous C++ one, and the simplified one) agree through 128 bytes when using FP.

Bug found through fuzzing.
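To make the overflow concrete, here is a minimal sketch of the two approaches described above. The function names and the -1 sentinel are hypothetical and chosen for illustration only; this is not the code in the pull request, just one way the log identity and the int32 range check could look.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: the previous style of calculation. It materializes
// 2^(8*B - 1) as a double first. For B > 128 the exponent exceeds 1023,
// the power becomes +infinity, and so does the returned value.
double MaxPrecisionViaPow(int64_t num_bytes) {
  assert(num_bytes > 0);
  return std::floor(std::log10(std::pow(2.0, 8.0 * num_bytes - 1.0)));
}

// Hypothetical helper: the simplified calculation, using
// log10(2^k) == k * log10(2) so no huge intermediate is ever formed.
// Returns -1 (an illustrative sentinel) when the precision would not
// fit in an int32_t.
int32_t MaxPrecisionViaLog(int64_t num_bytes) {
  assert(num_bytes > 0);
  const double precision =
      std::floor(std::log10(2.0) * (8.0 * num_bytes - 1.0));
  if (precision > std::numeric_limits<int32_t>::max()) {
    return -1;
  }
  return static_cast<int32_t>(precision);
}
```

With these definitions, MaxPrecisionViaPow(129) is infinite while MaxPrecisionViaLog(129) stays finite, and the two agree for a 16-byte value (38 digits, the usual Decimal128 limit).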