viirya commented on PR #3690: URL: https://github.com/apache/arrow-rs/pull/3690#issuecomment-1456967742
> My understanding is this is trying to emulate the behaviour of Spark decimals, which store precision and scale per value. However, I'm not sure this PR actually achieves this, in the event of overflow this PR will truncate the precision of all values in the array, potentially truncating values that wouldn't have been truncated by Spark. I'm therefore not really sure what this PR gains us, it trades a behaviour change that results in an error, to a behaviour change that results in silent truncation? Am I missing something here? In Spark, decimal expression will be followed by an expression to round decimals. Spark analyzer does type coercion for decimal arithmetic operations and inserts such expression with required precision/scale. To emulate this too, we have a similar expression so that it produces decimals with same precision/scale as Spark. In Spark, precision/scale is stored in decimal objects. Once truncating happens, it only truncates what needed to be truncated. In the follow up expression, it will be adjusted to required precision/scale if possible. But as we just have i128 values, we cannot just truncate single value there as it results in inconsistent precision/scale. So it turns out to be current approach that truncates all values to same precision/scale at the end of this kernel. We have a similar expression as mentioned above that adjusts decimals to required precision/scale for the arithmetic operation. Alternative thing to do for this, I think, is to produce an internal struct of i128, precision and scale during the operation. So we don't need to truncate other values. The kernel also takes required precision/scale as parameters and adjusts decimal values accordingly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
