[ https://issues.apache.org/jira/browse/SPARK-40903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gengliang Wang resolved SPARK-40903.
------------------------------------
    Fix Version/s: 3.4.0
       Resolution: Fixed

Issue resolved by pull request 38379
[https://github.com/apache/spark/pull/38379]

> Avoid reordering decimal Add for canonicalization
> -------------------------------------------------
>
>                 Key: SPARK-40903
>                 URL: https://issues.apache.org/jira/browse/SPARK-40903
>             Project: Spark
>          Issue Type: Test
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Gengliang Wang
>            Assignee: Gengliang Wang
>            Priority: Major
>             Fix For: 3.4.0
>
>
> Avoid reordering the operands of Add during canonicalization if the
> expression is of decimal type.
> Expressions are canonicalized for comparisons and explanations. For a
> non-decimal Add expression, the operands can be sorted by hash code, and the
> result is supposed to be the same.
> However, for an Add expression of decimal type, the behavior is different:
> given decimal (p1, s1) and another decimal (p2, s2), the integral part of the
> result has `max(p1-s1, p2-s2) + 1` digits and the fractional part has
> `max(s1, s2)` digits. Thus the result data type is
> `(max(p1-s1, p2-s2) + 1 + max(s1, s2), max(s1, s2))`,
> and the order of the operands matters:
> * For `(decimal(12, 5) + decimal(12, 6)) + decimal(3, 2)`, the first add
> `decimal(12, 5) + decimal(12, 6)` results in `decimal(14, 6)`, and then
> `decimal(14, 6) + decimal(3, 2)` results in `decimal(15, 6)`
> * For `(decimal(12, 6) + decimal(3, 2)) + decimal(12, 5)`, the first add
> `decimal(12, 6) + decimal(3, 2)` results in `decimal(13, 6)`, and then
> `decimal(13, 6) + decimal(12, 5)` results in `decimal(14, 6)`
> Consider the following query:
> ```
> create table foo(a decimal(12, 5), b decimal(12, 6)) using orc;
> select sum(coalesce(a + b + 1.75, a)) from foo;
> ```
> At first, `coalesce(a + b + 1.75, a)` is resolved as
> `coalesce(a + b + 1.75, cast(a as decimal(15, 6)))`, where the literal `1.75`
> is typed as `decimal(3, 2)`. In the canonicalized version, the expression
> becomes `coalesce(1.75 + b + a, cast(a as decimal(15, 6)))`. As explained
> above, `1.75 + b + a` is of type `decimal(14, 6)`, which differs from
> `cast(a as decimal(15, 6))`. Thus the following error occurs:
> {code:java}
> java.lang.IllegalArgumentException: requirement failed: All input types must
> be the same except nullable, containsNull, valueContainsNull flags. The input
> types found are
>       DecimalType(14,6)
>       DecimalType(15,6)
>       at scala.Predef$.require(Predef.scala:281)
>       at
> org.apache.spark.sql.catalyst.expressions.ComplexTypeMergingExpression.dataTypeCheck(Expression.scala:1149)
>       at
> org.apache.spark.sql.catalyst.expressions.ComplexTypeMergingExpression.dataTypeCheck$(Expression.scala:1143)
> {code}
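
The type arithmetic quoted in this issue can be checked with a small standalone sketch. This is not Spark code: `add_result_type` and `left_assoc_add` are hypothetical helper names, decimals are modeled as plain `(precision, scale)` tuples, and the sketch assumes the rule exactly as stated in the issue text, ignoring Spark's cap on maximum decimal precision.

```python
from functools import reduce
from typing import Tuple

# A decimal type modeled as (precision, scale). Hypothetical stand-in,
# not Spark's DecimalType class.
DecimalT = Tuple[int, int]


def add_result_type(left: DecimalT, right: DecimalT) -> DecimalT:
    """Result type of left + right, per the rule quoted in the issue:
    (max(p1-s1, p2-s2) + 1 + max(s1, s2), max(s1, s2)).
    Assumption: no cap on the maximum precision is applied here.
    """
    p1, s1 = left
    p2, s2 = right
    scale = max(s1, s2)
    integral_digits = max(p1 - s1, p2 - s2) + 1
    return (integral_digits + scale, scale)


def left_assoc_add(*operands: DecimalT) -> DecimalT:
    """Type of ((x1 + x2) + x3) + ..., evaluated left to right."""
    return reduce(add_result_type, operands)


a = (12, 5)        # column a: decimal(12, 5)
b = (12, 6)        # column b: decimal(12, 6)
literal = (3, 2)   # the literal 1.75: decimal(3, 2)

original = left_assoc_add(a, b, literal)    # a + b + 1.75
reordered = left_assoc_add(literal, b, a)   # 1.75 + b + a

print(original)   # (15, 6)
print(reordered)  # (14, 6)
```

The two orderings yield different result types, which is the mismatch that `coalesce` rejects once one branch has been reordered and the other has already been cast to `decimal(15, 6)`.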