Good day,
We recently ran into an issue that we narrowed down to the following
reproduction: https://github.com/YevIgn/spark-case-issue
In short, the query sums the same three columns in two different orders
inside a CASE expression, first as (`1` + `2` + `3`) and then as
(`1` + `3` + `2`), and fails with the following exception:
: java.lang.IllegalStateException: Cannot update expression: (input[1, bigint, true] + input[2, bigint, true]) in map: Map() with use count: -1
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.updateExprInMap(EquivalentExpressions.scala:85)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.updateExprTree(EquivalentExpressions.scala:198)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.$anonfun$updateExprTree$1(EquivalentExpressions.scala:200)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.$anonfun$updateExprTree$1$adapted(EquivalentExpressions.scala:200)
The trace points to the following lines of code:
https://github.com/apache/spark/blob/c4a7588cbd5febc50054253da679198e741025a6/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala#L80
We found that using the same column order in both sums avoids the error,
as shown in the example at the link above. Am I correct that this is a
bug in how the first query is parsed?
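For readers who do not want to open the repository, here is a rough sketch of the two query shapes described above. It is a hypothetical reconstruction, not the actual query: the column names (`c1`, `c2`, `c3`), the table name `t`, the CASE condition and the helper `case_query` are all assumptions made for illustration, and the snippet only builds the SQL text without running it in Spark.

```python
# Hypothetical reconstruction of the query shape; the real query is in the
# linked repository. Column names, table name and the CASE condition are
# assumptions made for illustration only.

def case_query(first_order, second_order, table="t"):
    """Build a SELECT whose CASE sums the same columns in two orders."""
    first_sum = " + ".join(first_order)
    second_sum = " + ".join(second_order)
    return (
        f"SELECT CASE WHEN {first_sum} > 0 "
        f"THEN {second_sum} ELSE 0 END AS total FROM {table}"
    )

cols = ["c1", "c2", "c3"]

# Shape reported as failing: the two sums use different column orders.
mixed_order_sql = case_query(cols, ["c1", "c3", "c2"])

# Shape reported as working: both sums use the same column order.
same_order_sql = case_query(cols, cols)

print(mixed_order_sql)
print(same_order_sql)
```

Either string could then be passed to `spark.sql(...)` against a table with three bigint columns to compare the behaviour. Whether this simplified shape triggers the same exception has not been verified here.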
I am writing in case it is, so that the issue is visible and can
potentially be fixed in the future.
Best regards,
Evgenii Ignatev