Good day,

We recently ran into an issue that we traced to the following situation: https://github.com/YevIgn/spark-case-issue

Basically, the query in question sums the same three columns in two different orders within a CASE expression, first as (`1` + `2` + `3`) and then as (`1` + `3` + `2`), and fails with the following exception:

: java.lang.IllegalStateException: Cannot update expression: (input[1, bigint, true] + input[2, bigint, true]) in map: Map() with use count: -1
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.updateExprInMap(EquivalentExpressions.scala:85)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.updateExprTree(EquivalentExpressions.scala:198)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.$anonfun$updateExprTree$1(EquivalentExpressions.scala:200)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.$anonfun$updateExprTree$1$adapted(EquivalentExpressions.scala:200)

This points to the following lines of code: https://github.com/apache/spark/blob/c4a7588cbd5febc50054253da679198e741025a6/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala#L80

We found that using the same order in both sums avoids the error, as shown in the example at the link above. Am I correct that this is a bug in how the first query is parsed?

I am reporting it in case it is a bug, to make it visible and hopefully get it fixed in the future.

Best regards,
Evgenii Ignatev
