Query fails on CASE statement depending on order of summed columns

Evgenii Ignatev Wed, 22 Nov 2023 02:16:03 -0800

Good day,

Recently we faced an issue that we pinpointed to the following situation: https://github.com/YevIgn/spark-case-issue

Basically, the query in question sums three columns in two different orders (`1` + `2` + `3`, then `1` + `3` + `2`) inside a CASE expression, and it fails with the following exception:

: java.lang.IllegalStateException: Cannot update expression: (input[1, bigint, true] + input[2, bigint, true]) in map: Map() with use count: -1
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.updateExprInMap(EquivalentExpressions.scala:85)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.updateExprTree(EquivalentExpressions.scala:198)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.$anonfun$updateExprTree$1(EquivalentExpressions.scala:200)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.$anonfun$updateExprTree$1$adapted(EquivalentExpressions.scala:200)

This refers to the following lines of code: https://github.com/apache/spark/blob/c4a7588cbd5febc50054253da679198e741025a6/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala#L80

We found that using the same column order in both sums avoids the error, as shown in the example at the link above. Am I correct that this is a bug in how the first query is parsed?
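To illustrate the shape of the two queries (not the exact ones from the repository), here is a minimal sketch. The table name `t`, the column names `a`, `b`, `c`, the alias `total`, and the helper `case_query` are all hypothetical, and the placement of one sum in the WHEN condition and the other in the THEN branch is an assumption; the real query may arrange them differently.

```python
# Hypothetical names throughout: table `t` and columns `a`, `b`, `c` are
# placeholders, not the ones used in the linked repository.

def case_query(then_order):
    """Build a CASE query whose WHEN condition sums a + b + c and whose THEN
    branch sums the same columns in the order given by `then_order`."""
    when_sum = " + ".join(["a", "b", "c"])
    then_sum = " + ".join(then_order)
    return (
        f"SELECT CASE WHEN {when_sum} > 0 THEN {then_sum} ELSE 0 END AS total "
        "FROM t"
    )

# Shape described as failing: the two sums list the columns in different orders.
mismatched = case_query(["a", "c", "b"])

# Shape described as avoiding the error: both sums use the same order.
consistent = case_query(["a", "b", "c"])

print(mismatched)
print(consistent)
```

Given an existing SparkSession `spark` and a view registered as `t`, each string could be passed to `spark.sql(...)` to compare the two behaviours. Whether the exception actually appears may depend on the Spark version and on details of the real query that this sketch does not capture.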

I am writing about it in case it is a bug, to make it visible and potentially get it fixed in the future.


Best regards,
Evgenii Ignatev
