Bruce Robbins created SPARK-53538:
-------------------------------------

             Summary: Update with nondeterministic assignments can fail when 
whole-stage codegen is off
                 Key: SPARK-53538
                 URL: https://issues.apache.org/jira/browse/SPARK-53538
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Bruce Robbins


This test will fail if the "split-updates" property is set to "true":
{noformat}
  test("update with nondeterministic assignments and no wholestage codegen") {
    val extraColCount = SQLConf.get.wholeStageMaxNumFields - 4
    val schema = "pk INT NOT NULL, id INT, value DOUBLE, dep STRING, " +
      ((1 to extraColCount).map(i => s"col$i INT").mkString(", "))
    val data = (1 to 3).map { i =>
      s"""{ "pk": $i, "id": $i, "value": 2.0, "dep": "hr", """ +
        ((1 to extraColCount).map(j => s""""col$j": $i""").mkString(", ")) +
      "}"
    }.mkString("\n")
    createAndInitTable(schema, data)

    // rand() always generates values in [0, 1) range
    sql(s"UPDATE $tableNameAsString SET value = rand() WHERE id <= 2")

    checkAnswer(
      sql(s"SELECT count(*) FROM $tableNameAsString WHERE value < 2.0"),
      Row(2) :: Nil)
  }
{noformat}
The error is:
{noformat}
[info]   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 11.0 failed 1 times, most recent failure: Lost task 0.0 in stage 11.0 (TID 11) (10.0.0.101 executor driver): java.lang.NullPointerException: Cannot invoke "java.util.Random.nextDouble()" because "<parameter1>.mutableStateArray_0[0]" is null
[info]  at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_0_0$(Unknown Source)
[info]  at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
[info]  at org.apache.spark.sql.execution.ExpandExec$$anon$1.next(ExpandExec.scala:75)
...
{noformat}

{{RewriteUpdateTable}} creates an {{Expand}} operator with a set of 
projections, one of which contains a nondeterministic expression. With 
whole-stage codegen off, {{ExpandExec}} fails to initialize the derived 
{{UnsafeProjection}} instances before using them, resulting in the above 
error.
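
The failure mode can be illustrated with a minimal, Spark-free sketch. Everything in it ({{ToyRand}}, {{ToyProjection}}, {{ToyExpand}}) is a hypothetical stand-in invented for illustration, not Spark's actual classes or the actual fix. It only assumes what the stack trace suggests: the random generator is mutable state that stays null until the projection is initialized.

```scala
// Toy model only: these classes are hypothetical stand-ins, not Spark's API.
import java.util.Random

// Stand-in for a nondeterministic expression such as rand(): its generator is
// mutable state that must be set up per partition before evaluation.
final class ToyRand(seed: Long) {
  private var rng: Random = null

  def initialize(partitionIndex: Int): Unit = {
    rng = new Random(seed + partitionIndex)
  }

  // Throws NullPointerException if initialize() was never called.
  def eval(): Double = rng.nextDouble()
}

// Stand-in for a generated projection wrapping the expression.
final class ToyProjection(expr: ToyRand) {
  def initialize(partitionIndex: Int): Unit = expr.initialize(partitionIndex)

  // Appends the evaluated expression to the input row.
  def apply(row: Seq[Any]): Seq[Any] = row :+ expr.eval()
}

object ToyExpand {
  // Mirrors the reported behavior: projections are applied without
  // ever being initialized.
  def expandWithoutInit(
      rows: Iterator[Seq[Any]],
      projections: Seq[ToyProjection]): Iterator[Seq[Any]] =
    rows.flatMap(row => projections.map(p => p(row)))

  // One possible remedy (an assumption, not the committed fix): initialize
  // every projection with the partition index before applying it.
  def expandWithInit(
      partitionIndex: Int,
      rows: Iterator[Seq[Any]],
      projections: Seq[ToyProjection]): Iterator[Seq[Any]] = {
    projections.foreach(_.initialize(partitionIndex))
    rows.flatMap(row => projections.map(p => p(row)))
  }
}
```

In this sketch, consuming the iterator from {{expandWithoutInit}} throws a {{NullPointerException}} on the first row, while {{expandWithInit}} yields rows whose last field lies in [0, 1).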