Bruce Robbins created SPARK-53538:
-------------------------------------
Summary: Update with nondeterministic assignments can fail when
whole-stage codegen is off
Key: SPARK-53538
URL: https://issues.apache.org/jira/browse/SPARK-53538
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 4.0.0
Reporter: Bruce Robbins
This test will fail if the "split-updates" property is set to "true":
{noformat}
test("update with nondeterministic assignments and no wholestage codegen") {
  val extraColCount = SQLConf.get.wholeStageMaxNumFields - 4
  val schema = "pk INT NOT NULL, id INT, value DOUBLE, dep STRING, " +
    ((1 to extraColCount).map(i => s"col$i INT").mkString(", "))
  val data = (1 to 3).map { i =>
    s"""{ "pk": $i, "id": $i, "value": 2.0, "dep": "hr", """ +
      ((1 to extraColCount).map(j => s""""col$j": $i""").mkString(", ")) +
      "}"
  }.mkString("\n")
  createAndInitTable(schema, data)

  // rand() always generates values in [0, 1) range
  sql(s"UPDATE $tableNameAsString SET value = rand() WHERE id <= 2")

  checkAnswer(
    sql(s"SELECT count(*) FROM $tableNameAsString WHERE value < 2.0"),
    Row(2) :: Nil)
}
{noformat}
The error is:
{noformat}
[info] org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 11.0 failed 1 times, most recent failure: Lost task 0.0 in stage 11.0 (TID 11) (10.0.0.101 executor driver): java.lang.NullPointerException: Cannot invoke "java.util.Random.nextDouble()" because "<parameter1>.mutableStateArray_0[0]" is null
[info] at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_0_0$(Unknown Source)
[info] at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
[info] at org.apache.spark.sql.execution.ExpandExec$$anon$1.next(ExpandExec.scala:75)
...
{noformat}
{{RewriteUpdateTable}} creates an {{Expand}} operator with a set of
projections, one of which contains a nondeterministic expression (here,
{{rand()}}). {{ExpandExec}} does not initialize the {{UnsafeProjection}}
instances derived from those projections before using them, which results
in the above error.
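
The failure mode can be illustrated with a minimal plain-Scala sketch. This is a model of the pattern only, not Spark's code: {{RandProjection}} and {{ExpandInitSketch}} are hypothetical names, and the sketch simply assumes that a projection holding a {{java.util.Random}} creates it in an {{initialize(partitionIndex)}} step rather than in its constructor, so skipping that step leaves the state null.

```scala
import java.util.Random

// Hypothetical stand-in for a projection that contains rand().
// Its Random is created in initialize(), not in the constructor.
final class RandProjection(seed: Long) {
  private var rng: Random = null // stays null until initialize() runs

  def initialize(partitionIndex: Int): Unit = {
    rng = new Random(seed + partitionIndex)
  }

  // Pairs the input with a fresh random value; throws a
  // NullPointerException if initialize() was never called.
  def apply(input: Int): (Int, Double) = (input, rng.nextDouble())
}

object ExpandInitSketch {
  // Mimics the reported failure: the projection is used uninitialized.
  def withoutInit(rows: Seq[Int]): Seq[(Int, Double)] = {
    val proj = new RandProjection(42L)
    rows.map(r => proj(r))
  }

  // Initializes the projection with the partition index before first use.
  def withInit(partitionIndex: Int, rows: Seq[Int]): Seq[(Int, Double)] = {
    val proj = new RandProjection(42L)
    proj.initialize(partitionIndex)
    rows.map(r => proj(r))
  }
}
```

In this model, {{ExpandInitSketch.withoutInit}} throws a {{NullPointerException}} on the first row, while {{ExpandInitSketch.withInit}} returns values in the [0, 1) range. Whether the actual fix takes this shape inside {{ExpandExec}} is an assumption.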