[ https://issues.apache.org/jira/browse/SPARK-37779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17466391#comment-17466391 ]
Apache Spark commented on SPARK-37779:
--------------------------------------

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/35058

> Make ColumnarToRowExec plan canonicalizable after (de)serialization
> -------------------------------------------------------------------
>
>                 Key: SPARK-37779
>                 URL: https://issues.apache.org/jira/browse/SPARK-37779
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.3, 3.1.2, 3.2.0, 3.3.0
>            Reporter: Hyukjin Kwon
>            Priority: Minor
>
> SPARK-23731 made these plans serializable by turning the problematic fields
> into lazy vals, but SPARK-28213 introduced a new code path that forces the
> lazy val on a deserialized plan, which triggers a NullPointerException in
> https://github.com/apache/spark/blob/77b164aac9764049a4820064421ef82ec0bc14fb/sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala#L68
>
> This can fail during canonicalization, for example while eliminating common
> subexpressions:
> {code}
> java.lang.NullPointerException
>   at org.apache.spark.sql.execution.FileSourceScanExec.supportsColumnar$lzycompute(DataSourceScanExec.scala:280)
>   at org.apache.spark.sql.execution.FileSourceScanExec.supportsColumnar(DataSourceScanExec.scala:279)
>   at org.apache.spark.sql.execution.InputAdapter.supportsColumnar(WholeStageCodegenExec.scala:509)
>   at org.apache.spark.sql.execution.ColumnarToRowExec.<init>(Columnar.scala:67)
>   ...
>   at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:581)
>   at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:580)
>   at org.apache.spark.sql.execution.ScalarSubquery.canonicalized$lzycompute(subquery.scala:110)
>   ...
>   at org.apache.spark.sql.catalyst.expressions.ExpressionEquals.hashCode(EquivalentExpressions.scala:275)
>   ...
>   at scala.collection.mutable.HashTable.findEntry$(HashTable.scala:135)
>   at scala.collection.mutable.HashMap.findEntry(HashMap.scala:44)
>   at scala.collection.mutable.HashMap.get(HashMap.scala:74)
>   at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.addExpr(EquivalentExpressions.scala:46)
>   at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.addExprTreeHelper$1(EquivalentExpressions.scala:147)
>   at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.addExprTree(EquivalentExpressions.scala:170)
>   at org.apache.spark.sql.catalyst.expressions.SubExprEvaluationRuntime.$anonfun$proxyExpressions$1(SubExprEvaluationRuntime.scala:89)
>   at org.apache.spark.sql.catalyst.expressions.SubExprEvaluationRuntime.$anonfun$proxyExpressions$1$adapted(SubExprEvaluationRuntime.scala:89)
>   at scala.collection.immutable.List.foreach(List.scala:392)
> {code}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
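The failure mode described above can be sketched in isolation: a `lazy val` whose initializer dereferences a `@transient` field works fine before serialization, but on a deserialized copy the transient field is `null`, so the first access to the never-forced lazy val throws a `NullPointerException`. The classes below (`Plan`, `Child`, `LazyValNpeDemo`) are hypothetical stand-ins, not Spark's actual `ColumnarToRowExec`/`FileSourceScanExec`; this is a minimal sketch of the mechanism, assuming plain Java serialization as Spark task serialization uses.

{code}
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object LazyValNpeDemo {
  class Child extends Serializable {
    def supportsColumnar: Boolean = true
  }

  // Mirrors the pattern in the report: `child` is not serialized,
  // and the lazy val's initializer dereferences it on first access.
  class Plan(@transient val child: Child) extends Serializable {
    lazy val supportsColumnar: Boolean = child.supportsColumnar
  }

  // Serialize and deserialize an object with plain Java serialization.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    out.writeObject(obj)
    out.close()
    new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
      .readObject().asInstanceOf[T]
  }

  // The lazy val is never forced before serialization, so on the
  // deserialized copy its initializer runs with child == null.
  def npeAfterDeserialization(): Boolean = {
    val copy = roundTrip(new Plan(new Child))
    try { copy.supportsColumnar; false }
    catch { case _: NullPointerException => true }
  }
}
{code}

Before serialization, `new Plan(new Child).supportsColumnar` evaluates normally; only the deserialized copy fails, which is why the bug surfaces late, e.g. during canonicalization on an executor-side or re-used plan.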