[ https://issues.apache.org/jira/browse/SPARK-48921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kent Yao updated SPARK-48921:
-----------------------------
    Fix Version/s: 3.5.2
                       (was: 3.5.3)

> ScalaUDF in subquery should run through analyzer
> ------------------------------------------------
>
>                 Key: SPARK-48921
>                 URL: https://issues.apache.org/jira/browse/SPARK-48921
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 4.0.0, 3.5.1, 3.4.3
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0, 3.5.2
>
> We got a customer report that a `MergeInto` query on an Iceberg table worked
> before upgrading to Spark 3.4 but fails afterwards. The error looks like:
> ```
> Caused by: org.apache.spark.SparkRuntimeException: Error while decoding:
> org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to
> nullable on unresolved object
> upcast(getcolumnbyordinal(0, StringType), StringType, - root class:
> java.lang.String).toString.
> ```
> The source table of the `MergeInto` uses a `ScalaUDF`. The error occurs when
> Spark invokes the deserializer of the `ScalaUDF`'s input encoder while that
> deserializer is still unresolved.
> The encoders of a `ScalaUDF` are resolved by the rule `ResolveEncodersInUDF`,
> which is applied at the end of the analysis phase.
> While rewriting the `MergeInto` into a `ReplaceData` query, Spark creates an
> `Exists` subquery whose plan contains the `ScalaUDF`. Note that the
> `ScalaUDF` has already been resolved by the analyzer at this point.
> Then the `ResolveSubquery` rule, which resolves subqueries, only analyzes a
> subquery plan if it is not yet resolved. Because the subquery containing the
> `ScalaUDF` is already resolved, the rule skips it, so `ResolveEncodersInUDF`
> is never applied to it. As a result, the analyzed `ReplaceData` query
> contains a `ScalaUDF` with unresolved encoders, which causes the error.
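The skip logic described above can be illustrated with a small stand-alone sketch. This is a toy model, not Spark's actual classes: `ToyUdf`, `ToySubquery`, `resolveEncoders`, and `resolveSubquery` are all hypothetical stand-ins for `ScalaUDF`, the subquery plan, `ResolveEncodersInUDF`, and `ResolveSubquery` respectively.

```scala
// Toy sketch of the bug (hypothetical classes, not Spark's real API):
// a rule that only recurses into *unresolved* subquery plans never
// applies a late-running rule (encoder resolution) to a subquery that
// arrives already resolved.

final case class ToyUdf(var encoderResolved: Boolean = false)
final case class ToySubquery(udf: ToyUdf, alreadyResolved: Boolean)

// Stand-in for ResolveEncodersInUDF: normally runs at the end of analysis.
def resolveEncoders(sub: ToySubquery): Unit =
  sub.udf.encoderResolved = true

// Stand-in for ResolveSubquery: skips plans that report themselves resolved.
def resolveSubquery(sub: ToySubquery): Unit =
  if (!sub.alreadyResolved) resolveEncoders(sub)

// The MergeInto-to-ReplaceData rewrite injects a subquery whose UDF the
// analyzer already resolved, so the skip fires and the encoders are
// never resolved, surfacing later as the decoding error above.
val injected = ToySubquery(ToyUdf(), alreadyResolved = true)
resolveSubquery(injected)
println(injected.udf.encoderResolved) // false: encoders left unresolved
```

In this toy model the fix would be to run the encoder-resolution step on the subquery regardless of its resolved flag, which mirrors the intent of the issue title: the `ScalaUDF` in the subquery should still run through the analyzer.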
--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org