[ https://issues.apache.org/jira/browse/SPARK-56462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Szehon Ho resolved SPARK-56462.
-------------------------------
Fix Version/s: 4.2.0
Resolution: Fixed
Issue resolved by pull request 55329
[https://github.com/apache/spark/pull/55329]
> MERGE INTO schema evolution fails with UPDATE * / INSERT * when source has
> column names containing dots
> -------------------------------------------------------------------------------------------------------
>
> Key: SPARK-56462
> URL: https://issues.apache.org/jira/browse/SPARK-56462
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.1.1
> Reporter: Eric Yang
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.2.0
>
>
> When schemaEvolutionEnabled = true and a star action (UPDATE * or INSERT *)
> encounters a source column that does not exist in the target and whose name
> contains a dot (e.g. `a.b`), Analyzer.scala constructs
> UnresolvedAttribute(sourceAttr.name) using the apply() method. apply() parses
> "a.b" into nameParts = Seq("a", "b"), whereas the source AttributeReference
> returns Seq("a.b") via extractFieldPath. Because keyPath != valuePath,
> isSchemaEvolutionCandidate returns false, which blocks the schema evolution
> path entirely and produces an UNRESOLVED_COLUMN analysis error instead of
> adding the new column.
> The failure looks like the following:
> {code:java}
> [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column, variable, or function parameter
> with name `job`.`title` cannot be resolved. Did you mean one of the
> following? [`source`.`job.title`, `source`.`dep`, `source`.`pk`,
> `source`.`salary`, `cat`.`ns1`.`test_table`.`pk`]. SQLSTATE: 42703;
> 'MergeIntoTable (pk#13054 = pk#13050), [updateaction(None,
> assignment(pk#13054, pk#13050), assignment(salary#13055, salary#13051),
> assignment(dep#13056, dep#13052), assignment('job.title, job.title#13053),
> true)], [insertaction(None, assignment(pk#13054, pk#13050),
> assignment(salary#13055, salary#13051), assignment(dep#13056, dep#13052),
> assignment('job.title, job.title#13053))], true
> :- SubqueryAlias cat.ns1.test_table
> :  +- RelationV2[pk#13054, salary#13055, dep#13056] cat.ns1.test_table
> +- SubqueryAlias source
>    +- View (`source`, [pk#13050, salary#13051, dep#13052, job.title#13053])
>       +- Project [_1#13037 AS pk#13050, _2#13038 AS salary#13051, _3#13039 AS dep#13052, _4#13040 AS job.title#13053]
>          +- LocalRelation [_1#13037, _2#13038, _3#13039, _4#13040]
> {code}
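
The keyPath/valuePath mismatch described in the issue can be illustrated with a small, dependency-free Scala sketch. Everything in it is hypothetical and for illustration only: `parseMultipart` is a simplified stand-in for Spark's multipart-identifier parsing (it only splits on dots outside backticks), and `parsedKeyPath`, `quotedKeyPath`, and `valuePath` are made-up helper names, not Spark functions. It is not the change made in pull request 55329.

```scala
object DottedNameSketch {
  // Simplified stand-in for multipart-identifier parsing (an assumption, not
  // Spark's parser): split on dots that appear outside backticks.
  def parseMultipart(name: String): Seq[String] = {
    val parts = scala.collection.mutable.ArrayBuffer.empty[String]
    val current = new StringBuilder
    var inBackticks = false
    name.foreach {
      case '`' => inBackticks = !inBackticks
      case '.' if !inBackticks =>
        parts += current.toString
        current.clear()
      case c => current += c
    }
    parts += current.toString
    parts.toSeq
  }

  // Key path built by parsing the raw column name, as the issue describes for
  // UnresolvedAttribute(sourceAttr.name).
  def parsedKeyPath(columnName: String): Seq[String] = parseMultipart(columnName)

  // Key path that keeps the whole column name as a single part.
  def quotedKeyPath(columnName: String): Seq[String] = Seq(columnName)

  // Value path of the source attribute: one element holding the full name.
  def valuePath(columnName: String): Seq[String] = Seq(columnName)

  def main(args: Array[String]): Unit = {
    val name = "job.title"
    // Parsed: Seq("job", "title") vs Seq("job.title"), so the paths differ.
    println(parsedKeyPath(name) == valuePath(name)) // prints false
    // Single part: Seq("job.title") on both sides, so the paths match.
    println(quotedKeyPath(name) == valuePath(name)) // prints true
  }
}
```

In Spark itself, `UnresolvedAttribute.quoted(name)` and `UnresolvedAttribute(Seq(name))` both keep a name as a single part. Whether the fix uses either of them is not stated in this message; see the pull request for the actual change.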