cloud-fan opened a new pull request, #55871: URL: https://github.com/apache/spark/pull/55871
Followup to https://github.com/apache/spark/pull/54722.

### What changes were proposed in this pull request?

The grammar for INSERT ... REPLACE WHERE | ON unifies the two variants into `#insertIntoReplaceBooleanCond` and accepts a `tableAlias` for both, because REPLACE ON's condition can reference the target via the alias (e.g. `t.col`). The REPLACE WHERE branch in `AstBuilder` never reads `ctx.tableAlias()`, so an alias supplied to REPLACE WHERE is silently ignored. A query like

```sql
INSERT INTO t AS s REPLACE WHERE s.a = 1 SELECT * FROM source
```

parses successfully, then fails at analysis with a confusing "column s.a not found" because the underlying `UnresolvedRelation` was not wrapped with the alias.

This PR rejects the alias at parse time so users get a clear error pointing at the right place. The grammar stays unified (no rule split); the visitor adds a single guard before the WHERE branch's existing logic and throws a new `INSERT_REPLACE_WHERE_TABLE_ALIAS_NOT_ALLOWED` parse error that suggests REPLACE ON when an alias is needed.

### Why are the changes needed?

The current behavior (silently ignoring the alias and then failing at analysis) is misleading. Either the alias should be wired through (a semantic change requiring more invasive plumbing through `OverwriteByExpression`'s write resolution path) or it should be rejected. Rejecting it at parse time is the smaller, safer fix and matches the natural reading of the grammar: an alias only makes sense when the condition references the target via the alias, which is REPLACE ON's case, not REPLACE WHERE's.

### Does this PR introduce *any* user-facing change?

Yes. `INSERT INTO t AS s REPLACE WHERE …` now fails with `INSERT_REPLACE_WHERE_TABLE_ALIAS_NOT_ALLOWED` at parse time instead of silently dropping the alias and failing later (or, for queries whose WHERE doesn't reference the alias, silently producing the same plan as if the alias were absent).
The new error message suggests using REPLACE ON for cases that need the alias.

### How was this patch tested?

- Two existing `DDLParserSuite` tests (`insert table: REPLACE WHERE with tableAlias [and / without] BY NAME`) documented the silent-ignore behavior; they are rewritten to assert the new parse error.
- Verified the rewritten tests fail without the AstBuilder guard and pass with it.

### Was this patch authored or co-authored using generative AI tooling?

Yes, written with assistance from Claude.
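
For readers who want the shape of the parse-time guard described under "What changes were proposed", here is a minimal, self-contained sketch. It is not the actual `AstBuilder` code: `ParseException`, `ReplaceCondContext`, `ReplaceGuard`, and the message text are hypothetical stand-ins invented for illustration. Only the error name `INSERT_REPLACE_WHERE_TABLE_ALIAS_NOT_ALLOWED` comes from this PR.

```scala
// Hypothetical, simplified model of the guard; none of these types are Spark's.
final case class ParseException(errorClass: String, message: String)
  extends Exception(s"[$errorClass] $message")

sealed trait ReplaceKind
case object ReplaceWhere extends ReplaceKind
case object ReplaceOn extends ReplaceKind

// Stand-in for the unified grammar context: both variants may carry an alias.
final case class ReplaceCondContext(
    kind: ReplaceKind,
    table: String,
    tableAlias: Option[String],
    condition: String)

object ReplaceGuard {
  // Error name taken from the PR description.
  val ErrorClass = "INSERT_REPLACE_WHERE_TABLE_ALIAS_NOT_ALLOWED"

  // Returns the context unchanged when it is acceptable. Throws when
  // REPLACE WHERE is combined with a table alias, before any other handling.
  def validate(ctx: ReplaceCondContext): ReplaceCondContext = ctx match {
    case ReplaceCondContext(ReplaceWhere, _, Some(alias), _) =>
      // Illustrative wording only; the real message lives in Spark's error definitions.
      throw ParseException(
        ErrorClass,
        s"Table alias '$alias' is not allowed with REPLACE WHERE; " +
          "use REPLACE ON if an alias is needed.")
    case other => other
  }
}
```

In this model, REPLACE ON with an alias and REPLACE WHERE without one both pass through unchanged, while REPLACE WHERE with an alias fails before the condition is examined at all, which mirrors the "single guard before the WHERE branch's existing logic" design.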
