mikhailnik-db opened a new pull request, #55986: URL: https://github.com/apache/spark/pull/55986
### What changes were proposed in this pull request?

Skip the non-recursive `CTERelationRef` schema snapshot in `ResolveWithCTE` while the matching `CTERelationDef` still contains an unresolved `SQLFunctionExpression`. A subsequent fixed-point iteration retries the substitution once `ResolveSQLFunctions` has inlined the UDF body.

### Why are the changes needed?

`SQLFunctionExpression` hard-codes `nullable = true` but is `resolved` as soon as its inputs resolve. `CTERelationRef.output` is a `val` snapshot of `cteDef.output`, so capturing it before the UDF inlines freezes `nullable = true`.

For nested UDF calls like `wrap_int(non_null_one())`, the outer placeholder survives one analyzer iteration (`ResolveSQLFunctions` skips UDFs whose inputs themselves contain a `SQLFunctionExpression`). `ResolveWithCTE`, which runs later in the same batch, snapshots the still-incorrect output, and the `!ref.resolved` gate prevents a fix-up on the next iteration.

Single-level UDF cases inline fully in iter 1 and avoid the bug. Recursive CTEs are unaffected: they already force `withNullability(true)` by design.

### Does this PR introduce _any_ user-facing change?

Yes. CTE columns wrapping nested non-nullable SQL UDFs now report `nullable = false` instead of `nullable = true`. Row-level results are unchanged.

Before:

```sql
CREATE FUNCTION non_null_one() RETURNS INT RETURN 1;
CREATE FUNCTION wrap_int(x INT) RETURNS INT RETURN x;

WITH cte AS (SELECT wrap_int(non_null_one()) AS x)
SELECT * FROM cte;
-- x: int (nullable = true)
```

After:

```
-- x: int (nullable = false)
```

### How was this patch tested?

New regression test in `SQLFunctionSuite` (`SPARK-56945: CTE preserves non-nullable SQL UDF body in materialized schema`). Fails on master with `x: integer (nullable = true)`, passes with this PR. No SQL UDF golden file diffs.

### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude (Anthropic, Claude Code, Opus 4.7)
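
### Illustrative model of the snapshot ordering

The ordering problem described under "Why are the changes needed?" can be reproduced in a small toy model. The sketch below is plain Python, not Spark code: `Literal`, `UdfCall`, `inline_once`, `contains_udf` and `analyze` are hypothetical names invented for illustration, and the model assumes only the behaviour stated in the description (a placeholder that always reports `nullable = true`, an inlining rule that skips calls whose inputs still contain a placeholder, and a one-time snapshot of the CTE output).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """A constant; never null in this toy model."""
    value: int
    nullable: bool = False


@dataclass(frozen=True)
class UdfCall:
    """Stand-in for an un-inlined SQL UDF placeholder (always nullable=True)."""
    name: str
    arg: object = None
    nullable: bool = True


def contains_udf(expr):
    """True while the expression is still a placeholder."""
    return isinstance(expr, UdfCall)


def inline_once(expr):
    """One pass of the toy inlining rule.

    A call whose input is still a placeholder is skipped; only the inner
    call makes progress in that pass.
    """
    if not isinstance(expr, UdfCall):
        return expr
    if isinstance(expr.arg, UdfCall):
        return UdfCall(expr.name, inline_once(expr.arg))
    if expr.name == "non_null_one":
        return Literal(1)   # body: RETURN 1
    return expr.arg         # wrap_int body: RETURN x


def analyze(expr, guard, max_iters=10):
    """Run both toy rules to a fixed point and return the ref's nullability.

    guard=False models the old behaviour: snapshot as soon as possible.
    guard=True models the proposed skip while a placeholder remains.
    """
    ref_nullable = None  # None means the reference is not yet resolved
    for _ in range(max_iters):
        new_expr = inline_once(expr)
        # The snapshot is taken at most once, mirroring the `!ref.resolved` gate.
        if ref_nullable is None and not (guard and contains_udf(new_expr)):
            ref_nullable = new_expr.nullable
        if new_expr == expr and ref_nullable is not None:
            break
        expr = new_expr
    return ref_nullable


nested = UdfCall("wrap_int", UdfCall("non_null_one"))
print(analyze(nested, guard=False))  # snapshot taken while the outer call remains
print(analyze(nested, guard=True))   # snapshot deferred until fully inlined
```

In this model the nested call yields `True` without the guard and `False` with it, while a single-level call such as `UdfCall("non_null_one")` yields `False` either way, which matches the behaviour the description reports for the real analyzer.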
