sunchao opened a new pull request, #55838:
URL: https://github.com/apache/spark/pull/55838
### Why are the changes needed?
`NULLIF` builds its replacement expression before analysis has resolved all child expressions. For nested field references, the existing implementation can read the left operand's data type too early while constructing the null branch, which can fail analysis even though the SQL shape is valid.
SPARK-56840 tracks this analyzer failure.
### What changes were proposed in this PR?
- Build the `NULLIF` null branch with `Literal(null)` so its final type is assigned by the normal `If` type-coercion path instead of eagerly reading the unresolved left operand type.
- Add a regression test that exercises `nullif(c.provider, lower(...))` on a nested struct column for both `ALWAYS_INLINE_COMMON_EXPR` modes.
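
The difference between reading the operand type eagerly and deferring it to coercion can be sketched with a small toy model. This is not Spark's Catalyst code: `Expr`, `NullLiteral`, `UnresolvedError`, `build_null_branch_eager`, `build_null_branch_deferred`, and `coerce_if` are all hypothetical names invented for illustration, and the type strings are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


class UnresolvedError(Exception):
    """Toy analogue of an analysis failure on an unresolved expression."""


@dataclass
class Expr:
    # None means "not resolved yet"; a string such as "string" is the resolved type.
    resolved_type: Optional[str] = None

    @property
    def data_type(self) -> str:
        if self.resolved_type is None:
            raise UnresolvedError("data type requested before resolution")
        return self.resolved_type


@dataclass
class NullLiteral:
    # An untyped null carries a placeholder type until coercion assigns one.
    data_type: str = "null"


def build_null_branch_eager(left: Expr) -> NullLiteral:
    # Reads left.data_type at construction time, so it fails if left is unresolved.
    return NullLiteral(data_type=left.data_type)


def build_null_branch_deferred(left: Expr) -> NullLiteral:
    # Never touches left.data_type; the type is filled in later by coercion.
    return NullLiteral()


def coerce_if(null_branch: NullLiteral, other_branch: Expr) -> NullLiteral:
    # Toy stand-in for type coercion: runs after resolution and gives the
    # null branch the type of the other branch.
    return NullLiteral(data_type=other_branch.data_type)


if __name__ == "__main__":
    left = Expr()  # stands in for an unresolved nested field reference
    branch = build_null_branch_deferred(left)  # safe while left is unresolved
    left.resolved_type = "string"  # resolution happens later
    print(coerce_if(branch, left).data_type)  # prints "string"
```

In this sketch, calling `build_null_branch_eager` on the unresolved `left` raises `UnresolvedError`, while the deferred variant succeeds and only receives its concrete type once the operand has been resolved.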
### Does this PR introduce _any_ user-facing change?
Yes. Valid `NULLIF` expressions over unresolved nested field references, which could previously fail during analysis, now resolve and execute successfully.
### How was this patch tested?
- Unit tests:
  - `build/sbt 'sql/testOnly org.apache.spark.sql.DataFrameFunctionsSuite -- -z "nullif function"'`
  - `build/sbt 'sql/testOnly org.apache.spark.sql.ExplainSuite -- -z "explain for these functions"'`
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Codex (GPT-5)