sunchao opened a new pull request, #55926:
URL: https://github.com/apache/spark/pull/55926

   ### Why are the changes needed?
   
   `NULLIF` builds its replacement expression before analysis has resolved all child expressions.
   For nested field references, the existing implementation can read the left operand's data type
   too early while constructing the null branch, which can fail analysis even though the SQL shape
   is valid.
   
   SPARK-56840 tracks this analyzer failure.
   
   ### What changes were proposed in this PR?
   
   - Build the `NULLIF` null branch with a lazy typed-null placeholder so construction does not eagerly
     read the unresolved left operand type, while `NullIf.replacement.dataType` remains valid once the
     operand type is available.
   - Make that placeholder `RuntimeReplaceable`, so `ReplaceExpressions` restores an ordinary typed
     `Literal(null, ...)` before later optimizer rules run and existing null-literal simplifications
     continue to apply.
   - Add focused regressions for:
     - nested struct-field `nullif(c.provider, lower(...))` analysis in both
       `ALWAYS_INLINE_COMMON_EXPR` modes;
     - `NullIf` replacement type reporting before type coercion;
     - optimizer replacement back to a normal null literal;
     - explain output avoiding exposure of the internal helper name.
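
   The idea behind the lazy typed-null placeholder can be sketched with a toy expression model. This is only an illustration under stated assumptions: `Expr`, `FieldRef`, `NullLiteral`, and `LazyTypedNull` are hypothetical names invented for this sketch and are not Spark's Catalyst classes or the code in this PR. It shows why reading the operand type at construction time fails for an unresolved field, while deferring the read until the type is requested does not.

   ```scala
   // Toy model only: these types are hypothetical and are NOT Spark's Catalyst classes.
   sealed trait DataType
   case object StringType extends DataType
   case object IntType extends DataType

   trait Expr {
     def resolved: Boolean
     def dataType: DataType
   }

   // A field reference whose type is unknown until "analysis" resolves it.
   final class FieldRef(name: String) extends Expr {
     private var tpe: Option[DataType] = None
     def resolve(t: DataType): Unit = { tpe = Some(t) }
     def resolved: Boolean = tpe.isDefined
     def dataType: DataType =
       tpe.getOrElse(throw new IllegalStateException(s"unresolved field: $name"))
   }

   // An ordinary typed null literal: it needs the type at construction time.
   final case class NullLiteral(dataType: DataType) extends Expr {
     val resolved: Boolean = true
   }

   // Placeholder that defers reading the operand's type until it is asked for.
   final class LazyTypedNull(source: Expr) extends Expr {
     def resolved: Boolean = source.resolved
     def dataType: DataType = source.dataType // read on demand, not at construction
     // Loosely analogous to swapping the placeholder back to a plain null literal.
     def toLiteral: NullLiteral = NullLiteral(dataType)
   }

   object LazyNullDemo {
     def main(args: Array[String]): Unit = {
       val field = new FieldRef("c.provider")

       // Eager construction reads the type too early and fails.
       val eager = scala.util.Try(NullLiteral(field.dataType))
       println(s"eager construction failed: ${eager.isFailure}")

       // Lazy construction succeeds while the field is still unresolved.
       val placeholder = new LazyTypedNull(field)
       println(s"placeholder resolved before analysis: ${placeholder.resolved}")

       // Once the field is resolved, the placeholder can become a typed literal.
       field.resolve(StringType)
       println(s"literal after analysis: ${placeholder.toLiteral}")
     }
   }
   ```

   In this sketch the placeholder is converted back with an explicit `toLiteral` call; per the description above, the PR instead relies on `ReplaceExpressions` to perform the equivalent step.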
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. Valid `NULLIF` expressions over unresolved nested field references that previously could
   fail during analysis now resolve and execute successfully.
   
   ### How was this patch tested?
   
   - `build/sbt 'catalyst/testOnly org.apache.spark.sql.catalyst.expressions.NullExpressionsSuite -- -z "NullIf replacement preserves its data type before type coercion"'`
   - `build/sbt 'catalyst/testOnly org.apache.spark.sql.catalyst.optimizer.OptimizerSuite -- -z "NullIf typed null branch is replaced with a null literal"'`
   - `build/sbt 'sql/testOnly org.apache.spark.sql.DataFrameFunctionsSuite -- -z "nullif function"'`
   - `build/sbt 'sql/testOnly org.apache.spark.sql.ExplainSuite -- -z "explain for these functions; use range to avoid constant folding"'`
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Codex (GPT-5.5)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

