[PR] [SPARK-53482][SQL] MERGE INTO support nested case where source has less fields than target [spark]

via GitHub Wed, 03 Sep 2025 18:46:53 -0700


szehon-ho opened a new pull request, #52225:
URL: https://github.com/apache/spark/pull/52225


   
   
   ### What changes were proposed in this pull request?
   Support MERGE INTO where the source has fewer fields than the target.  This is 
already partially supported as part of 
https://github.com/apache/spark/pull/51698, but only for top-level fields.  
This PR extends that support to nested fields.
   
   The basic idea is:
   - For MERGE INTO with `UPDATE *` and `INSERT *`, 
https://github.com/apache/spark/pull/51698 already changed resolution to use the 
source table schema instead of the target table schema, for top-level fields.  
This change now resolves them to the flattened leaf fields of the source table schema.
   - Previously, INSERT did not allow specifying a leaf field.  This change adds 
that support, which the resolution above requires.  The logic is similar to that of UPDATE.
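
   As a rough illustration of what "flattened leaf fields" means here, the sketch 
below walks a nested schema and lists its leaf paths.  It is plain Python, not 
Spark's analyzer code: the dict-based schema representation, the `leaf_paths` 
helper, and the example column names are all hypothetical.

```python
# Hypothetical illustration only; this is not how Spark represents schemas.
# A schema is modelled as a dict: field name -> type name, or a nested dict
# for a struct.

def leaf_paths(schema, prefix=()):
    """Return the dotted paths of all leaf (non-struct) fields."""
    paths = []
    for name, typ in schema.items():
        path = prefix + (name,)
        if isinstance(typ, dict):
            # Nested struct: recurse and collect its leaves.
            paths.extend(leaf_paths(typ, path))
        else:
            paths.append(".".join(path))
    return paths


# Assumed example: the source struct `info` lacks the `age` field.
source_schema = {"id": "int", "info": {"name": "string"}}
target_schema = {"id": "int", "info": {"name": "string", "age": "int"}}

source_leaves = leaf_paths(source_schema)
target_leaves = leaf_paths(target_schema)

# Target leaves that the source cannot supply.
missing_in_source = [p for p in target_leaves if p not in source_leaves]
```

   Under these assumptions, `UPDATE *` and `INSERT *` would assign only the 
paths in `source_leaves`, which is why INSERT needs to accept a leaf field such 
as `info.name`.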
   
   
   ### Why are the changes needed?
   For cases where the source has fewer fields than the target in MERGE INTO, it 
should behave more gracefully (inserting null values where a source field does 
not exist).
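
   The intended outcome for an inserted row can be sketched as follows.  This is 
a plain-Python model, not Spark code: the `conform` helper and the dict-based 
rows and schema are hypothetical, and treating a wholly missing struct as a 
single null is an assumption of this sketch rather than something the PR states.

```python
# Hypothetical model of "insert null where the source field does not exist".
# Schemas are dicts (field name -> type name, or nested dict for a struct);
# rows are dicts with the same nesting.

def conform(row, target_schema):
    """Shape a source row to the target schema, filling gaps with None."""
    out = {}
    for name, typ in target_schema.items():
        value = row.get(name)
        if isinstance(typ, dict) and isinstance(value, dict):
            # Struct present in the source: fill its missing leaves.
            out[name] = conform(value, typ)
        else:
            # Leaf value, or a field absent from the source (None).
            out[name] = value
    return out


target_schema = {"id": "int", "info": {"name": "string", "age": "int"}}
source_row = {"id": 1, "info": {"name": "a"}}  # no info.age in the source

inserted = conform(source_row, target_schema)
```

   Here `inserted` keeps `id` and `info.name` from the source and carries `None` 
for `info.age`, the nested field the source does not provide.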
   
   
   ### Does this PR introduce _any_ user-facing change?
   No, only that this scenario used to fail and will now pass.
   
   
   ### How was this patch tested?
   Added a unit test to `MergeIntoTableSuiteBase`.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


