Szehon Ho created SPARK-54595:
---------------------------------

             Summary: Bind implicit schema coercion behavior with WITH SCHEMA 
EVOLUTION clause
                 Key: SPARK-54595
                 URL: https://issues.apache.org/jira/browse/SPARK-54595
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.1.0
            Reporter: Szehon Ho


While testing this feature, [~aokolnychyi] noted that as of Spark 4.1 the 
behavior of MERGE INTO with UPDATE * has changed even when the SCHEMA EVOLUTION 
clause is absent.

In particular:
* Source has fewer columns/nested fields than target => we fill with NULL or 
DEFAULT for inserts, and keep the existing value for updates (though this is 
disabled by default for nested structs as of SPARK-54525).
* Source has more columns/fields than target => we drop the extra fields.
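
To make the two cases above concrete, here is a minimal Python sketch that models the described coercion on plain dictionaries. It is not Spark code and does not reflect how Spark implements MERGE INTO; the function names (coerce_for_insert, coerce_for_update), the example columns, and the per-column defaults mapping are all hypothetical, and only top-level columns are handled (nested structs are ignored).

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def coerce_for_insert(
    source_row: Row,
    target_columns: List[str],
    defaults: Optional[Row] = None,
) -> Row:
    """Model of the INSERT * case (hypothetical helper, not a Spark API).

    Columns missing from the source are filled with the column DEFAULT if
    one is given, otherwise with None (standing in for NULL). Source columns
    that the target does not have are dropped.
    """
    defaults = defaults or {}
    return {
        col: source_row[col] if col in source_row else defaults.get(col)
        for col in target_columns
    }


def coerce_for_update(
    source_row: Row,
    existing_row: Row,
    target_columns: List[str],
) -> Row:
    """Model of the UPDATE * case (hypothetical helper, not a Spark API).

    Columns missing from the source keep the existing target value. Source
    columns that the target does not have are dropped.
    """
    return {
        col: source_row[col] if col in source_row else existing_row[col]
        for col in target_columns
    }


# Illustrative data: the target has three columns.
target_columns = ["id", "name", "score"]
existing = {"id": 1, "name": "a", "score": 10}

# Source with fewer columns than the target.
source_fewer = {"id": 1, "name": "b"}
updated = coerce_for_update(source_fewer, existing, target_columns)
inserted_null = coerce_for_insert(source_fewer, target_columns)
inserted_default = coerce_for_insert(source_fewer, target_columns, {"score": 0})

# Source with more columns than the target.
source_more = {"id": 2, "name": "c", "score": 5, "extra": "x"}
inserted_trimmed = coerce_for_insert(source_more, target_columns)
```

In this model, updated keeps score at its existing value, inserted_null has score set to None, inserted_default has score set to 0, and inserted_trimmed has no "extra" key. The silent drop in the last case is the kind of result that may surprise a user who did not opt in with the SCHEMA EVOLUTION clause.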

Initially, I thought it was a good improvement to MERGE INTO, but Anton has a 
good point that it may surprise some users.  So for now it may be better to be 
more conservative and keep the pre-4.1 behavior unchanged when the SCHEMA 
EVOLUTION clause is absent, and relax it later once there is more clarity.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
