[ https://issues.apache.org/jira/browse/SPARK-28495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang updated SPARK-28495:
-----------------------------------
    Issue Type: Sub-task  (was: Improvement)
        Parent: SPARK-28589

> Follow ANSI SQL on table insertion
> ----------------------------------
>
>                 Key: SPARK-28495
>                 URL: https://issues.apache.org/jira/browse/SPARK-28495
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Gengliang Wang
>            Priority: Major
>
> In Spark version 2.4 and earlier, when inserting into a table, Spark 
> casts the data types of the input query to the data types of the target 
> table by implicit coercion. This can be very confusing: e.g. if users 
> make a mistake and write string values to an int column, the values are 
> silently cast, and non-numeric strings become NULL instead of raising 
> an error.
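>
> A minimal sketch of that legacy behavior, as run in spark-shell (the 
> table name is illustrative):
>
>   spark.sql("CREATE TABLE t (i INT) USING parquet")
>   // The string is implicitly cast to int; since 'not a number' cannot
>   // be parsed, Spark 2.4 silently stores NULL instead of failing.
>   spark.sql("INSERT INTO t VALUES ('not a number')")
>   spark.sql("SELECT * FROM t").show()  // shows a single NULL row
>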
> In Data Source V2, by default only upcasting is allowed when inserting 
> data into a table: e.g. int -> long and int -> string are allowed, 
> while decimal -> double or long -> int are not. The UpCast rules were 
> originally created for Dataset type coercion; they are quite strict and 
> differ from the behavior of all existing popular DBMSs. This is a 
> breaking change, and it could hurt some Spark users after the 3.0 
> release.
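>
> The strictness of UpCast can be seen through the Dataset API, where the 
> rules originated (a sketch, run in spark-shell with the default 
> implicits in scope):
>
>   // spark.range produces a bigint (Long) column, so keeping or
>   // widening the type is fine, while narrowing to Int is rejected
>   // at analysis time.
>   spark.range(1).as[Long]  // OK: id is already bigint
>   spark.range(1).as[Int]   // throws AnalysisException: cannot up cast
>                            // id from bigint to int
>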
> This ticket proposes following the ANSI SQL rules for store assignment 
> (section 9.2 of the standard) instead. There are two significant 
> differences from UpCast:
> 1. Any numeric type can be assigned to another numeric type.
> 2. TimestampType can be assigned to DateType.
> The new behavior is consistent with PostgreSQL. It is more explainable 
> and acceptable than using UpCast.
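>
> A sketch of an insert that UpCast rejects but the proposed ANSI store 
> assignment policy would accept (the table name is illustrative):
>
>   spark.sql("CREATE TABLE tgt (d DOUBLE) USING parquet")
>   // decimal -> double is not an upcast, but ANSI store assignment
>   // allows it, because any numeric type may be assigned to another
>   // numeric type.
>   spark.sql("INSERT INTO tgt SELECT CAST(1.23 AS DECIMAL(10, 2))")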



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
