Gengliang Wang updated SPARK-28495:
-----------------------------------
    Issue Type: Sub-task  (was: Improvement)
        Parent: SPARK-28589

> Follow ANSI SQL on table insertion
> ----------------------------------
>
>                 Key: SPARK-28495
>                 URL: https://issues.apache.org/jira/browse/SPARK-28495
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Gengliang Wang
>            Priority: Major
>
> In Spark version 2.4 and earlier, when inserting into a table, Spark casts the data type of the input query to the data type of the target table by type coercion. This can be very confusing, e.g. when users make a mistake and write string values to an int column.
> In Data Source V2, by default, only upcasting is allowed when inserting data into a table. E.g. int -> long and int -> string are allowed, while decimal -> double or long -> int are not. The rules of UpCast were originally created for Dataset type coercion; they are quite strict and differ from the behavior of all existing popular DBMSs. This is a breaking change, and it may hurt some Spark users after the 3.0 release.
> This PR proposes that we instead follow the rules of store assignment (section 9.2) in ANSI SQL. Two significant differences from UpCast:
> 1. Any numeric type can be assigned to another numeric type.
> 2. TimestampType can be assigned DateType values.
> The new behavior is consistent with PostgreSQL. It is more explainable and acceptable than using UpCast.
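
For illustration, a minimal Scala sketch of the intended insertion behavior. It assumes the spark.sql.storeAssignmentPolicy configuration used in Spark 3.0 to choose between the legacy, strict (UpCast), and ANSI policies; the table name and values are hypothetical.

{code:scala}
import org.apache.spark.sql.SparkSession

object StoreAssignmentSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ansi-store-assignment-sketch")
      .master("local[*]")
      .getOrCreate()

    // Follow ANSI SQL store assignment (section 9.2) rather than strict UpCast.
    spark.conf.set("spark.sql.storeAssignmentPolicy", "ANSI")

    spark.sql("CREATE TABLE t (i INT) USING parquet")

    // Allowed under ANSI store assignment: any numeric type can be assigned
    // to another numeric type, so long -> int is accepted (UpCast rejects it).
    spark.sql("INSERT INTO t SELECT CAST(1 AS BIGINT)")

    // Expected to fail analysis under ANSI store assignment: string -> int is
    // not a valid store assignment, so the user mistake is caught up front.
    // spark.sql("INSERT INTO t SELECT 'hello'")

    spark.sql("SELECT * FROM t").show()
    spark.stop()
  }
}
{code}

Under the 2.4 behavior, the commented-out statement would instead be silently coerced (typically producing a null int value), which is exactly the kind of confusing result this proposal aims to prevent.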