+1

Bests,
Dongjoon
On Thu, Oct 10, 2019 at 10:14 Ryan Blue <rb...@netflix.com.invalid> wrote:

> +1
>
> Thanks for fixing this!
>
> On Thu, Oct 10, 2019 at 6:30 AM Xiao Li <lix...@databricks.com> wrote:
>
>> +1
>>
>> On Thu, Oct 10, 2019 at 2:13 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>
>>> +1 (binding)
>>>
>>> On Thu, Oct 10, 2019 at 5:11 PM, Takeshi Yamamuro <linguin....@gmail.com> wrote:
>>>
>>>> Thanks for the great work, Gengliang!
>>>>
>>>> +1 for that.
>>>> As I said before, the behavior is pretty common in DBMSs, so the change
>>>> helps DBMS users.
>>>>
>>>> Bests,
>>>> Takeshi
>>>>
>>>> On Mon, Oct 7, 2019 at 5:24 PM Gengliang Wang <
>>>> gengliang.w...@databricks.com> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> I'd like to call for a new vote on SPARK-28885
>>>>> <https://issues.apache.org/jira/browse/SPARK-28885> "Follow ANSI
>>>>> store assignment rules in table insertion by default" after revising the
>>>>> ANSI store assignment policy (SPARK-29326
>>>>> <https://issues.apache.org/jira/browse/SPARK-29326>).
>>>>> When inserting a value into a column with a different data type,
>>>>> Spark performs type coercion. Currently, we support three policies for the
>>>>> store assignment rules: ANSI, legacy, and strict, which can be set via the
>>>>> option "spark.sql.storeAssignmentPolicy":
>>>>> 1. ANSI: Spark performs the store assignment as per ANSI SQL. In
>>>>> practice, the behavior is mostly the same as PostgreSQL's. It disallows
>>>>> certain unreasonable type conversions, such as converting `string` to `int`
>>>>> or `double` to `boolean`, and it throws a runtime exception if the value
>>>>> is out of range (overflow).
>>>>> 2. Legacy: Spark allows the store assignment as long as it is a valid
>>>>> `Cast`, which is very loose. E.g., converting either `string` to `int` or
>>>>> `double` to `boolean` is allowed. It is the current behavior in Spark 2.x,
>>>>> for compatibility with Hive. When inserting an out-of-range value into an
>>>>> integral field, the low-order bits of the value are inserted (the same as
>>>>> Java/Scala numeric type casting). For example, if 257 is inserted into a
>>>>> field of byte type, the result is 1.
>>>>> 3. Strict: Spark doesn't allow any possible precision loss or data
>>>>> truncation in store assignment; e.g., converting either `double` to `int`
>>>>> or `decimal` to `double` is not allowed. The rules were originally for the
>>>>> Dataset encoder. As far as I know, no mainstream DBMS uses this policy by
>>>>> default.
>>>>>
>>>>> Currently, the V1 data sources use the "Legacy" policy by default, while
>>>>> V2 uses "Strict". This proposal is to use the "ANSI" policy by default for
>>>>> both V1 and V2 in Spark 3.0.
>>>>>
>>>>> This vote is open until Friday (Oct. 11).
>>>>>
>>>>> [ ] +1: Accept the proposal
>>>>> [ ] +0
>>>>> [ ] -1: I don't think this is a good idea because ...
>>>>>
>>>>> Thank you!
>>>>>
>>>>> Gengliang
>>>>
>>>> --
>>>> ---
>>>> Takeshi Yamamuro
>
> --
> Ryan Blue
> Software Engineer
> Netflix
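For readers who want to try the three policies described in the proposal, here is a minimal sketch in Scala, assuming a Spark 3.0 session; the table name `t`, the Parquet format, and the local-mode session setup are illustrative, not part of the proposal:

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch, assuming Spark 3.0: a local session for experimenting
// with the store assignment policies.
val spark = SparkSession.builder()
  .appName("store-assignment-policy-demo")
  .master("local[*]")
  .getOrCreate()

// Valid values for the option are ANSI, LEGACY, and STRICT.
spark.conf.set("spark.sql.storeAssignmentPolicy", "ANSI")

// Hypothetical table with a narrow integral column (TINYINT is a byte).
spark.sql("CREATE TABLE t (b TINYINT) USING parquet")

// Inserting an out-of-range value (257 does not fit in TINYINT):
//   ANSI:   throws a runtime exception on overflow
//   LEGACY: keeps the low-order bits, so 1 is stored
//   STRICT: rejects the query at analysis time (possible precision loss)
spark.sql("INSERT INTO t VALUES (257)")
```

Under the same setup, switching the option to LEGACY or STRICT before the INSERT should reproduce the other two behaviors the proposal describes.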