+1, it's the wrapping on math overflows that does it for me.
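
(For anyone skimming the thread, a minimal sketch of that difference in
spark-shell; the exact exception and error class names may vary by Spark
version:)

    // With ANSI mode off (today's default), integer overflow wraps silently.
    spark.conf.set("spark.sql.ansi.enabled", false)
    spark.sql("SELECT 2147483647 + 1").show()  // prints -2147483648

    // With ANSI mode on, the same expression fails instead of wrapping.
    spark.conf.set("spark.sql.ansi.enabled", true)
    spark.sql("SELECT 2147483647 + 1").show()  // SparkArithmeticException (ARITHMETIC_OVERFLOW)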

On Apr 12, 2024, at 9:36 AM, huaxin gao <huaxin.ga...@gmail.com> wrote:


+1

On Thu, Apr 11, 2024 at 11:18 PM L. C. Hsieh 
<vii...@gmail.com> wrote:
+1

I believe ANSI mode is well developed after many releases. No doubt it
can be used.
Since it is very easy to disable it and restore the current behavior, I
guess the impact should be limited.
Do we know the possible impacts, i.e., what the major changes are
(e.g., what kinds of queries/expressions will fail)? We can
describe them in the release notes.
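
(For illustration, and not as an exhaustive list, these are the classic
expressions that start failing once `spark.sql.ansi.enabled=true`, sketched
in spark-shell; error class names may differ across versions:)

    spark.conf.set("spark.sql.ansi.enabled", true)

    // Invalid string-to-number casts fail instead of returning NULL.
    spark.sql("SELECT CAST('abc' AS INT)").show()  // CAST_INVALID_INPUT

    // Division by zero fails instead of returning NULL.
    spark.sql("SELECT 1 / 0").show()               // DIVIDE_BY_ZERO

    // Out-of-bounds array access fails instead of returning NULL.
    spark.sql("SELECT array(1, 2)[5]").show()      // INVALID_ARRAY_INDEX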

On Thu, Apr 11, 2024 at 10:29 PM Gengliang Wang 
<ltn...@gmail.com> wrote:
>
>
> +1, enabling Spark's ANSI SQL mode in version 4.0 will significantly enhance 
> data quality and integrity. I fully support this initiative.
>
> > In other words, the current Spark ANSI SQL implementation becomes the
> > default behavior that Spark SQL users face first, while
> > `spark.sql.ansi.enabled=false` remains available in the same way, without
> > losing any capability.
>
> BTW, the try_* functions and the SQL Error Attribution Framework will also be
> beneficial when migrating to ANSI SQL mode.
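>
> (A quick sketch of that migration aid, assuming Spark 3.3+ where both
> functions below are available:)
>
>     // The try_* variants return NULL on failure instead of raising an
>     // error, even with ANSI mode on, which eases incremental migration.
>     spark.conf.set("spark.sql.ansi.enabled", true)
>     spark.sql("SELECT try_divide(1, 0), try_add(2147483647, 1)").show()
>     // both columns come back NULL rather than failing the query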
>
>
> Gengliang
>
>
> On Thu, Apr 11, 2024 at 7:56 PM Dongjoon Hyun 
> <dongjoon.h...@gmail.com> wrote:
>>
>> Hi, All.
>>
>> Thanks to you, we've achieved many things and have ongoing SPIPs.
>> I believe it's time to scope Apache Spark 4.0.0 (SPARK-44111) more narrowly
>> by asking your opinions about Apache Spark's ANSI SQL mode.
>>
>>     https://issues.apache.org/jira/browse/SPARK-44111
>>     Prepare Apache Spark 4.0.0
>>
>> SPARK-44444 was proposed last year (on 15/Jul/23) as one of the desirable
>> items for 4.0.0 because it's a big behavior change.
>>
>>     https://issues.apache.org/jira/browse/SPARK-44444
>>     Use ANSI SQL mode by default
>>
>> Historically, spark.sql.ansi.enabled was added in Apache Spark 3.0.0 and has
>> been aiming to provide better Spark SQL compatibility with the standard.
>> We also have a daily CI job to protect the behavior.
>>
>>     https://github.com/apache/spark/actions/workflows/build_ansi.yml
>>
>> However, it's still gated behind the configuration and has several known issues, e.g.,
>>
>>     SPARK-41794 Reenable ANSI mode in test_connect_column
>>     SPARK-41547 Reenable ANSI mode in test_connect_functions
>>     SPARK-46374 Array Indexing is 1-based via ANSI SQL Standard
>>
>> To be clear, we know that many DBMSes have their own implementations of
>> the SQL standard, and they are not identical. Like them, SPARK-44444 aims
>> to enable only Spark's existing configuration, `spark.sql.ansi.enabled=true`.
>> There is nothing more to it than that.
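>>
>> (For clarity, a minimal sketch of the opt-out, using nothing beyond the
>> existing configuration:)
>>
>>     // Restore today's behavior per session in spark-shell...
>>     spark.conf.set("spark.sql.ansi.enabled", false)
>>     // ...or cluster-wide at submit time:
>>     //   spark-submit --conf spark.sql.ansi.enabled=false ...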
>>
>> In other words, the current Spark ANSI SQL implementation becomes the
>> default behavior that Spark SQL users face first, while
>> `spark.sql.ansi.enabled=false` remains available in the same way, without
>> losing any capability.
>>
>> If we don't want this change for some reason, we can simply exclude
>> SPARK-44444 from SPARK-44111 as part of the Apache Spark 4.0.0 preparation.
>> It's simply time to make a go/no-go decision on this item as part of the
>> overall planning for the Apache Spark 4.0.0 release. After 4.0.0, it's
>> unlikely that we will aim for this again for the next four years, until 2028.
>>
>> WDYT?
>>
>> Bests,
>> Dongjoon
