Jarno Rajala created SPARK-54654:
------------------------------------

             Summary: Dangerous automatic type coercion when 
spark.sql.ansi.enabled=true
                 Key: SPARK-54654
                 URL: https://issues.apache.org/jira/browse/SPARK-54654
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.5.7, 4.0.1
            Reporter: Jarno Rajala


The setting _*spark.sql.ansi.enabled=true*_ should enforce strict typing, 
disallowing any potentially unsafe implicit type conversions. Right now it 
doesn't.

As _*spark.sql.ansi.enabled=true*_ has been the default since Spark 4.0, I think this 
requires serious consideration and should be treated as a bug.

Consider the behaviour of the following queries when 
_*spark.sql.ansi.enabled=true*_:

{{SELECT 123='123';}}
{{SELECT 123='123X';}}

The first succeeds and returns _*true*_. The second fails with a 
hard error. The same issue exists with other operations, such as 
_*coalesce()*_.
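
For comparison, here is a minimal sketch of the implicit and explicit variants side by side. It assumes ANSI mode is on and that {{try_cast}} is available in the Spark version in use. The outcomes of the first two queries are the ones reported above; the third is not part of the original report and is shown only as a possible explicit alternative.

{code:sql}
-- Assumes ANSI mode, as in the report.
SET spark.sql.ansi.enabled=true;

-- Implicit coercion: the string operand appears to be cast to a number.
SELECT 123 = '123';    -- reported to return true
SELECT 123 = '123X';   -- reported to fail with a hard error

-- Explicit alternative (assumption: try_cast is available):
-- unparseable input becomes NULL instead of raising an error.
SELECT 123 = try_cast('123X' AS INT);
{code}

With the explicit form the intent is visible in the query text, whereas the implicit form depends entirely on the data at runtime.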

In complex setups, where data may come from many different sources and is 
passed through multiple tables, it can be hard to ensure strict typing 
everywhere, and numbers are likely to end up in string-typed columns 
unintentionally. Such typing errors may go uncaught for a considerable time until a 
catastrophic runtime type mismatch occurs. 
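
The scenario can be sketched as follows. The view name {{orders_raw}} and the column {{customer_id}} are hypothetical and serve only as an illustration. The failure of the last query is what the behaviour described above would imply; it has not been verified separately.

{code:sql}
-- Hypothetical data: numeric IDs that ended up in a STRING column.
CREATE OR REPLACE TEMPORARY VIEW orders_raw AS
SELECT * FROM VALUES ('1001'), ('1002') AS t(customer_id);

-- Works as long as every value happens to parse as a number.
SELECT count(*) FROM orders_raw WHERE customer_id = 1001;

-- Later, a single non-numeric value arrives from some upstream source.
CREATE OR REPLACE TEMPORARY VIEW orders_raw AS
SELECT * FROM VALUES ('1001'), ('1002'), ('N/A') AS t(customer_id);

-- The unchanged query is now expected to fail at runtime under ANSI mode.
SELECT count(*) FROM orders_raw WHERE customer_id = 1001;
{code}

Nothing in the query text changed between the working and the failing run, which is why such problems can stay hidden for a long time.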



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
