Hi all,

The default behaviour of Spark is to insert a null value for casts that fail, unless ANSI SQL mode is enabled, SPARK-30292 <https://issues.apache.org/jira/browse/SPARK-30292>.
Whilst I understand that this is a subset of ANSI-compliant behaviour, I don't understand why the feature is so tightly coupled to it. Enabling ANSI mode also comes with other consequences that fall outside casting behaviour, and not all Spark operations are done via the SQL interface (i.e. spark.sql("")). I can imagine it would be pretty useful to have something like an extra argument that raises an exception if a cast fails (e.g. *df.age.cast("int", raise=True)*) without enabling ANSI mode as a whole. Does anyone know why this approach was chosen, or have I missed something? Would others find something like this useful?

Thanks,
Yeachan
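To make the proposed semantics concrete, here is a plain-Python sketch (no Spark required); cast_to_int and raise_on_error are hypothetical names for illustration only, not any existing Spark API:

```python
def cast_to_int(value, raise_on_error=False):
    """Illustrate the two cast semantics discussed above.

    Default (raise_on_error=False): mimic Spark's current behaviour
    and return None (null) when the cast fails.
    Proposed (raise_on_error=True): raise instead of silently
    producing a null, without needing ANSI mode enabled globally.
    """
    try:
        return int(value)
    except (TypeError, ValueError):
        if raise_on_error:
            raise ValueError(f"Cannot cast {value!r} to int")
        return None  # current default: silent null


# Current default behaviour: failed cast becomes null
print(cast_to_int("42"))   # -> 42
print(cast_to_int("abc"))  # -> None

# Proposed opt-in behaviour: failed cast raises
try:
    cast_to_int("abc", raise_on_error=True)
except ValueError as e:
    print(e)
```

The point is that the strictness is scoped to the single cast call, rather than flipping a session-wide configuration that changes other behaviour as well.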