[ https://issues.apache.org/jira/browse/SPARK-32501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168129#comment-17168129 ]
Apache Spark commented on SPARK-32501:
--------------------------------------

User 'MaxGekk' has created a pull request for this issue:
https://github.com/apache/spark/pull/29311

> Inconsistent NULL conversions to strings
> -----------------------------------------
>
>                 Key: SPARK-32501
>                 URL: https://issues.apache.org/jira/browse/SPARK-32501
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Maxim Gekk
>            Priority: Major
>
> 1. It is impossible to distinguish empty string and null, for instance:
> {code:scala}
> scala> Seq(Seq(""), Seq(null)).toDF().show
> +-----+
> |value|
> +-----+
> |   []|
> |   []|
> +-----+
> {code}
> 2. Inconsistent NULL conversions for top-level values and nested columns, for instance:
> {code:scala}
> scala> sql("select named_struct('c', null), null").show
> +---------------------+----+
> |named_struct(c, NULL)|NULL|
> +---------------------+----+
> |                   []|null|
> +---------------------+----+
> {code}
> 3. `.show()` is different from conversions to Hive strings, and as a consequence its output is different from `spark-sql` (sql tests):
> {code:sql}
> spark-sql> select named_struct('c', null) as struct;
> {"c":null}
> {code}
> {code:scala}
> scala> sql("select named_struct('c', null) as struct").show
> +------+
> |struct|
> +------+
> |    []|
> +------+
> {code}
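
To illustrate the distinction the issue asks for, here is a minimal sketch in plain Scala with no Spark dependency. The `NullAwareRender` object and its `render` method are hypothetical names introduced only for this example; they are not Spark APIs and are not claimed to match the change in the pull request. The sketch shows one possible rule: render null as the literal text `null` at every nesting level, so that a nested null no longer looks like an empty string.

```scala
// Hypothetical helper (an assumption for illustration, not Spark code).
object NullAwareRender {
  // Convert a possibly nested value to a display string.
  def render(value: Any): String = value match {
    // Applied at every level, so top-level and nested nulls render alike.
    case null      => "null"
    // Render each element recursively and wrap the result in brackets.
    case s: Seq[_] => s.map(render).mkString("[", ", ", "]")
    // Any other value falls back to its own toString.
    case other     => other.toString
  }
}

object NullAwareRenderDemo extends App {
  println(NullAwareRender.render(Seq("")))    // empty-string element
  println(NullAwareRender.render(Seq(null)))  // null element
  println(NullAwareRender.render(null))       // top-level null
}
```

Under this rule `Seq("")` and `Seq(null)` produce different strings, and a null is rendered the same way whether it appears at the top level or inside a collection.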