Github user makagonov commented on a diff in the pull request:
https://github.com/apache/spark/pull/20884#discussion_r176842732
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/json/JacksonGeneratorSuite.scala
---
@@ -56,7 +56,7 @@ class JacksonGeneratorSuite extends SparkFunSuite {
val gen = new JacksonGenerator(dataType, writer, option)
gen.write(input)
gen.flush()
-assert(writer.toString === """[{}]""")
+assert(writer.toString === """[{"a":null}]""")
--- End diff --
@HyukjinKwon actually, it looks like the result should be `[null]` rather
than `[{}]`.
Look at the following repro from spark-shell (downloaded binaries):
```scala
scala> val df = sqlContext.sql(""" select array(cast(null as struct<k:string>)) as my_array""")
df: org.apache.spark.sql.DataFrame = [my_array: array<struct<k:string>>]
scala> df.printSchema
root
 |-- my_array: array (nullable = false)
 |    |-- element: struct (containsNull = true)
 |    |    |-- k: string (nullable = true)
scala> df.toJSON.collect().foreach(println)
{"my_array":[null]}
scala> df.select(to_json($"my_array")).collect().foreach(x => println(x(0)))
[null]
```
In an older version of `JacksonGenerator`, elements were filtered by value:
if an element was `null`, `gen.writeNull()` was called regardless of its type
([old implementation](https://github.com/apache/spark/blob/3258f27a881dfeb5ab8bae90c338603fa4b6f9d8/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonGenerator.scala#L41)).
Currently, however, we call `gen.writeStartObject()...gen.writeEndObject()`
whether or not the value is null.
I couldn't reproduce this with a query, but when `StructsToJson` is called from
this unit test, it goes through `JacksonGenerator.arrElementWriter`, which
contains these lines:
```scala
case st: StructType =>
  (arr: SpecializedGetters, i: Int) => {
    writeObject(writeFields(arr.getStruct(i, st.length), st,
      rootFieldWriters))
  }
```
As a result, it prints a JSON object even when the element is `null`.
I'll look into this later and try to find an easy workaround.
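For illustration only, here is a minimal, self-contained sketch of the difference between the two behaviours. It does not use Spark or Jackson: `ArrayWriterSketch`, `Struct`, `writeStruct`, `writeArrayAlwaysObject`, and `writeArrayNullAware` are hypothetical names, and a struct is modelled as a plain sequence of name/value pairs. In the real `arrElementWriter`, the analogous guard would presumably be a check such as `arr.isNullAt(i)` before calling `writeObject`, but that is an assumption, not a tested patch.

```scala
// Hypothetical, simplified model of the array element writer.
// This is NOT Spark's JacksonGenerator; it only illustrates the null check.
object ArrayWriterSketch {
  // A struct is modelled as (field name -> optional string value) pairs.
  type Struct = Seq[(String, Option[String])]

  // Wrap a string in double quotes, escaping backslashes and quotes.
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Write one struct as a JSON object; a missing field value becomes null.
  def writeStruct(st: Struct): String =
    st.map { case (k, v) => quote(k) + ":" + v.map(quote).getOrElse("null") }
      .mkString("{", ",", "}")

  // Behaviour described above: an object is always opened and closed,
  // so a null element is rendered as an empty object.
  def writeArrayAlwaysObject(arr: Seq[Option[Struct]]): String =
    arr.map(e => writeStruct(e.getOrElse(Seq.empty))).mkString("[", ",", "]")

  // Null-aware variant: a null element is written as a JSON null.
  def writeArrayNullAware(arr: Seq[Option[Struct]]): String =
    arr.map {
      case Some(st) => writeStruct(st)
      case None     => "null"
    }.mkString("[", ",", "]")
}
```

With a single null element, `ArrayWriterSketch.writeArrayAlwaysObject(Seq(None))` yields `[{}]`, while `ArrayWriterSketch.writeArrayNullAware(Seq(None))` yields `[null]`, matching the spark-shell output above.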