Github user makagonov commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20884#discussion_r176842732
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/json/JacksonGeneratorSuite.scala ---
    @@ -56,7 +56,7 @@ class JacksonGeneratorSuite extends SparkFunSuite {
         val gen = new JacksonGenerator(dataType, writer, option)
         gen.write(input)
         gen.flush()
    -    assert(writer.toString === """[{}]""")
    +    assert(writer.toString === """[{"a":null}]""")
    --- End diff --
    
    @HyukjinKwon actually, it looks like the result should be `[null]` rather than `[{}]`.
    Look at the following repro from spark-shell (downloaded binaries):
    ```scala
    scala> val df = sqlContext.sql(""" select array(cast(null as struct<k:string>)) as my_array""")
    df: org.apache.spark.sql.DataFrame = [my_array: array<struct<k:string>>]
    
    scala> df.printSchema
    root
     |-- my_array: array (nullable = false)
     |    |-- element: struct (containsNull = true)
     |    |    |-- k: string (nullable = true)
    scala> df.toJSON.collect().foreach(println)
    {"my_array":[null]}
    scala> df.select(to_json($"my_array")).collect().foreach(x => println(x(0)))
    [null]
    ```
    
    In an older version of `JacksonGenerator`, we filtered on the element value: if it was `null`, `gen.writeNull()` was called regardless of the type ([old implementation](https://github.com/apache/spark/blob/3258f27a881dfeb5ab8bae90c338603fa4b6f9d8/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JacksonGenerator.scala#L41)). Currently, however, we call `gen.writeStartObject()...gen.writeEndObject()` whether or not the value is null.
    
    I couldn't repro this with a query, but when `StructsToJson` is called from this unit test, it goes through `JacksonGenerator.arrElementWriter`, which has the lines
    ```scala
    case st: StructType =>
      (arr: SpecializedGetters, i: Int) => {
        writeObject(writeFields(arr.getStruct(i, st.length), st, rootFieldWriters))
      }
    ```
    that make it print a JSON object even when the element is `null`.
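
    To make the difference concrete, here is a minimal, self-contained sketch of the two behaviours. It is not Spark's code: `NullAwareArraySketch`, `Struct`, and all three functions are hypothetical names, a null element is modelled as `None`, and the JSON is built with plain strings instead of a Jackson generator.

    ```scala
    object NullAwareArraySketch {
      // Hypothetical model: a struct is a list of (field name, optional string value).
      type Struct = Seq[(String, Option[String])]

      // Renders one struct as a JSON object (no string escaping; illustration only).
      def writeStruct(fields: Struct): String =
        fields.map { case (name, value) =>
          val rendered = value.map(v => "\"" + v + "\"").getOrElse("null")
          "\"" + name + "\":" + rendered
        }.mkString("{", ",", "}")

      // Models the behaviour discussed in this comment: an object is emitted
      // for every element, including a null one.
      def writeArrayAlwaysObject(elements: Seq[Option[Struct]]): String =
        elements.map(e => writeStruct(e.getOrElse(Seq.empty: Struct))).mkString("[", ",", "]")

      // Null-aware variant: check the element first and emit a JSON null for it.
      def writeArrayNullAware(elements: Seq[Option[Struct]]): String =
        elements.map {
          case Some(fields) => writeStruct(fields)
          case None         => "null"
        }.mkString("[", ",", "]")
    }
    ```

    With a single null element, `writeArrayAlwaysObject(Seq(None))` yields `[{}]` while `writeArrayNullAware(Seq(None))` yields `[null]`. In the real generator the equivalent change would presumably be a null check on the array element before `writeObject`, falling back to `gen.writeNull()`; that is an assumption, not a tested fix.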
    
    I'll look into this later and try to find an easy workaround.

