Github user gengliangwang commented on the issue:

    https://github.com/apache/spark/pull/22237

Hi @MaxGekk , I just reviewed this PR. I noticed that there is one behavior change: the column value of `from_json(corrupt_record...)` becomes `Row(null, null, ...)` instead of `null`.

```
val df = Seq("""{"a" 1, "b": 2}""").toDS()
val schema = new StructType().add("a", IntegerType).add("b", IntegerType)
```

Before the code change:

```
scala> df.select(from_json($"value", schema).as("col")).where("col is null").show()
+----+
| col|
+----+
|null|
+----+

scala> df.select(from_json($"value", schema).as("col")).where("col.a is null").show()
+----+
| col|
+----+
|null|
+----+
```

After the code change:

```
scala> df.select(from_json($"value", schema).as("col")).where("col is null").show()
+---+
|col|
+---+
+---+

scala> df.select(from_json($"value", schema).as("col")).where("col.a is null").show()
+---+
|col|
+---+
|[,]|
+---+
```

The main difference is that we can no longer filter out the null `col` rows in the result column. Is there any reason for changing this?
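As a hedged sketch (not from the PR itself): under the new behavior, one workaround for isolating malformed records is to test every struct field for null, assuming a corrupt record parses every field to `null`. Note the caveat that a valid record whose fields are all genuinely `null` would match as well.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{IntegerType, StructType}

val spark = SparkSession.builder()
  .master("local[1]")
  .appName("from_json-null-check")
  .getOrCreate()
import spark.implicits._

// One corrupt record (missing colon after "a") and one valid record.
val df = Seq("""{"a" 1, "b": 2}""", """{"a": 1, "b": 2}""").toDS()
val schema = new StructType().add("a", IntegerType).add("b", IntegerType)

val parsed = df.select(from_json($"value", schema).as("col"))

// With the new behavior, `col is null` matches nothing, so we check
// every field of the struct instead.
val malformed = parsed.where("col.a is null and col.b is null")
malformed.show()

spark.stop()
```

This keeps the corrupt rows selectable, but unlike the old `col is null` check it cannot distinguish a malformed record from a valid all-null record.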