Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21371#discussion_r189454251

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonInferSchema.scala ---
    @@ -66,8 +69,12 @@ private[sql] object JsonInferSchema {
     s"Parse Mode: ${FailFastMode.name}.", e)
     }
     }
    - }
    - }.fold(StructType(Nil))(
    + }.fold(StructType(Nil))(
    + compatibleRootType(columnNameOfCorruptRecord, parseMode))
    + Iterator(typeInPartition)
    + }.collect()
    --- End diff --

    > good catch! but wondering how the test passed in my PR...

    The test is somewhat flaky. If all the types are already folded on the executor side, the local fold after `collect()` only merges `StructType()` with `StructType(StructField("id"), StructField("ID"))`, so you still get the current schema back. But if you are unlucky and one partition contains only the `id` column, the local fold has to merge `StructType(StructField("id"))` and `StructType(StructField("ID"))`. That is when the problem occurs.
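
To make the two cases concrete, here is a minimal sketch in plain Scala with no Spark dependency. It only records which pairs of schemas the local fold ends up merging for a given partition layout. Everything in it is an assumption for illustration: `Schema` is a hypothetical stand-in for `StructType` that keeps just the field names, and `merge` is a hypothetical stand-in for `compatibleRootType(columnNameOfCorruptRecord, parseMode)` that takes a plain union. It does not reproduce the failure itself, only the inputs that lead to it.

```scala
object FoldSketch {
  // Hypothetical stand-in for StructType: only the field names, in order.
  type Schema = Vector[String]

  // Hypothetical stand-in for the real merge function: an order-preserving union.
  def merge(a: Schema, b: Schema): Schema = (a ++ b).distinct

  // Fold every partition on its own (the "executor side"), then fold the
  // per-partition results (the "local" side), recording each pair of schemas
  // that the local merge receives.
  def driverMergeInputs(partitions: Seq[Seq[Schema]]): Seq[(Schema, Schema)] = {
    val perPartition = partitions.map(_.foldLeft(Vector.empty[String])(merge))
    val seen = scala.collection.mutable.ArrayBuffer.empty[(Schema, Schema)]
    perPartition.foldLeft(Vector.empty[String]) { (acc, schema) =>
      seen += ((acc, schema))
      merge(acc, schema)
    }
    seen.toSeq
  }

  def main(args: Array[String]): Unit = {
    // Both columns live in one partition: the local fold only merges
    // the empty schema with (id, ID).
    println(driverMergeInputs(Seq(Seq(Vector("id"), Vector("ID")))))

    // One partition has only `id`, the other only `ID`: the local fold
    // now has to merge (id) with (ID).
    println(driverMergeInputs(Seq(Seq(Vector("id")), Seq(Vector("ID")))))
  }
}
```

Whether a test run hits the second layout depends on how the input rows are split across partitions, which would be consistent with the test passing only some of the time.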