[GitHub] spark pull request #21371: [SPARK-24250][SQL][FollowUp] Fix compile error an...

viirya Sun, 20 May 2018 01:28:01 -0700

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21371#discussion_r189454251
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonInferSchema.scala ---
    @@ -66,8 +69,12 @@ private[sql] object JsonInferSchema {
                     s"Parse Mode: ${FailFastMode.name}.", e)
               }
             }
    -      }
    -    }.fold(StructType(Nil))(
    +      }.fold(StructType(Nil))(
    +        compatibleRootType(columnNameOfCorruptRecord, parseMode))
    +      Iterator(typeInPartition)
    +    }.collect()
    --- End diff --
    
    > good catch! but wondering how the test passed in my PR...
    
    The test is flaky rather than failing consistently. If all the types have
    already been folded on the executor side, the local fold only merges
    `StructType()` with `StructType(StructField("id"), StructField("ID"))`,
    so you still get the same schema back.
    
    But if you are unlucky and one partition contains only the `id` column,
    the local fold has to merge `StructType(StructField("id"))` with
    `StructType(StructField("ID"))`. That is when the problem shows up.
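    
    To make the two cases concrete, here is a toy sketch. It is not Spark's
    `StructType` or its merge logic: `MergeSketch`, `Schema`, and `merge` are
    hypothetical names, a schema is modeled as a plain list of field names, and
    it assumes the setting that differs between executors and the local side is
    case sensitivity of field names, which the `id`/`ID` example suggests.

```scala
// Toy model only: a schema is an ordered list of field names.
// This is NOT Spark's StructType or its compatibleType implementation.
object MergeSketch {
  type Schema = Vector[String]

  // Hypothetical merge: keep every field of `left`, then append the fields of
  // `right` that do not match any field of `left` under the given setting.
  // Fields are only compared across the two sides, never within one side.
  def merge(left: Schema, right: Schema, caseSensitive: Boolean): Schema = {
    def sameName(a: String, b: String): Boolean =
      if (caseSensitive) a == b else a.equalsIgnoreCase(b)
    left ++ right.filterNot(r => left.exists(l => sameName(l, r)))
  }

  def main(args: Array[String]): Unit = {
    val empty: Schema = Vector.empty
    val alreadyFolded: Schema = Vector("id", "ID")

    // Case 1: everything was folded on the executor side, so the local fold
    // only merges the empty schema with the finished one. The local setting
    // does not matter here.
    println(merge(empty, alreadyFolded, caseSensitive = true))  // Vector(id, ID)
    println(merge(empty, alreadyFolded, caseSensitive = false)) // Vector(id, ID)

    // Case 2: one partition saw only `id`, another only `ID`, so the local
    // fold has to resolve the two names itself. Now the setting matters.
    println(merge(Vector("id"), Vector("ID"), caseSensitive = true))  // Vector(id, ID)
    println(merge(Vector("id"), Vector("ID"), caseSensitive = false)) // Vector(id)
  }
}
```

    In this model the result depends on the local setting only in the second
    case, which would explain why a test can pass or fail depending on how the
    rows happen to be split across partitions.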



