GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/23165
[SPARK-26188][SQL] FileIndex: don't infer data types of partition columns if user specifies schema

## What changes were proposed in this pull request?

This PR fixes a regression introduced in: https://github.com/apache/spark/pull/21004/files#r236998030

If the user specifies a schema, Spark should not infer the data types of partition columns; otherwise the inferred data type might not match the one the user provided.
E.g. for partition directory `p=4d`, after data type inference the column value will be `4.0`.
See https://issues.apache.org/jira/browse/SPARK-26188 for more details.

## How was this patch tested?

Added a unit test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gengliangwang/spark fixFileIndex

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23165.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23165

----
commit 2866a9e1c1a7d42e6cf53474733c6f39e812c680
Author: Gengliang Wang <gengliang.wang@...>
Date:   2018-11-28T16:11:22Z

    fix

----
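
The `p=4d` example above can be reproduced outside Spark. The following is a minimal sketch, not Spark's actual implementation: the class `PartitionValueSketch` and its helpers `inferValue` and `resolveValue` are hypothetical names invented for illustration, and the inference order (int, long, double, string) is a simplifying assumption. The one library fact it relies on is that Java's `Double.parseDouble` accepts a trailing `d` type suffix, which is how a raw directory value like `4d` can end up as `4.0` once inference is applied.

```java
// Hypothetical sketch; not code from the Spark repository.
class PartitionValueSketch {

    // Simplified type inference on a raw partition directory value.
    // Assumption: try int, then long, then double, else keep the string.
    static Object inferValue(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            // not an int, fall through
        }
        try {
            return Long.parseLong(raw);
        } catch (NumberFormatException e) {
            // not a long, fall through
        }
        try {
            // Double.parseDouble accepts a float type suffix such as "d",
            // so "4d" parses successfully as 4.0.
            return Double.parseDouble(raw);
        } catch (NumberFormatException e) {
            // not a double, fall through
        }
        return raw;
    }

    // If the user declared the partition column as a string, keep the raw
    // value and skip inference; otherwise fall back to inference.
    static Object resolveValue(String raw, boolean userDeclaredString) {
        return userDeclaredString ? raw : inferValue(raw);
    }

    public static void main(String[] args) {
        System.out.println(inferValue("4d"));         // inferred value
        System.out.println(resolveValue("4d", true)); // user-specified string column
    }
}
```

With inference, the value from `p=4d` becomes the double `4.0`; when the user-declared string type is honored, it stays `4d`, which is the behavior this PR aims to restore.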