GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22183

[SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field resolution when reading from Parquet

## What changes were proposed in this pull request?

This is a backport of https://github.com/apache/spark/pull/22148

Spark SQL returns NULL for a column whose Hive metastore schema and Parquet schema are in different letter cases, regardless of whether spark.sql.caseSensitive is set to true or false. This PR adds case-insensitive field resolution for ParquetFileFormat.

* Do case-insensitive resolution only if Spark is in case-insensitive mode.
* Field resolution should fail if there is ambiguity, i.e. more than one field matches.

## How was this patch tested?

Unit tests added.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/seancxmao/spark SPARK-25132-2.3

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22183.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22183

----
commit 28315888eaae5a9c9160ea53eb6eb9a9af712958
Author: seancxmao <seancxmao@...>
Date:   2018-08-21T02:34:23Z

    [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field resolution when reading from Parquet

----
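
For readers who want the two resolution rules from the PR description in concrete form, here is a minimal Python sketch. It is not the code in this patch and does not use any Spark or Parquet API: `resolve_field`, `AmbiguousFieldError`, and the plain list of field names are hypothetical stand-ins chosen for illustration. Returning `None` models the "field not found" case, which is the situation the description says surfaces as NULL values.

```python
from typing import Optional, Sequence


class AmbiguousFieldError(ValueError):
    """Hypothetical error: a requested name matches more than one file field."""


def resolve_field(
    requested: str,
    file_fields: Sequence[str],
    case_sensitive: bool,
) -> Optional[str]:
    """Return the name of the file field that `requested` resolves to.

    Returns None when no field matches. Raises AmbiguousFieldError when
    case-insensitive matching finds more than one candidate.
    """
    if case_sensitive:
        # Case-sensitive mode: only an exact name match counts.
        return requested if requested in file_fields else None

    # Case-insensitive mode: compare lower-cased names.
    wanted = requested.lower()
    matches = [name for name in file_fields if name.lower() == wanted]

    if len(matches) > 1:
        # Rule 2 from the description: ambiguity is a failure, not a guess.
        raise AmbiguousFieldError(
            f"Found duplicate field(s) for {requested!r}: {matches}"
        )
    return matches[0] if matches else None
```

With file fields `["ID", "name"]`, a request for `id` resolves to `ID` in case-insensitive mode and to nothing in case-sensitive mode. With file fields `["ID", "id"]`, the same request fails in case-insensitive mode because both fields match.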