GitHub user seancxmao opened a pull request:

    https://github.com/apache/spark/pull/22183

    [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field resolution when reading from Parquet

    ## What changes were proposed in this pull request?
    This is a backport of https://github.com/apache/spark/pull/22148
    
    Spark SQL returns NULL for a column whose name differs in letter case between the Hive metastore schema and the Parquet schema, regardless of whether spark.sql.caseSensitive is set to true or false. This PR adds case-insensitive field resolution to ParquetFileFormat.
    * Do case-insensitive resolution only if Spark is in case-insensitive mode.
    * Field resolution should fail if there is ambiguity, i.e. if more than one field matches.
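
The two rules above can be illustrated with a minimal, standalone sketch. This is not the code from the patch and does not use any Spark API; `resolve_field` and its parameters are hypothetical names chosen for illustration, and the Parquet schema is modelled as a plain list of field names.

```python
from typing import Optional, Sequence


def resolve_field(requested: str,
                  parquet_fields: Sequence[str],
                  case_sensitive: bool) -> Optional[str]:
    """Return the Parquet field name matching `requested`, or None.

    Hypothetical helper, not Spark's implementation.
    """
    if case_sensitive:
        # Case-sensitive mode: only an exact name match counts.
        return requested if requested in parquet_fields else None

    # Case-insensitive mode: compare lowercased names.
    wanted = requested.lower()
    matches = [name for name in parquet_fields if name.lower() == wanted]
    if len(matches) > 1:
        # Ambiguity: more than one Parquet field matches, so fail loudly
        # instead of picking one arbitrarily.
        raise ValueError(
            f"Found duplicate field(s) for {requested!r}: {matches}"
        )
    return matches[0] if matches else None


# Metastore column "id" against a Parquet file that stores "ID".
print(resolve_field("id", ["ID", "name"], case_sensitive=False))
print(resolve_field("id", ["ID", "name"], case_sensitive=True))
```

Under these assumptions the first call resolves to the Parquet field `ID`, the second finds no match, and a schema containing both `ID` and `Id` would raise an error in case-insensitive mode.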
    
    ## How was this patch tested?
    Unit tests added.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/seancxmao/spark SPARK-25132-2.3

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22183.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22183
    
----
commit 28315888eaae5a9c9160ea53eb6eb9a9af712958
Author: seancxmao <seancxmao@...>
Date:   2018-08-21T02:34:23Z

    [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field resolution when reading from Parquet

----


