[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

gengliangwang Wed, 28 Nov 2018 08:22:05 -0800

GitHub user gengliangwang opened a pull request:

    https://github.com/apache/spark/pull/23165


    [SPARK-26188][SQL] FileIndex: don't infer data types of partition columns 
if user specifies schema

    ## What changes were proposed in this pull request?
    
    This PR fixes a regression introduced in: 
https://github.com/apache/spark/pull/21004/files#r236998030
    
    If the user specifies a schema, Spark doesn't need to infer the data types 
of partition columns; otherwise the inferred type might not match the one the 
user provided.
    E.g. for the partition directory `p=4d`, data type inference turns the 
column value into `4.0`.
    See https://issues.apache.org/jira/browse/SPARK-26188 for more details.
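
The mismatch described above can be illustrated with a small standalone sketch. This is not Spark's code: `infer_partition_value` and `parse_partition` are hypothetical helpers, and the rule that a trailing `d`/`f` is accepted when parsing a double is an assumption modeled on Java-style double parsing, used here only to reproduce the `p=4d` example.

```python
import re

# Assumption for illustration: a Java-style double parser that, unlike
# Python's float(), accepts a trailing 'd' or 'f' type suffix.
_JAVA_DOUBLE = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?[dDfF]?$")


def infer_partition_value(raw):
    """Guess a type for a raw partition value (simplified, hypothetical)."""
    try:
        return int(raw)
    except ValueError:
        pass
    if _JAVA_DOUBLE.match(raw):
        # Drop the type suffix before converting, so "4d" becomes 4.0.
        return float(raw.rstrip("dDfF"))
    return raw


def parse_partition(dir_name, user_schema=None):
    """Split 'col=value'; use the user's type if given, otherwise infer."""
    column, _, raw = dir_name.partition("=")
    if user_schema and column in user_schema:
        # A user-specified type wins, so no inference is attempted.
        return column, user_schema[column](raw)
    return column, infer_partition_value(raw)


print(parse_partition("p=4d"))               # inference changes the value
print(parse_partition("p=4d", {"p": str}))   # user schema keeps the raw text
```

With inference, the directory `p=4d` yields a floating-point value; when the caller declares `p` as a string, the raw text `4d` is preserved, which is the behavior this PR aims to restore.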
    
    ## How was this patch tested?
    
    Added a unit test.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gengliangwang/spark fixFileIndex

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23165.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23165
    
----
commit 2866a9e1c1a7d42e6cf53474733c6f39e812c680
Author: Gengliang Wang <gengliang.wang@...>
Date:   2018-11-28T16:11:22Z

    fix

----

