GitHub user MaxGekk opened a pull request:

    https://github.com/apache/spark/pull/23253

    [SPARK-26303][SQL] Return partial results for bad JSON records

    ## What changes were proposed in this pull request?
    
    In this PR, I propose returning partial results from the JSON datasource and 
JSON functions in PERMISSIVE mode when some of the JSON fields are parsed and 
converted to the desired types successfully. The changes apply only to 
`StructType`. Whole bad JSON records are placed into the corrupt column 
specified by the `columnNameOfCorruptRecord` option or SQL config.
    
    Partial results are not returned for malformed JSON input. 
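
The behavior described above can be illustrated with a minimal plain-Python sketch. This is not Spark's implementation and does not use the Spark API; it only models the three cases for a flat struct: a fully valid record, a record with one field that fails type conversion (partial result plus the whole record in the corrupt column), and malformed JSON (no partial result). The names `parse_permissive`, `to_int`, and `CORRUPT_COLUMN` are hypothetical and exist only for this illustration.

```python
import json

# Hypothetical corrupt-column name, chosen for this illustration only.
CORRUPT_COLUMN = "_corrupt_record"


def to_int(value):
    """Convert a parsed JSON value to int, rejecting non-integers."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"not an integer: {value!r}")
    return value


def parse_permissive(record, schema):
    """Model of PERMISSIVE-mode parsing for a flat struct.

    `schema` maps field names to converter functions (an assumption of
    this sketch). Returns a dict with one entry per schema field plus
    the corrupt column.
    """
    row = {name: None for name in schema}
    row[CORRUPT_COLUMN] = None

    try:
        parsed = json.loads(record)
    except json.JSONDecodeError:
        # Malformed JSON: no partial results, only the raw record.
        row[CORRUPT_COLUMN] = record
        return row

    if not isinstance(parsed, dict):
        # Not a JSON object, so it cannot be mapped to a struct.
        row[CORRUPT_COLUMN] = record
        return row

    has_bad_field = False
    for name, convert in schema.items():
        if parsed.get(name) is None:
            continue  # missing or null field stays None
        try:
            row[name] = convert(parsed[name])
        except ValueError:
            has_bad_field = True  # keep going: later fields may be valid

    if has_bad_field:
        # The whole bad record goes into the corrupt column.
        row[CORRUPT_COLUMN] = record
    return row


schema = {"a": to_int, "b": to_int}
valid = parse_permissive('{"a": 1, "b": 2}', schema)
partial = parse_permissive('{"a": "x", "b": 2}', schema)
malformed = parse_permissive('{"a": 1, "b": ', schema)
```

In this sketch, `partial` keeps the valid field `b` even though `a` failed to convert, while `malformed` carries only the raw input in the corrupt column.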
    
    ## How was this patch tested?
    
    Added a new unit test that checks the conversion of JSON strings with one 
invalid and one valid field at the end of the string.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 json-bad-record

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23253.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23253
    
----
commit 2d2fed923e9567b6bc22cf4355ee3b57f8699cd4
Author: Maxim Gekk <max.gekk@...>
Date:   2018-12-06T21:26:42Z

    Test for bad records

commit de4885899b2a5e866b6c4a365e91fc901cdc0f86
Author: Maxim Gekk <max.gekk@...>
Date:   2018-12-06T23:11:29Z

    Added a test

commit 101df4e7fcee60c042c21951551bd3376ae325cb
Author: Maxim Gekk <max.gekk@...>
Date:   2018-12-06T23:16:09Z

    Return partial results

commit 542ae74a9240b2905f4ca1f9de62fd53f1562d64
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-12-07T09:02:37Z

    Updating the migration guide

commit 439f57ac59a7c1f481317126934a55d20b38f579
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-12-07T09:04:58Z

    Revert "Updating the migration guide"
    
    This reverts commit 542ae74a9240b2905f4ca1f9de62fd53f1562d64.

commit 77d767016607056809d5f8c4e9653809838b91dd
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-12-07T09:10:31Z

    Updating the migration guide - separate notes

commit fa431ee293d5ae3a7c3b8b441c4a0b6778ac84e1
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-12-07T09:12:16Z

    Updating the migration guide - minor

commit e13673de2537fb42bdc2bb8df9849560ae399c89
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-12-07T10:00:07Z

    Updating the migration guide - minor 2

----

