GitHub user MaxGekk opened a pull request:

    https://github.com/apache/spark/pull/22528

    [SPARK-25513][SQL] Read zipped CSV and JSON

    ## What changes were proposed in this pull request?
    
    In this PR, I propose supporting the reading of zip archives that contain
    exactly **one** CSV or JSON file, in multi-line mode.
    
    ## How was this patch tested?
    
    Added tests for CSV and JSON in which the zip archives are created with a Java library.
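
For illustration only, here is a minimal sketch of the idea described above: build a zip archive with a single entry using `java.util.zip`, read that entry back as one multi-line document, and check the file extension without regard to case. It is not taken from the patch. The class `ZipRoundTrip` and its methods `hasZipExtension`, `zipSingleEntry`, and `readSingleEntry` are hypothetical names, and the sketch assumes Java 9 or later for `InputStream.readAllBytes`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class; none of these names come from the patch.
final class ZipRoundTrip {

    // Case-insensitive check for a ".zip" file extension.
    static boolean hasZipExtension(String path) {
        return path.toLowerCase(Locale.ROOT).endsWith(".zip");
    }

    // Builds an in-memory zip archive holding exactly one entry.
    static byte[] zipSingleEntry(String entryName, String content) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(content.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the archive is complete.
        return buffer.toByteArray();
    }

    // Returns the text of the only entry; rejects empty and multi-entry archives.
    static String readSingleEntry(byte[] archive) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            if (zip.getNextEntry() == null) {
                throw new IllegalArgumentException("archive has no entries");
            }
            // Reads up to the end of the current entry (Java 9+).
            String content = new String(zip.readAllBytes(), StandardCharsets.UTF_8);
            if (zip.getNextEntry() != null) {
                throw new IllegalArgumentException("archive has more than one entry");
            }
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A JSON record spanning several lines, as in multi-line mode.
        String json = "{\n  \"id\": 1,\n  \"name\": \"a\"\n}";
        byte[] archive = zipSingleEntry("data.json", json);
        System.out.println(readSingleEntry(archive));
        System.out.println(hasZipExtension("DATA.ZIP"));
    }
}
```

Because the whole entry is read as one stream, a record may span several lines, which is one plausible reason for tying the feature to multi-line mode.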


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MaxGekk/spark-1 read-zipped-csv-json

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22528.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22528
    
----
commit a926d277e0cecb4d2d66e6500a68e656da6e1d2f
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-09-22T19:49:44Z

    Support zip archives

commit 29716248b1ef504ab828c6b8af8ac78f1013923a
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-09-22T19:49:59Z

    Add test for zipped CSV files

commit 149e452d17cffecb024c29771dc05322295ba437
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-09-22T19:52:18Z

    Fix imports

commit 1dff39eb7e06435551ab7ba0d0443b106e60e4b6
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-09-22T19:57:10Z

    Added a test for zipped JSON

commit 09dff81b34600c05a3b30a135c32e9dcd40e5bae
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-09-22T19:58:56Z

    Refactoring of the CSV test

commit 5fda51a3505437c4a32f146940a908cd1557bbf5
Author: Maxim Gekk <maxim.gekk@...>
Date:   2018-09-22T20:02:37Z

    Make extension checking case agnostic

----


