GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/23208

    [SPARK-25530][SQL] data source v2 API refactor (batch write)

    ## What changes were proposed in this pull request?
    
    Adjust the batch write API to match the read API refactor done in 
https://github.com/apache/spark/pull/23086 .
    
    This PR renames `BatchWriteSupportProvider` to `SupportsBatchWrite` and 
makes it extend `Table`. It also cleans up some code now that the batch API is 
complete.
    
    This PR also removes the test from 
https://github.com/apache/spark/pull/22688 . A data source must now return a 
table for read/write. This is a little awkward with the `SaveMode`-based write 
APIs, because users can append data to a non-existing table: `TableProvider` 
needs to return a `Table` instance with an empty schema if the table doesn't 
exist, so that we can write to it later. Hopefully we can remove the 
`SaveMode`-based write APIs once the new APIs are finished and widely used.
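
As a rough illustration of the shape described above, here is a minimal, self-contained sketch. It does not use the actual Spark interfaces: `Table`, `SupportsBatchWrite` and `TableProvider` below are simplified, hypothetical stand-ins that only borrow the names from this PR, and `InMemoryTable`, `InMemoryProvider`, `append` and `rowCount` are invented for the example. It shows a provider that returns a table with an empty schema when the table doesn't exist yet, and a write capability modelled as a mix-in that extends `Table`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchWriteSketch {

    // Hypothetical, simplified stand-in for `Table` (not Spark's real interface).
    interface Table {
        String name();
        List<String> schema(); // column names only; empty if nothing was written yet
    }

    // Write capability expressed as a mix-in that extends Table.
    interface SupportsBatchWrite extends Table {
        void append(List<String> columns, List<String> row);
    }

    // Always returns a Table, even for a name it has never seen before.
    interface TableProvider {
        Table getTable(String name);
    }

    static final class InMemoryTable implements SupportsBatchWrite {
        private final String name;
        private final List<String> schema = new ArrayList<>();
        private final List<List<String>> rows = new ArrayList<>();

        InMemoryTable(String name) { this.name = name; }

        @Override public String name() { return name; }

        @Override public List<String> schema() {
            return Collections.unmodifiableList(schema);
        }

        @Override public void append(List<String> columns, List<String> row) {
            if (columns.size() != row.size()) {
                throw new IllegalArgumentException("row width does not match columns");
            }
            if (schema.isEmpty()) {
                schema.addAll(columns); // the first write fixes the schema
            } else if (!schema.equals(columns)) {
                throw new IllegalArgumentException("schema mismatch");
            }
            rows.add(new ArrayList<>(row));
        }

        int rowCount() { return rows.size(); }
    }

    static final class InMemoryProvider implements TableProvider {
        private final Map<String, InMemoryTable> tables = new HashMap<>();

        @Override public Table getTable(String name) {
            // A missing table is created lazily with an empty schema.
            return tables.computeIfAbsent(name, InMemoryTable::new);
        }
    }

    public static void main(String[] args) {
        TableProvider provider = new InMemoryProvider();
        Table table = provider.getTable("events");
        System.out.println("empty schema before write: " + table.schema().isEmpty());

        // Callers check for the capability instead of asking a separate provider.
        if (table instanceof SupportsBatchWrite) {
            ((SupportsBatchWrite) table).append(
                Arrays.asList("id", "value"), Arrays.asList("1", "a"));
        }
        System.out.println("schema after write: " + provider.getTable("events").schema());
    }
}
```

The point of the sketch is only the control flow: an append to a non-existing table works because the provider hands back a placeholder table first, which is the awkwardness with `SaveMode` mentioned above.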
    
    A few notes about future changes:
    1. We will create `SupportsStreamingWrite` later for streaming APIs
    2. We will create `SupportsBatchReplaceWhere`, `SupportsBatchAppend`, etc. 
for the new end-user write APIs. I think the streaming APIs will continue to 
use `OutputMode`, and the new end-user write APIs will apply to batch only, at 
least in the near future.
    
    
    ## How was this patch tested?
    
    Existing tests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark refactor-batch

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23208.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23208
    
----
commit 00fc34fa793b922a48a4bf8e9f9cd0e3b688800b
Author: Wenchen Fan <wenchen@...>
Date:   2018-12-03T14:38:43Z

    data source v2 API refactor (batch write)

----

