GitHub user rxin opened a pull request:

    https://github.com/apache/spark/pull/15707

    [SPARK-18024][SQL] Introduce an internal commit protocol API - rebased

    ## What changes were proposed in this pull request?
    This patch introduces an internal commit protocol API that is used by the
    batch data source to do write commits. It currently has only one implementation
    that uses Hadoop MapReduce's OutputCommitter API. In the future, this commit
    API can be used to unify streaming and batch commits.
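
To make the idea of a commit protocol concrete, here is a minimal sketch, not the code in this patch. It assumes a job/task lifecycle similar to the one Hadoop's OutputCommitter exposes (set up the job, let each task write to a private location, then commit or abort per task and per job). Every class and method name below (`CommitProtocol`, `StagingDirCommitProtocol`, `new_task_file`, and so on) is hypothetical and chosen only for illustration, and it is written in Python against a local directory rather than in Scala against a Hadoop file system.

```python
import os
import shutil
import tempfile
from abc import ABC, abstractmethod


class CommitProtocol(ABC):
    """Hypothetical interface; the names are illustrative, not Spark's API."""

    @abstractmethod
    def setup_job(self) -> None: ...

    @abstractmethod
    def new_task_file(self, task_id: int, name: str) -> str: ...

    @abstractmethod
    def commit_task(self, task_id: int) -> None: ...

    @abstractmethod
    def abort_task(self, task_id: int) -> None: ...

    @abstractmethod
    def commit_job(self) -> None: ...

    @abstractmethod
    def abort_job(self) -> None: ...


class StagingDirCommitProtocol(CommitProtocol):
    """Assumed strategy: tasks write under a staging directory, and files
    only reach the final output directory when the whole job commits."""

    def __init__(self, output_dir: str) -> None:
        self.output_dir = output_dir
        self.staging_dir = os.path.join(output_dir, "_staging")

    def _task_dir(self, task_id: int) -> str:
        return os.path.join(self.staging_dir, f"task-{task_id}")

    def _committed_dir(self) -> str:
        return os.path.join(self.staging_dir, "committed")

    def setup_job(self) -> None:
        os.makedirs(self._committed_dir(), exist_ok=True)

    def new_task_file(self, task_id: int, name: str) -> str:
        # Each task gets a private directory, so failed attempts stay isolated.
        os.makedirs(self._task_dir(task_id), exist_ok=True)
        return os.path.join(self._task_dir(task_id), name)

    def commit_task(self, task_id: int) -> None:
        # Promote the task's files to the job-level "committed" area.
        task_dir = self._task_dir(task_id)
        for name in os.listdir(task_dir):
            os.replace(os.path.join(task_dir, name),
                       os.path.join(self._committed_dir(), name))
        os.rmdir(task_dir)

    def abort_task(self, task_id: int) -> None:
        # Discard everything the task wrote.
        shutil.rmtree(self._task_dir(task_id), ignore_errors=True)

    def commit_job(self) -> None:
        # Publish committed files to the output directory, then clean up.
        for name in os.listdir(self._committed_dir()):
            os.replace(os.path.join(self._committed_dir(), name),
                       os.path.join(self.output_dir, name))
        shutil.rmtree(self.staging_dir)

    def abort_job(self) -> None:
        shutil.rmtree(self.staging_dir, ignore_errors=True)


def run_demo() -> list:
    """Run one successful task and one aborted task, then commit the job."""
    output_dir = tempfile.mkdtemp()
    protocol = StagingDirCommitProtocol(output_dir)
    protocol.setup_job()

    with open(protocol.new_task_file(0, "part-00000"), "w") as f:
        f.write("ok")
    protocol.commit_task(0)

    with open(protocol.new_task_file(1, "part-00001"), "w") as f:
        f.write("partial")
    protocol.abort_task(1)

    protocol.commit_job()
    return sorted(os.listdir(output_dir))


if __name__ == "__main__":
    print(run_demo())
```

In this sketch only the output of the committed task ends up in the output directory; the aborted task's partial file is discarded along with the staging area. Separating the interface from the staging strategy mirrors the shape described above, where callers depend on the protocol and the concrete committer can be swapped.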
    
    ## How was this patch tested?
    Should be covered by existing write tests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark SPARK-18024-2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15707.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #15707
    
----
commit 8c4ae5eb7441fd5bc0b06276d5d02a2ebc6de4a0
Author: Eric Liang <e...@databricks.com>
Date:   2016-10-27T21:45:52Z

    Thu Oct 27 14:45:52 PDT 2016

commit 2484809e1735a7c3fc875f09c68c12d2cd99dd62
Author: Eric Liang <e...@databricks.com>
Date:   2016-10-28T00:53:13Z

    Thu Oct 27 17:53:13 PDT 2016

commit 4d967251ce01794f7cdab9f84b70fa5393d1d1f2
Author: Eric Liang <e...@databricks.com>
Date:   2016-10-28T00:53:30Z

    Thu Oct 27 17:53:29 PDT 2016

commit 72c4294bb401ff3795363d3c0bb436bb56844630
Author: Reynold Xin <r...@databricks.com>
Date:   2016-10-31T17:56:49Z

    WIP - commit API

commit 2a613516dd469bca5ed4d7b0f17f678e9e70e267
Author: Reynold Xin <r...@databricks.com>
Date:   2016-10-31T17:57:18Z

    Add commit protocol itself

commit 6af14b56590a0882800f62a2a2b939ee3715edbb
Author: Reynold Xin <r...@databricks.com>
Date:   2016-10-31T20:46:35Z

    Move output committer instantiation into MapReduceFileCommitterProtocol.

commit 6166093d511e833587d32e398338e2f47ccbcc8a
Author: Reynold Xin <r...@databricks.com>
Date:   2016-10-31T20:50:13Z

    Specify that implementations must be serializable.

commit 040bbba0bdbd647f963b7a61e18b69fd62565201
Author: Reynold Xin <r...@databricks.com>
Date:   2016-10-31T22:16:05Z

    Specify path

commit 51d0919577c71155adb7d4737e9441cede8fe97d
Author: Reynold Xin <r...@databricks.com>
Date:   2016-10-31T22:36:46Z

    Add documentation.

commit 2d7d373fe48d18037653c10424c8b1c978160958
Author: Reynold Xin <r...@databricks.com>
Date:   2016-10-31T22:43:54Z

    Make MapReduceFileCommitterProtocol serializable.

commit cd23d2f7bdf7a3ef9b93e77a3ae540d553398267
Author: Reynold Xin <r...@databricks.com>
Date:   2016-11-01T00:34:31Z

    Make protocol configurable.

commit 0647959cbbbaaf5fb5cfe31515c2598f99ee180f
Author: Reynold Xin <r...@databricks.com>
Date:   2016-11-01T00:58:23Z

    Merge pull request #15633 from ericl/spark-18087
    
    [SPARK-18087] [SQL] Optimize insert to not require REPAIR TABLE

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

