GitHub user carsonwang opened a pull request:

    https://github.com/apache/spark/pull/19877

    [SPARK-22681] Accumulator should only be updated once for each task in 
result stage

    ## What changes were proposed in this pull request?
    As the documentation says, "For accumulator updates performed inside 
actions only, Spark guarantees that each task’s update to the accumulator 
will only be applied once, i.e. restarted tasks will not update the value."
    Currently, however, the code does not guarantee this: when a task in a 
result stage is resubmitted, its accumulator updates can be applied a second 
time (see the sketch below).
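
    To make the documented guarantee concrete, here is a minimal, 
illustrative sketch (not part of the PR) of an accumulator updated inside an 
action; the driver should see each task's contribution exactly once, even 
when tasks are retried:

```scala
import org.apache.spark.sql.SparkSession

object AccumulatorOnceExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("AccumulatorOnceExample")
      .master("local[2]")
      .getOrCreate()
    val sc = spark.sparkContext

    // The accumulator is updated inside an action (foreach), not a
    // transformation, so the "applied exactly once per task" guarantee
    // quoted above is supposed to hold.
    val acc = sc.longAccumulator("sum")
    sc.parallelize(1 to 100, numSlices = 4).foreach(n => acc.add(n))

    // Expected: 5050. The bug this PR fixes could let a resubmitted
    // result-stage task apply its update twice, inflating this value.
    println(s"sum = ${acc.value}")

    spark.stop()
  }
}
```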
    
    ## How was this patch tested?
    Newly added tests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/carsonwang/spark fixAccum

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19877.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19877
    
----
commit 882126c2671e1733d572350af9749e9f8bdca1c2
Author: Carson Wang <carson.w...@intel.com>
Date:   2017-12-04T12:23:14Z

    Do not update accumulator for resubmitted task in result stage

----
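
For context, here is a heavily simplified, hypothetical sketch of the idea 
behind this commit (the names `ResultStageTracker`, `onTaskCompletion`, and 
`outputId` are illustrative, not the actual DAGScheduler code): record which 
result-stage partitions have already completed, and apply a task's 
accumulator updates only on the first completion of its partition.

```scala
// Hypothetical sketch: apply a result task's accumulator updates only the
// first time its output partition completes; ignore resubmitted duplicates.
class ResultStageTracker(numPartitions: Int) {
  private val finished = Array.fill(numPartitions)(false)

  def onTaskCompletion(outputId: Int, applyAccumUpdates: () => Unit): Unit = {
    if (!finished(outputId)) {
      applyAccumUpdates()      // first successful completion: count the updates
      finished(outputId) = true
    }
    // A resubmitted task for an already-finished partition falls through,
    // so its accumulator updates are never applied a second time.
  }
}

object ResubmitDemo {
  def main(args: Array[String]): Unit = {
    var total = 0L
    val tracker = new ResultStageTracker(numPartitions = 1)
    tracker.onTaskCompletion(0, () => total += 10) // applied
    tracker.onTaskCompletion(0, () => total += 10) // skipped (resubmission)
    assert(total == 10, s"expected 10, got $total")
    println(s"total = $total")
  }
}
```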


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
