GitHub user hthuynh2 opened a pull request:

    https://github.com/apache/spark/pull/21729

    SPARK-24755 Executor loss can cause task to not be resubmitted

    **Description**
    As described in 
[SPARK-24755](https://issues.apache.org/jira/browse/SPARK-24755), when 
speculation is enabled, there is a scenario in which executor loss can cause a 
task to not be resubmitted. 
    This patch changes the variable killedByOtherAttempt to keep track of the 
taskIds of tasks that were killed by another attempt. This way, we still 
avoid resubmitting a task that was killed by another attempt, while the 
successful attempt is resubmitted when its executor is lost.
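
The idea can be sketched with a toy model. This is a simplified illustration in Python, not the actual Spark code: the class `ToyTaskSet`, its methods, and the executor names "A" and "B" are hypothetical, and it ignores the other conditions Spark checks before resubmitting. It only shows why tracking killed attempts by attempt id, rather than by task index, lets the successful attempt be resubmitted when its executor is lost.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    task_id: int          # unique per attempt (stand-in for Spark's taskId)
    index: int            # task index shared by all attempts of one task
    executor: str
    running: bool = True


class ToyTaskSet:
    """Hypothetical, heavily simplified model of a task set with speculation."""

    def __init__(self, num_tasks):
        self.successful = [False] * num_tasks
        self.attempts = {}
        # Ids of attempts killed because another attempt of the same task won.
        self.killed_by_other_attempt = set()
        self.pending = []  # task indexes waiting to be run again

    def launch(self, task_id, index, executor):
        self.attempts[task_id] = Attempt(task_id, index, executor)

    def succeed(self, task_id):
        winner = self.attempts[task_id]
        winner.running = False
        self.successful[winner.index] = True
        # Kill every other running attempt of the same task and remember its id.
        for other in self.attempts.values():
            if other.index == winner.index and other.running:
                other.running = False
                self.killed_by_other_attempt.add(other.task_id)

    def executor_lost(self, executor):
        for attempt in self.attempts.values():
            if (attempt.executor == executor
                    and not attempt.running
                    and self.successful[attempt.index]
                    and attempt.task_id not in self.killed_by_other_attempt):
                # The successful attempt's output is gone, so run the task again.
                self.successful[attempt.index] = False
                self.pending.append(attempt.index)


ts = ToyTaskSet(num_tasks=1)
ts.launch(task_id=0, index=0, executor="A")  # original attempt
ts.launch(task_id=1, index=0, executor="B")  # speculative attempt
ts.succeed(task_id=1)                        # speculative attempt wins, kills attempt 0

ts.executor_lost("A")
pending_after_a = list(ts.pending)  # killed attempt: nothing to resubmit
ts.executor_lost("B")
pending_after_b = list(ts.pending)  # successful attempt lost: task 0 is pending again
```

In this model, a flag kept per task index would be set as soon as attempt 0 is killed, and it would then also suppress the resubmission after executor "B" is lost, even though the successful attempt ran there.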
    
    **How was this patch tested?**
    A unit test was added, based on the one written by @xuanyuanking, with 
modifications to simulate the scenario described in SPARK-24755. 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hthuynh2/spark SPARK_24755

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21729.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21729
    
----
commit 093e39cf76378821284ef7d771e819afb69930ae
Author: Hieu Huynh <“hieu.huynh@...>
Date:   2018-07-08T18:20:26Z

    SPARK-24755 Executor loss can cause task to not be resubmitted

----

