[ https://issues.apache.org/jira/browse/SPARK-14468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Or resolved SPARK-14468.
-------------------------------
          Resolution: Fixed
       Fix Version/s: 1.5.2
                      2.0.0
                      1.6.2
                      1.4.2
    Target Version/s: 1.5.2, 1.4.2, 1.6.2, 2.0.0  (was: 1.4.2, 1.5.2, 1.6.2, 2.0.0)

> Always enable OutputCommitCoordinator
> -------------------------------------
>
>                 Key: SPARK-14468
>                 URL: https://issues.apache.org/jira/browse/SPARK-14468
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Andrew Or
>            Assignee: Andrew Or
>             Fix For: 1.4.2, 1.6.2, 2.0.0, 1.5.2
>
>
> The OutputCommitCoordinator was originally introduced in SPARK-4879 because 
> speculation causes the output of some partitions to be deleted. However, as 
> we can see in SPARK-10063, speculation is not the only case where this can 
> happen.
> More specifically, when we retry a stage we're not guaranteed to kill the 
> tasks that are still running (we don't even interrupt their threads), so we 
> may end up with multiple concurrent task attempts for the same task. This 
> leads to problems like SPARK-8029; this fix is necessary to address them, 
> but not sufficient on its own.
> In general, when we run into situations like these, we need the 
> OutputCommitCoordinator because we don't control what the underlying file 
> system does. Enabling this doesn't induce heavy performance costs, so there's 
> little reason not to always enable it to ensure correctness.
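
The arbitration described above (several concurrent attempts for the same
task, only one of which may commit its output) can be illustrated with a
minimal first-committer-wins sketch. This is not Spark's implementation: the
class name CommitCoordinator, the methods can_commit and attempt_failed, and
the choice to release the authorization when the authorized attempt fails are
all assumptions made for illustration.

```python
import threading


class CommitCoordinator:
    """Toy first-committer-wins arbiter (hypothetical, not Spark's class)."""

    def __init__(self):
        self._lock = threading.Lock()
        # Maps (stage, partition) to the attempt number allowed to commit.
        self._authorized = {}

    def can_commit(self, stage, partition, attempt):
        """Return True only for the single attempt authorized for this partition."""
        with self._lock:
            # The first attempt to ask becomes the authorized committer.
            winner = self._authorized.setdefault((stage, partition), attempt)
            return winner == attempt

    def attempt_failed(self, stage, partition, attempt):
        """Assumed behavior: free the slot if the authorized attempt fails."""
        with self._lock:
            if self._authorized.get((stage, partition)) == attempt:
                del self._authorized[(stage, partition)]


coordinator = CommitCoordinator()
# Two concurrent attempts for the same partition, e.g. after a stage retry
# that did not kill the still-running task.
first = coordinator.can_commit(stage=1, partition=0, attempt=0)
second = coordinator.can_commit(stage=1, partition=0, attempt=1)
print(first, second)  # True False
```

Because the decision is made in one place rather than by the file system, the
losing attempt learns it must not commit regardless of whether it was started
by speculation or by a stage retry.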


