Andrew Or created SPARK-14468:
---------------------------------

             Summary: Always enable OutputCommitCoordinator
                 Key: SPARK-14468
                 URL: https://issues.apache.org/jira/browse/SPARK-14468
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
            Reporter: Andrew Or
            Assignee: Andrew Or


The OutputCommitCoordinator was originally introduced in SPARK-4879 because 
speculation causes the output of some partitions to be deleted. However, as we 
can see in SPARK-10063, speculation is not the only case where this can happen.

More specifically, when we retry a stage we're not guaranteed to kill the tasks 
that are still running (we don't even interrupt their threads), so we may end 
up with multiple concurrent task attempts for the same task. This leads to 
problems like SPARK-8029, for which this fix is necessary but not sufficient.

In general, when we run into situations like these, we need the 
OutputCommitCoordinator because we don't control what the underlying file 
system does. Enabling it does not incur a heavy performance cost, so there is 
little reason not to always enable it to ensure correctness.
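
To illustrate why an arbiter helps when several attempts of the same task 
run concurrently, here is a minimal first-committer-wins sketch. It is not 
Spark's implementation: the class name CommitCoordinator and the methods 
can_commit and attempt_failed are hypothetical, and the real coordinator 
lives on the driver and is consulted by executors over RPC. The sketch only 
assumes that each attempt asks for permission before committing its output 
and that at most one attempt per (stage, partition) is granted it.

```python
import threading


class CommitCoordinator:
    """Toy first-committer-wins arbiter (hypothetical, not Spark's class)."""

    def __init__(self):
        self._lock = threading.Lock()
        # Maps (stage, partition) to the attempt number allowed to commit.
        self._authorized = {}

    def can_commit(self, stage, partition, attempt):
        """Grant the commit to the first attempt that asks; deny all others."""
        with self._lock:
            winner = self._authorized.setdefault((stage, partition), attempt)
            return winner == attempt

    def attempt_failed(self, stage, partition, attempt):
        """Release the commit right if the authorized attempt has failed."""
        with self._lock:
            if self._authorized.get((stage, partition)) == attempt:
                del self._authorized[(stage, partition)]


coordinator = CommitCoordinator()
# Two concurrent attempts for the same partition, e.g. after a stage retry.
first = coordinator.can_commit(stage=1, partition=0, attempt=0)
second = coordinator.can_commit(stage=1, partition=0, attempt=1)
```

In this sketch the second attempt is denied regardless of why it exists 
(speculation or a stage retry), which is the property the description above 
relies on. If the authorized attempt fails, attempt_failed releases the 
right so that a later attempt can commit instead.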


