[ 
https://issues.apache.org/jira/browse/SPARK-19538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Ousterhout updated SPARK-19538:
-----------------------------------
    Priority: Minor  (was: Major)

> DAGScheduler and TaskSetManager can have an inconsistent view of whether a 
> stage is complete.
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-19538
>                 URL: https://issues.apache.org/jira/browse/SPARK-19538
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 2.1.0
>            Reporter: Kay Ousterhout
>            Assignee: Kay Ousterhout
>            Priority: Minor
>
> The pendingPartitions in Stage tracks partitions that still need to be 
> computed, and is used by the DAGScheduler to determine when to mark the stage 
> as complete.  In most cases, this variable is exactly consistent with the 
> tasks in the TaskSetManager (for the current version of the stage) that are 
> still pending.  However, as discussed in SPARK-19263, these can become 
> inconsistent when an ShuffleMapTask for an earlier attempt of the stage 
> completes, in which case the DAGScheduler may think the stage has finished, 
> while the TaskSetManager is still waiting for some tasks to complete (see the 
> description in this pull request: 
> https://github.com/apache/spark/pull/16620).  This leads to bugs like 
> SPARK-19263.  Another problem with this behavior is that listeners can get 
> two StageCompleted messages: once when the DAGScheduler thinks the stage is 
> complete, and a second when the TaskSetManager later decides the stage is 
> complete.  We should fix this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to