[GitHub] incubator-spark pull request: SPARK-1124: Fix infinite retries of ...

markhamstra Mon, 24 Feb 2014 08:17:08 -0800

Github user markhamstra commented on a diff in the pull request:

    https://github.com/apache/incubator-spark/pull/641#discussion_r9997152
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala 
---
    @@ -373,25 +375,26 @@ class DAGScheduler(
               } else {
                 def removeStage(stageId: Int) {
                   // data structures based on Stage
    -              stageIdToStage.get(stageId).foreach { s =>
    -                if (running.contains(s)) {
    +              for (stage <- stageIdToStage.get(stageId)) {
    +                if (running.contains(stage)) {
                       logDebug("Removing running stage %d".format(stageId))
    -                  running -= s
    +                  running -= stage
    +                }
    +                stageToInfos -= stage
    +                for (shuffleDep <- stage.shuffleDep) {
    --- End diff --
    
    At this point, the need is to clean up the DAGScheduler's data structures 
that reference the removed stage.  If you are 100% certain that the stage's 
notion of shuffleDeps contains every shuffleId that shuffleToMapStage is 
tracking for the stage, then you can take the simpler route that you have of 
working from the stage's understanding.  I wasn't 100% certain of that (which 
is not the same thing as saying that I have a good reason to believe that the 
stage's understanding of shuffleDeps will diverge from shuffleToMapStage's 
understanding), so I took the safer route of working from what 
shuffleToMapStage's knows instead of from what the stage knows.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-spark pull request: SPARK-1124: Fix infinite retries of ...

Reply via email to