Aaron Staple created SPARK-2581:
-----------------------------------

             Summary: complete or withdraw visitedStages optimization in 
DAGScheduler’s stageDependsOn
                 Key: SPARK-2581
                 URL: https://issues.apache.org/jira/browse/SPARK-2581
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
            Reporter: Aaron Staple
            Priority: Minor


Currently, stageDependsOn adds stages to the visitedStages HashSet but never 
queries it, so the set does nothing to limit re-examination of previously 
visited stages.  One option is to check whether a mapStage has already been 
visited before visiting it again, as the nearby visitedRdds check does for 
RDDs.  Alternatively, the existing visitedRdds check may already optimize this 
function sufficiently, in which case visitedStages can simply be removed.

See discussion here: 
https://github.com/apache/spark/pull/1362#discussion-diff-15018046L1107
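
To make the two options concrete, here is a minimal sketch of the first one 
(querying visitedStages before descending into a map stage).  It is not the 
DAGScheduler code: RddNode, StageNode and VisitedStagesSketch are hypothetical 
stand-ins for illustration, and details of the real traversal, such as 
checking whether a map stage's output is already available, are left out.

```scala
import scala.collection.mutable

object VisitedStagesSketch {
  // Hypothetical, simplified stand-ins; these are NOT Spark's RDD or Stage classes.
  final class RddNode(val name: String) {
    var narrowParents: List[RddNode] = Nil     // parents reached without a shuffle
    var shuffleParents: List[StageNode] = Nil  // parent map stages behind a shuffle
  }
  final class StageNode(val id: Int, val rdd: RddNode)

  // Returns true if `target` is reachable from `stage` through its dependencies.
  def stageDependsOn(stage: StageNode, target: StageNode): Boolean = {
    if (stage eq target) {
      true
    } else {
      val visitedRdds = mutable.HashSet.empty[RddNode]
      val visitedStages = mutable.HashSet.empty[StageNode]

      def visit(rdd: RddNode): Unit = {
        // HashSet.add returns false when the element was already present.
        if (visitedRdds.add(rdd)) {
          rdd.narrowParents.foreach(visit)
          for (mapStage <- rdd.shuffleParents) {
            // The extra check: skip map stages that were already visited.
            if (visitedStages.add(mapStage)) visit(mapStage.rdd)
          }
        }
      }

      visit(stage.rdd)
      visitedRdds.contains(target.rdd)
    }
  }
}
```

In this sketch each stage owns exactly one RddNode, so a second visit to the 
same map stage would return immediately at the visitedRdds check anyway; the 
stage-level check only saves one call and one set lookup.  If the real code 
behaves the same way, that would favor simply removing visitedStages.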



--
This message was sent by Atlassian JIRA
(v6.2#6252)
