[ https://issues.apache.org/jira/browse/SPARK-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14228919#comment-14228919 ]
Apache Spark commented on SPARK-4654:
-------------------------------------

User 'JoshRosen' has created a pull request for this issue:
https://github.com/apache/spark/pull/3515

> Clean up DAGScheduler's getMissingParentStages() and stageDependsOn() methods
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-4654
>                 URL: https://issues.apache.org/jira/browse/SPARK-4654
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>            Reporter: Josh Rosen
>            Assignee: Josh Rosen
>
> DAGScheduler has {{getMissingParentStages()}} and {{stageDependsOn()}}
> methods, which are suspiciously similar to {{getParentStages()}}. All of
> these methods perform traversal of the RDD / Stage graph to inspect parent
> stages. We can remove both of these methods, though: the set of parent
> stages is known when a {{Stage}} instance is constructed and is already
> stored in {{Stage.parents}}, so we can just check for missing stages by
> looking for unavailable stages in {{Stage.parents}}. Similarly, we can
> determine whether one stage depends on another by searching {{Stage.parents}}
> rather than performing the entire graph traversal from scratch.
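
The idea in the issue description can be illustrated with a small toy model. This is a minimal sketch in Python, not Spark's Scala code and not the change made in the pull request. The `Stage` class here, its `is_available` flag, and the helper names `missing_parent_stages` and `stage_depends_on` are hypothetical stand-ins chosen for illustration. It assumes only what the description states: each stage already holds its parent stages, so both questions can be answered from those stored links instead of re-walking the RDD graph.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(eq=False)  # keep identity-based equality/hash so stages can go in a set
class Stage:
    """Toy stand-in for a scheduler stage; NOT Spark's actual Stage class."""
    id: int
    parents: List["Stage"] = field(default_factory=list)
    is_available: bool = False  # hypothetical flag: the stage's output exists


def missing_parent_stages(stage: Stage) -> List[Stage]:
    """Return the direct parents whose output is not yet available, by id."""
    return sorted((p for p in stage.parents if not p.is_available),
                  key=lambda s: s.id)


def stage_depends_on(stage: Stage, target: Stage) -> bool:
    """Return True if target is stage itself or reachable via parents links.

    Treating a stage as depending on itself is a convention of this sketch.
    """
    visited: Set[Stage] = set()
    stack = [stage]
    while stack:
        current = stack.pop()
        if current is target:
            return True
        if current in visited:
            continue  # already explored through another path
        visited.add(current)
        stack.extend(current.parents)
    return False
```

For example, with `s0 = Stage(0, is_available=True)`, `s1 = Stage(1, parents=[s0])` and `s2 = Stage(2, parents=[s0, s1])`, the call `missing_parent_stages(s2)` yields only `s1`, and `stage_depends_on(s2, s0)` holds while `stage_depends_on(s0, s2)` does not. The dependency check still walks stage-to-stage links, but it never has to visit individual RDDs.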