[ https://issues.apache.org/jira/browse/SPARK-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338241#comment-14338241 ]
Sean Owen commented on SPARK-1015:
----------------------------------

[~zjffdu] are you planning on working on this? We also have {{toDebugString}}, which prints some of this info. How would the visualization work with spark-shell? Is this just a utility you can host outside Spark?

> Visualize the DAG of RDD
> -------------------------
>
>                 Key: SPARK-1015
>                 URL: https://issues.apache.org/jira/browse/SPARK-1015
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 0.9.0
>            Reporter: Jeff Zhang
>
> The DAG of an RDD can help users understand the data flow and how Spark gets
> the final RDD executed. It could also help users find chances to optimize the
> execution of a complex RDD. I will leverage graphviz to visualize the DAG.
> For this task, I plan to split the work into 2 steps.
> Step 1. Just visualize the simple DAG. Each RDD is one node, and there is
> one edge between each parent RDD and its child RDD. (I attach one simple
> graph in the attachments.)
> Step 2. Put the RDDs of the same stage into one subgraph. This may require
> extracting the stage-splitting code from DAGScheduler.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
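For illustration only, the two steps in the quoted description could be sketched as a small standalone utility that emits Graphviz DOT text. This is not code from the issue or from Spark: `Node`, `collect`, and `to_dot` are hypothetical names, `Node` is a stand-in for an RDD, and the stage ids are assigned by hand rather than computed by DAGScheduler.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for an RDD: id, label, stage id, and parent nodes."""
    id: int
    name: str
    stage: int  # assumed to be supplied by the caller, not derived from Spark
    parents: List["Node"] = field(default_factory=list)


def collect(final: Node) -> List[Node]:
    """Follow parent links from the final node; return each reachable node once, by id."""
    seen: Dict[int, Node] = {}
    stack = [final]
    while stack:
        node = stack.pop()
        if node.id in seen:
            continue
        seen[node.id] = node
        stack.extend(node.parents)
    return sorted(seen.values(), key=lambda n: n.id)


def to_dot(final: Node, group_by_stage: bool = False) -> str:
    """Render the lineage as DOT text. Labels are not escaped in this sketch."""
    nodes = collect(final)
    lines = ["digraph lineage {"]
    if group_by_stage:
        # Step 2: one Graphviz cluster subgraph per stage id.
        for stage in sorted({n.stage for n in nodes}):
            lines.append(f"  subgraph cluster_stage_{stage} {{")
            lines.append(f'    label="Stage {stage}";')
            for n in nodes:
                if n.stage == stage:
                    lines.append(f'    n{n.id} [label="{n.name}"];')
            lines.append("  }")
    else:
        # Step 1: one node per RDD, no grouping.
        for n in nodes:
            lines.append(f'  n{n.id} [label="{n.name}"];')
    # One edge from each parent to its child.
    for n in nodes:
        for p in sorted(n.parents, key=lambda x: x.id):
            lines.append(f"  n{p.id} -> n{n.id};")
    lines.append("}")
    return "\n".join(lines)


# Hand-built word-count-style lineage; the stage split before reduceByKey is assumed.
text = Node(0, "textFile", 0)
words = Node(1, "flatMap", 0, [text])
pairs = Node(2, "map", 0, [words])
counts = Node(3, "reduceByKey", 1, [pairs])
print(to_dot(counts, group_by_stage=True))
```

The printed text can be saved to a file and rendered with the Graphviz `dot` command, for example `dot -Tpng lineage.dot -o lineage.png`. Because the sketch only produces text, it could live outside Spark, which is one possible answer to the hosting question raised in the comment.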