[ https://issues.apache.org/jira/browse/SPARK-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339690#comment-14339690 ]
Jeff Zhang edited comment on SPARK-1015 at 2/27/15 4:03 AM:
------------------------------------------------------------

[~sowen] I may not have time for this in the near term.

bq. How would the visualization work with spark-shell? Is this just a utility you can host outside Spark?

I would prefer to use graphviz to visualize the RDD: Spark would only build the dot file and leave the rendering to graphviz. Besides, I think integrating the DAG view into the Spark UI may help users debug an RDD (especially from a performance perspective).

was (Author: zjffdu):
[~sowen] I may not have time for this in the near term.

bq. How would the visualization work with spark-shell? Is this just a utility you can host outside Spark?

I would prefer to use graphviz to visualize the RDD: Spark would only build the dot file and leave the rendering to graphviz.

> Visualize the DAG of RDD
> -------------------------
>
>                 Key: SPARK-1015
>                 URL: https://issues.apache.org/jira/browse/SPARK-1015
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 0.9.0
>            Reporter: Jeff Zhang
>
> The DAG of an RDD can help users understand the data flow and how Spark
> executes the final RDD. It could also help users find opportunities to
> optimize the execution of a complex RDD. I will leverage graphviz to
> visualize the DAG.
> For this task, I plan to split the work into 2 steps.
> Step 1. Just visualize the simple DAG. Each RDD is one node, and there is
> one edge between each parent RDD and its child RDD. (I attach one simple
> graph in the attachments.)
> Step 2. Put the RDDs in the same stage into one subgraph. This may require
> extracting the stage-splitting code from DAGScheduler.
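
The two steps in the issue description could be sketched roughly as follows. This is only an illustration in plain Python, not Spark code and not the proposed patch: `RddNode`, `collect`, and `to_dot` are hypothetical names, and the `stage` field is set by hand here, whereas in Spark the stage boundaries would come from the scheduler (they fall at shuffle dependencies). The output is ordinary DOT text; graphviz draws a subgraph as a boxed cluster when its name starts with `cluster`.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RddNode:
    """Hypothetical stand-in for an RDD: id, label, parent RDDs, stage id."""
    rdd_id: int
    label: str
    parents: List["RddNode"] = field(default_factory=list)
    stage: int = 0  # assigned by hand in this sketch


def collect(root: RddNode) -> List[RddNode]:
    """Walk the lineage from the final RDD back to its sources, once per node."""
    seen: Dict[int, RddNode] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.rdd_id in seen:
            continue
        seen[node.rdd_id] = node
        stack.extend(node.parents)
    return sorted(seen.values(), key=lambda n: n.rdd_id)


def to_dot(root: RddNode, group_by_stage: bool = False) -> str:
    """Build DOT text for the lineage of `root`.

    Labels are written as-is, so they must not contain double quotes.
    """
    nodes = collect(root)
    lines = ["digraph rdd_dag {"]
    if group_by_stage:
        # Step 2: RDDs of the same stage go into one cluster subgraph.
        stages: Dict[int, List[RddNode]] = {}
        for n in nodes:
            stages.setdefault(n.stage, []).append(n)
        for stage_id in sorted(stages):
            lines.append(f"  subgraph cluster_stage_{stage_id} {{")
            lines.append(f'    label="stage {stage_id}";')
            for n in stages[stage_id]:
                lines.append(f'    rdd_{n.rdd_id} [label="{n.label}"];')
            lines.append("  }")
    else:
        # Step 1: every RDD is a plain node.
        for n in nodes:
            lines.append(f'  rdd_{n.rdd_id} [label="{n.label}"];')
    # One edge from each parent RDD to its child RDD.
    for n in nodes:
        for p in sorted(n.parents, key=lambda x: x.rdd_id):
            lines.append(f"  rdd_{p.rdd_id} -> rdd_{n.rdd_id};")
    lines.append("}")
    return "\n".join(lines)


# Example lineage, loosely modelled on a word count.
text = RddNode(0, "textFile")
words = RddNode(1, "flatMap", [text])
pairs = RddNode(2, "map", [words])
counts = RddNode(3, "reduceByKey", [pairs], stage=1)

print(to_dot(counts, group_by_stage=True))
```

Saving the printed text to a file such as `rdd.dot` and running `dot -Tpng rdd.dot -o rdd.png` would render it. Keeping the generator free of any graphviz dependency matches the idea in the comment that Spark only builds the dot file.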