Hi,

I am trying to make sense of the following SQL query in Spark 1.1:

  SELECT table.key, table.value, table2.value
  FROM table2 JOIN table
  WHERE table2.key = table.key

When I look at the output, I see that there are several stages, and several tasks per stage. The tasks have a TID, but I do not see anything similar for a stage. I can see the input splits of the files, and the start/running/finished messages for the tasks.

What I really want to know is this: which map, shuffle and reduce operations are performed, and in which order? Where can I see the actual executed code per task/stage? Seeing the intermediate files/RDDs would be a bonus!
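For context, here is roughly how I run the query and what I have tried so far for inspecting it (a minimal sketch; the SQLContext setup and the table registrations are simplified, and the names are placeholders for my actual code):

  // A minimal sketch of how I run the query (in spark-shell, where 'sc'
  // is the SparkContext the shell provides). The table registrations are
  // simplified; the names stand in for my actual code.
  import org.apache.spark.sql.SQLContext

  val sqlContext = new SQLContext(sc)

  // 'table' and 'table2' are assumed to be registered already, e.g. via
  // someSchemaRdd.registerTempTable("table").

  val result = sqlContext.sql(
    "SELECT table.key, table.value, table2.value " +
    "FROM table2 JOIN table WHERE table2.key = table.key")

  // What I have inspected so far (queryExecution is a developer API, so
  // the exact fields may differ between versions):
  println(result.queryExecution.executedPlan)  // physical query operators
  println(result.toDebugString)                // RDD lineage / shuffle boundaries

  result.collect().foreach(println)            // actually runs the job

These give me the operators and the shuffle boundaries, but I still cannot tell which part of the plan ends up in which stage or task.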
Thanks in advance,
Tom
