Re: questions about debugging a spark application
Good question! I am also new to the JVM and would appreciate some tips.

On Sun, Apr 27, 2014 at 5:19 AM, wxhsdp wxh...@gmail.com wrote:

> Hi all,
>
> I have some questions about debugging in Spark:
>
> 1) When the application finishes, the application UI shuts down, so I can no longer see details about the app such as shuffle size, duration, and stage information. The master UI does not show enough information. Do I need to keep the application hanging?

We added a flag to our application that causes it to sleep indefinitely at the end for exactly this reason. Admittedly, the logs contain everything needed to reconstruct the same data, but the web UI is easier to understand at a glance.

> 2) How can I get details about each task the executor runs, such as memory usage?

I had success using VisualVM on the executor to see details about its memory and CPU use.

> 3) Since I am not familiar with the JVM: do I need to step through the program, or pause it, in order to use JVM utilities like jstack and jmap?
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/questions-about-debugging-a-spark-application-tp4891.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
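For anyone wanting to try the sleep-at-the-end flag described in the reply above, here is a minimal sketch. It assumes a Python driver program; the flag name `--hold-ui`, the helper `hold_ui_open`, and the polling interval are all hypothetical and are not taken from the application mentioned in the reply. The idea is only that the application UI is served by the driver, so it stays reachable for as long as the driver is kept alive before the SparkContext is stopped.

```python
import sys
import time

# Hypothetical flag name chosen for this sketch; it is not a Spark option.
HOLD_FLAG = "--hold-ui"


def hold_ui_open(argv, sleep=time.sleep, poll_seconds=60):
    """Block until interrupted if HOLD_FLAG is present in argv.

    Returns True if we waited (and were interrupted with Ctrl-C),
    False if the flag was absent and we returned immediately.
    """
    if HOLD_FLAG not in argv:
        return False
    print("Jobs finished; keeping the driver alive so the web UI stays up.")
    print("Press Ctrl-C to exit.")
    try:
        while True:
            sleep(poll_seconds)
    except KeyboardInterrupt:
        return True


# Intended use at the very end of a driver program, after the last job
# and before the SparkContext is stopped (`sc` is assumed to exist there):
#
#     hold_ui_open(sys.argv)
#     sc.stop()
```

The `sleep` parameter exists only so the loop can be exercised without actually waiting; in normal use the default `time.sleep` is fine.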
Re: questions about debugging a spark application
Thanks for your reply, Daniel.

What do you mean by "the logs contain everything to reconstruct the same data"? I have also spent time looking into the logs, but only got a little out of them. As far as I can see, they record the flow of running the application, but there are no further details about each task. For example, see the following logs:

    14/04/28 16:36:16.740 INFO CoarseGrainedExecutorBackend: Got assigned task 70
    14/04/28 16:36:16.740 INFO Executor: Running task ID 70
    14/04/28 16:36:16.742 INFO BlockManager: Found block broadcast_0 locally
    14/04/28 16:36:16.747 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 49 non-zero-bytes blocks out of 49 blocks
    14/04/28 16:36:16.747 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote gets in 0 ms
    14/04/28 16:36:16.821 INFO Executor: Serialized size of result for 70 is 1449738
    14/04/28 16:36:16.821 INFO Executor: Sending result for 70 directly to driver
    14/04/28 16:36:16.825 INFO Executor: Finished task ID 70

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/questions-about-debugging-a-spark-application-tp4891p4994.html
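As a rough illustration of what can be recovered from executor log lines like the ones quoted above, the sketch below pairs each "Running task ID" line with its "Finished task ID" line to get a per-task duration, and picks up the serialized result size. It relies only on the three `Executor` message formats that appear in the quoted logs, and it assumes the timestamp is year/month/day followed by a time with fractional seconds; the function name `summarize_tasks` is made up for this example. It does not recover shuffle sizes or memory usage, since those do not appear in these lines.

```python
import re
from datetime import datetime, timedelta

# Timestamp + "INFO Executor:" prefix, as seen in the quoted log lines.
# Assumption: the date part is yy/mm/dd.
LINE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{1,6}) INFO Executor: (.*)$"
)
EVENTS = {
    "start": re.compile(r"^Running task ID (\d+)$"),
    "end": re.compile(r"^Finished task ID (\d+)$"),
}
RESULT_SIZE = re.compile(r"^Serialized size of result for (\d+) is (\d+)$")


def summarize_tasks(lines):
    """Return {task_id: {"duration_ms": int, "result_bytes": int}}.

    A key is omitted for a task when the matching log lines are missing.
    """
    raw = {}
    for line in lines:
        m = LINE.match(line.strip())
        if m is None:
            continue  # not an Executor line, or an unexpected format
        when = datetime.strptime(m.group(1), "%y/%m/%d %H:%M:%S.%f")
        message = m.group(2)

        size = RESULT_SIZE.match(message)
        if size:
            task = raw.setdefault(int(size.group(1)), {})
            task["result_bytes"] = int(size.group(2))
            continue

        for key, pattern in EVENTS.items():
            hit = pattern.match(message)
            if hit:
                raw.setdefault(int(hit.group(1)), {})[key] = when

    summary = {}
    for task_id, info in raw.items():
        entry = {}
        if "start" in info and "end" in info:
            # Integer milliseconds between the start and end lines.
            elapsed = info["end"] - info["start"]
            entry["duration_ms"] = elapsed // timedelta(milliseconds=1)
        if "result_bytes" in info:
            entry["result_bytes"] = info["result_bytes"]
        summary[task_id] = entry
    return summary


# The log lines quoted in the message above.
SAMPLE = [
    "14/04/28 16:36:16.740 INFO CoarseGrainedExecutorBackend: Got assigned task 70",
    "14/04/28 16:36:16.740 INFO Executor: Running task ID 70",
    "14/04/28 16:36:16.742 INFO BlockManager: Found block broadcast_0 locally",
    "14/04/28 16:36:16.821 INFO Executor: Serialized size of result for 70 is 1449738",
    "14/04/28 16:36:16.821 INFO Executor: Sending result for 70 directly to driver",
    "14/04/28 16:36:16.825 INFO Executor: Finished task ID 70",
]
```

Calling `summarize_tasks(SAMPLE)` reports task 70 as taking 85 ms with a 1449738-byte result, which is about as much per-task detail as these particular lines carry.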