Re: questions about debugging a spark application

2014-04-28 Thread Daniel Darabos
Good question! I am also new to the JVM and would appreciate some tips.

On Sun, Apr 27, 2014 at 5:19 AM, wxhsdp wxh...@gmail.com wrote:

 Hi all,
   I have some questions about debugging in Spark:
   1) When the application finishes, the application UI shuts down, so I can
 no longer see details about the app, such as shuffle size, duration, and
 stage information. The master UI does not show enough information.
   Do I need to keep the application running?


We added a flag to our application that makes it sleep indefinitely at the
end, for exactly this reason. Admittedly, the logs contain everything needed
to reconstruct the same data, but the web UI is easier to understand at a
glance.
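
A minimal sketch of such a flag, in Python for illustration. This is not the
code from the application described above: the flag name `--keep-ui-alive`
and the `run_job` placeholder are hypothetical, and the only idea taken from
the reply is "sleep at the end so the driver, and with it the application UI,
stays up".

```python
import argparse
import time


def parse_args(argv=None):
    """Parse the command line; --keep-ui-alive is a made-up flag name."""
    parser = argparse.ArgumentParser(description="Hypothetical driver wrapper")
    parser.add_argument(
        "--keep-ui-alive",
        action="store_true",
        help="after the job finishes, sleep until interrupted",
    )
    return parser.parse_args(argv)


def run_job():
    """Placeholder for the real Spark job (hypothetical)."""
    return "done"


def main(argv=None, sleep=time.sleep):
    args = parse_args(argv)
    result = run_job()
    if args.keep_ui_alive:
        # Keep the driver process alive so its web UI can still be browsed.
        # In a real driver this would come before the context is stopped.
        print("Job finished; sleeping to keep the UI up. Ctrl-C to exit.")
        try:
            while True:
                sleep(60)
        except KeyboardInterrupt:
            pass
    return result
```

Calling `main(["--keep-ui-alive"])` runs the placeholder job and then blocks
until Ctrl-C; without the flag it returns as soon as the job is done.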

  2) How can I get details about each task that an executor runs, such as
 memory usage?


I had success with using VisualVM on the executor to see details about its
memory and CPU use.
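
If VisualVM runs on a different machine than the executor, the executor JVM
usually has to expose a JMX port first. The reply above does not say how
VisualVM was attached, so treat the following as one possible setup and not
as what was actually done. The helper name `jmx_java_options` and port 9999
are made up; the `-Dcom.sun.management.jmxremote...` system properties are the
standard JVM ones. Authentication and SSL are switched off here, which is
only acceptable on a trusted network.

```python
def jmx_java_options(port):
    """Build JVM flags that open an unauthenticated JMX port.

    Hypothetical helper: it only assembles the option string.
    """
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %r" % (port,))
    flags = [
        "-Dcom.sun.management.jmxremote",
        "-Dcom.sun.management.jmxremote.port=%d" % port,
        "-Dcom.sun.management.jmxremote.authenticate=false",
        "-Dcom.sun.management.jmxremote.ssl=false",
    ]
    return " ".join(flags)


# Port 9999 is an arbitrary example value.
print(jmx_java_options(9999))
```

How the resulting string reaches the executor JVM depends on the Spark
version; in versions that support it, the `spark.executor.extraJavaOptions`
setting is one place for it. A fixed port will clash if several executors
start on the same host.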

  3) I am not familiar with the JVM. Do I need to step through the program
 or pause it in order to use JVM utilities like jstack and jmap?



 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/questions-about-debugging-a-spark-application-tp4891.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.



Re: questions about debugging a spark application

2014-04-28 Thread wxhsdp
Thanks for your reply, Daniel.
What do you mean when you say that the logs contain everything needed to
reconstruct the same data?

I have also looked through the logs, but got little out of them.
As far as I can see, they record the flow of the application run, but give no
further details about each task. For example, see the following log:

14/04/28 16:36:16.740 INFO CoarseGrainedExecutorBackend: Got assigned task 70
14/04/28 16:36:16.740 INFO Executor: Running task ID 70
14/04/28 16:36:16.742 INFO BlockManager: Found block broadcast_0 locally
14/04/28 16:36:16.747 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 49 non-zero-bytes blocks out of 49 blocks
14/04/28 16:36:16.747 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote gets in  0 ms
14/04/28 16:36:16.821 INFO Executor: Serialized size of result for 70 is 1449738
14/04/28 16:36:16.821 INFO Executor: Sending result for 70 directly to driver
14/04/28 16:36:16.825 INFO Executor: Finished task ID 70
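
The executor lines above do carry timestamps, so at least the wall-clock time
of each task can be recovered from them. A rough sketch in Python, assuming
each log record sits on a single line and uses the timestamp format shown
above (a different log4j pattern would need a different regular expression).
The function name `task_durations_ms` is made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Matches only the "Running task ID N" / "Finished task ID N" executor lines.
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) INFO Executor: "
    r"(?P<event>Running|Finished) task ID (?P<task>\d+)$"
)


def task_durations_ms(lines):
    """Return {task_id: milliseconds between 'Running' and 'Finished'}."""
    started = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a task start/finish line
        ts = datetime.strptime(match.group("ts"), "%y/%m/%d %H:%M:%S.%f")
        task = int(match.group("task"))
        if match.group("event") == "Running":
            started[task] = ts
        elif task in started:
            elapsed = ts - started.pop(task)
            durations[task] = elapsed // timedelta(milliseconds=1)
    return durations


sample = [
    "14/04/28 16:36:16.740 INFO Executor: Running task ID 70",
    "14/04/28 16:36:16.742 INFO BlockManager: Found block broadcast_0 locally",
    "14/04/28 16:36:16.825 INFO Executor: Finished task ID 70",
]
print(task_durations_ms(sample))  # {70: 85}
```

This only covers task duration. Figures such as shuffle size would have to
come from other log lines, if they are logged at all.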




