Hi Prabhu, thanks for the response. I did the same. The problem is that when I get the process ID using jps or ps -ef, the first column shows a number instead of a user name, so I can't run jstack on the process because of a permission issue. The ps output looks something like the following:
7000028852 3553 9833 0 04:30 ? 00:00:00 /bin/bash blabal

On Jan 12, 2016 12:02, "Prabhu Joseph" <prabhujose.ga...@gmail.com> wrote:

> Umesh,
>
> Running task is a thread within the executor process. We need to take
> stack trace for the executor process. The executor will be running in any
> NodeManager machine as a container.
>
> YARN RM UI running jobs will have the host details where executor is
> running. Login to that NodeManager machine and jps -l will list all java
> processes, jstack -l <pid> will give the stack trace.
>
> Thanks,
> Prabhu Joseph
>
> On Mon, Jan 11, 2016 at 7:56 PM, Umesh Kacha <umesh.ka...@gmail.com>
> wrote:
>
>> Hi Prabhu thanks for the response. How do I find pid of a slow running
>> task. Task is running in yarn cluster node. When I try to see pid of a
>> running task using my user I see some 7-8 digit number instead of user
>> running process any idea why spark creates this number instead of
>> displaying user
>>
>> On Jan 3, 2016 6:06 AM, "Prabhu Joseph" <prabhujose.ga...@gmail.com>
>> wrote:
>>
>>> The attached image just has thread states, and WAITING threads need not
>>> be the issue. We need to take thread stack traces and identify at which
>>> area of code, threads are spending lot of time.
>>>
>>> Use jstack -l <pid> or kill -3 <pid>, where pid is the process id of the
>>> executor process. Take jstack stack trace for every 2 seconds and total 1
>>> minute. This will help to identify the code where threads are spending lot
>>> of time and then try to tune.
>>>
>>> Thanks,
>>> Prabhu Joseph
>>>
>>> On Sat, Jan 2, 2016 at 1:28 PM, Umesh Kacha <umesh.ka...@gmail.com>
>>> wrote:
>>>
>>>> Hi thanks I did that and I have attached thread dump images. That was
>>>> the intention of my question asking for help to identify which waiting
>>>> thread is culprit.
>>>>
>>>> Regards,
>>>> Umesh
>>>>
>>>> On Sat, Jan 2, 2016 at 8:38 AM, Prabhu Joseph <
>>>> prabhujose.ga...@gmail.com> wrote:
>>>>
>>>>> Take thread dump of Executor process several times in a short time
>>>>> period and check what each threads are doing at different times which will
>>>>> help to identify the expensive sections in user code.
>>>>>
>>>>> Thanks,
>>>>> Prabhu Joseph
>>>>>
>>>>> On Sat, Jan 2, 2016 at 3:28 AM, unk1102 <umesh.ka...@gmail.com> wrote:
>>>>>
>>>>>> Sorry please see attached waiting thread log
>>>>>>
>>>>>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n25851/Screen_Shot_2016-01-02_at_2.jpg>
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-find-cause-waiting-threads-etc-of-hanging-job-for-7-hours-tp25850p25851.html
>>>>>> Sent from the Apache Spark User List mailing list archive at
>>>>>> Nabble.com.
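
A rough sketch of the two steps discussed in this thread: taking a jstack dump every 2 seconds for 1 minute, and running jstack when ps shows a numeric owner instead of a user name. It is not a tested recipe for any particular cluster. It assumes that jstack has to run as the same user that owns the executor JVM, that you are allowed to use sudo to become that user, and that sudo accepts a numeric uid written as '#<uid>'. The function names (parse_ps_line, jstack_command, collect_dumps) are made up for illustration. Note that the pid passed in should be the executor JVM listed by jps -l, not a wrapper such as the /bin/bash row shown above.

```python
import subprocess
import time


def parse_ps_line(line):
    """Split one `ps -ef` row into (owner, pid, command).

    Columns are UID PID PPID C STIME TTY TIME CMD. The owner column may be
    a user name or a bare number, as in the output quoted in this thread.
    """
    fields = line.split(None, 7)
    if len(fields) < 8:
        raise ValueError("not a ps -ef row: %r" % line)
    return fields[0], int(fields[1]), fields[7]


def jstack_command(owner, pid):
    """Build a jstack call that runs as the process owner through sudo.

    Assumption: sudo is available and the caller may run commands as
    `owner`. A numeric owner is passed to sudo as '#<uid>'.
    """
    run_as = "#" + owner if owner.isdigit() else owner
    return ["sudo", "-u", run_as, "jstack", "-l", str(pid)]


def collect_dumps(owner, pid, interval_s=2, total_s=60):
    """Take one thread dump every interval_s seconds for total_s seconds."""
    dumps = []
    for _ in range(total_s // interval_s):
        result = subprocess.run(
            jstack_command(owner, pid), capture_output=True, text=True
        )
        dumps.append(result.stdout)
        time.sleep(interval_s)
    return dumps
```

With the defaults, collect_dumps(owner, pid) returns 30 dumps. Comparing which stack frames keep reappearing in the same threads across those dumps is what points to the code where the time is being spent.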