[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated MAPREDUCE-5044:
----------------------------------
    Attachment: MAPREDUCE-5044.012.patch

The new test case ({{TestMRJobs#testThreadDumpOnTaskTimeout}}), when run with 
{{TestUberAM}}, detected that a task timeout did not cause a thread dump within 
an uber AM. So, in the latest patch ({{MAPREDUCE-5044.012.patch}}), I added code 
to {{LocalContainerLauncher}} to handle the timeout event.

Instead of having the uber AM connect to the NM, which would then send the QUIT 
signal back to the uber AM, I chose to dump the stack directly from the uber 
AM. I chose to use {{ThreadMXBean#dumpAllThreads}} even though there was 
already a Hadoop {{ReflectionUtils#printThreadInfo}} method that would create 
a dump. The reason is that the output of {{ThreadMXBean#dumpAllThreads}} 
much more closely resembles the standard thread stack dump than does the output 
of {{ReflectionUtils#printThreadInfo}}.
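
For illustration, here is a minimal standalone sketch of dumping all threads of 
the current JVM with {{ThreadMXBean#dumpAllThreads}}. It is not the code from 
the patch; the class name {{ThreadDumpSketch}} and the method 
{{buildThreadDump}} are hypothetical, and the output layout only approximates 
what jstack prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of the patch: dumps every live thread of the
// JVM it runs in, without sending SIGQUIT to the process.
public class ThreadDumpSketch {

    /** Builds a jstack-like text dump of all live threads in this JVM. */
    static String buildThreadDump() {
        ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();

        // Request lock details only when the JVM supports them; otherwise
        // dumpAllThreads throws UnsupportedOperationException.
        boolean lockedMonitors = threadMXBean.isObjectMonitorUsageSupported();
        boolean lockedSynchronizers = threadMXBean.isSynchronizerUsageSupported();

        StringBuilder dump = new StringBuilder();
        for (ThreadInfo info
                : threadMXBean.dumpAllThreads(lockedMonitors, lockedSynchronizers)) {
            dump.append('"').append(info.getThreadName()).append('"')
                .append(" Id=").append(info.getThreadId())
                .append(' ').append(info.getThreadState()).append('\n');
            // ThreadInfo#toString truncates deep stacks, so the frames are
            // written out one by one instead.
            for (StackTraceElement frame : info.getStackTrace()) {
                dump.append("\tat ").append(frame).append('\n');
            }
            dump.append('\n');
        }
        return dump.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildThreadDump());
    }
}
```

Because the dump is produced in-process, this approach needs no round trip 
through the NM, which is the reason it suits the uber AM case described above.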

> Have AM trigger jstack on task attempts that timeout before killing them
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5044
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am
>    Affects Versions: 2.1.0-beta
>            Reporter: Jason Lowe
>            Assignee: Eric Payne
>         Attachments: MAPREDUCE-5044.008.patch, MAPREDUCE-5044.009.patch, 
> MAPREDUCE-5044.010.patch, MAPREDUCE-5044.011.patch, MAPREDUCE-5044.012.patch, 
> MAPREDUCE-5044.v01.patch, MAPREDUCE-5044.v02.patch, MAPREDUCE-5044.v03.patch, 
> MAPREDUCE-5044.v04.patch, MAPREDUCE-5044.v05.patch, MAPREDUCE-5044.v06.patch, 
> MAPREDUCE-5044.v07.local.patch, Screen Shot 2013-11-12 at 1.05.32 PM.png, 
> Screen Shot 2013-11-12 at 1.06.04 PM.png
>
>
> When an AM expires a task attempt it would be nice if it triggered a jstack 
> output via SIGQUIT before killing the task attempt.  This would be invaluable 
> for helping users debug their hung tasks, especially if they do not have 
> shell access to the nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
