[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589925#comment-14589925
 ] 

Hari Sekhon commented on MAPREDUCE-6401:
----------------------------------------

Actually the task logs showed the same thing, not much to go on:
{code}
Exception from container-launch.
Container id: container_e199_1434474871820_0001_02_000019
Exit code: 7
Stack trace: ExitCodeException exitCode=7:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:293)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Shell output: main : command provided 1
main : user is <custom_scrubbed>
main : requested yarn user is <custom_scrubbed>

Container exited with a non-zero exit code 7
{code}
but the full task logs don't seem to have been retained by the history server. 
This made me suspicious, so I reset the logging locations to try to get my 
hands on the full logs. After a YARN restart, jobs started working normally 
again, without failed tasks or container launches. I'm very certain that the 
cluster used to log to the dir I reset it to, so perhaps Ambari had a bug that 
lost the location and reset it to debug locations that didn't work properly 
(it wouldn't be the first time, e.g. AMBARI-9022).

I think we should leave this as a minor todo to improve debugging information, 
especially when launching shell commands and encountering non-zero exit codes. 
Logging is king.
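
To illustrate the kind of reporting I mean, here is a minimal sketch in plain 
Java. It is not Hadoop code: VerboseShell and CommandFailedException are 
hypothetical names made up for this example, and it assumes Java 9+ for 
InputStream.readAllBytes(). The idea is only that the exception raised on a 
non-zero exit code should carry the command and whatever it printed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper (not part of Hadoop): runs a command and keeps its output for error reports. */
public class VerboseShell {

    /** Carries the command, the exit code, and the captured output. */
    public static class CommandFailedException extends IOException {
        public final int exitCode;
        public final String output;

        public CommandFailedException(List<String> command, int exitCode, String output) {
            super("Command " + command + " exited with code " + exitCode
                    + (output.isEmpty() ? " and produced no output" : "; output:\n" + output));
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    /** Returns the combined stdout/stderr, or throws with full context on a non-zero exit code. */
    public static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout so error text is not lost
        Process process = builder.start();
        // Drain the output before waiting, so the child cannot block on a full pipe.
        String output = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new CommandFailedException(command, exitCode, output);
        }
        return output;
    }
}
```

With something along these lines, a failed launch would report the command 
and its stderr text alongside the exit code, rather than the exit code alone.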

> Container-launch failure gives no debugging output
> --------------------------------------------------
>
>                 Key: MAPREDUCE-6401
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6401
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 2.6.0
>         Environment: HDP 2.2
>            Reporter: Hari Sekhon
>         Attachments: job.log
>
>
> MR jobs are failing on my cluster with Stack trace: ExitCodeException 
> exitCode=7 but little else in terms of debugging information. Can we please 
> improve the debugging info? Log file is attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
