[ https://issues.apache.org/jira/browse/YARN-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472777#comment-16472777 ]
Jason Lowe commented on YARN-8274:
----------------------------------

bq. It would be nice if the code was refactored to add docker_binary in construct_docker_command to avoid duplicated add_to_args for docker_binary for all get_docker_*_command, but the priority is to get a good stable state for release.

I was thinking the exact same thing as I was writing the patch. I went for the simple approach to keep the patch small and easy to review since it's a bugfix. I filed YARN-8284 to track that.

> Docker command error during container relaunch
> ----------------------------------------------
>
>                 Key: YARN-8274
>                 URL: https://issues.apache.org/jira/browse/YARN-8274
>             Project: Hadoop YARN
>          Issue Type: Task
>            Reporter: Billie Rinaldi
>            Assignee: Jason Lowe
>            Priority: Critical
>             Fix For: 3.2.0, 3.1.1
>
>         Attachments: YARN-8274.001.patch, YARN-8274.002.patch
>
>
> I initiated container relaunch with a "sleep 60; exit 1" launch command and
> saw a "not a docker command" error on relaunch. Haven't figured out why this
> is happening, but it seems like it has been introduced recently to
> trunk/branch-3.1.
> cc [~shaneku...@gmail.com] [~ebadger]
> {noformat}
> org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Relaunch container failed
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.relaunchContainer(DockerLinuxContainerRuntime.java:954)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.relaunchContainer(DelegatingLinuxContainerRuntime.java:150)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.handleLaunchForLaunchType(LinuxContainerExecutor.java:562)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.relaunchContainer(LinuxContainerExecutor.java:486)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.relaunchContainer(ContainerLaunch.java:504)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerRelaunch.call(ContainerRelaunch.java:111)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerRelaunch.call(ContainerRelaunch.java:47)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch.
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_1525897486447_0003_01_000002
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 7
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception message: Relaunch container failed
> 2018-05-09 21:41:46,631 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Shell error output: docker: 'container_1525897486447_0003_01_000002' is not a docker command.
> {noformat}
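
For reference, the refactoring suggested in the quoted review comment (add docker_binary once in construct_docker_command rather than in every get_docker_*_command) could look roughly like the sketch below. This is only an illustration of the idea and not the Hadoop container-executor code: struct arg_list, append_arg, build_start_args, build_rm_args and build_docker_command are hypothetical names invented here, and the real signatures in the native code may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 16

/* Hypothetical argument vector; NOT the struct used by Hadoop. */
struct arg_list {
    const char *argv[MAX_ARGS + 1]; /* one extra slot for the NULL terminator */
    size_t length;
};

/* Append one argument. Returns 0 on success, -1 if the list is full. */
int append_arg(struct arg_list *args, const char *value) {
    if (args->length >= MAX_ARGS) {
        return -1;
    }
    args->argv[args->length++] = value;
    args->argv[args->length] = NULL; /* keep the vector NULL-terminated */
    return 0;
}

/* Per-command builders add only their own subcommand and operands. */
int build_start_args(struct arg_list *args, const char *container) {
    if (append_arg(args, "start") != 0) {
        return -1;
    }
    return append_arg(args, container);
}

int build_rm_args(struct arg_list *args, const char *container) {
    if (append_arg(args, "rm") != 0) {
        return -1;
    }
    return append_arg(args, container);
}

typedef int (*builder_fn)(struct arg_list *, const char *);

/*
 * Single entry point (hypothetical stand-in for the dispatcher role):
 * the docker binary is added exactly once here, so no per-command
 * builder can forget it.
 */
int build_docker_command(struct arg_list *args, const char *docker_binary,
                         builder_fn build, const char *container) {
    assert(args != NULL && docker_binary != NULL && build != NULL);
    args->length = 0;
    args->argv[0] = NULL;
    if (append_arg(args, docker_binary) != 0) {
        return -1;
    }
    return build(args, container);
}
```

With this shape, a call such as build_docker_command(&args, "/usr/bin/docker", build_start_args, name) yields a vector whose first element is the binary path, followed by the subcommand and then the container name. The binary path used here is an assumption for the example only.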