[ https://issues.apache.org/jira/browse/YARN-8194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eric Yang updated YARN-8194:
----------------------------
    Fix Version/s:     (was: 3.1.1)
                       3.2.0

> Exception when reinitializing a container using LinuxContainerExecutor
> ----------------------------------------------------------------------
>
>                 Key: YARN-8194
>                 URL: https://issues.apache.org/jira/browse/YARN-8194
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Chandni Singh
>            Assignee: Chandni Singh
>            Priority: Blocker
>             Fix For: 3.2.0
>
>         Attachments: YARN-8194.001.patch
>
>
> When a component instance is upgraded and the container executor is set to
> {{org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor}}, then
> the following exception is seen in the nodemanager:
> {code}
> Writing to cgroup task files...
> Creating local dirs...
> Can't open /tmp/hadoop-yarn/nm-local-dir/usercache/hbase/appcache/application_1524242413029_0001/container_1524242413029_0001_01_000002/launch_container.sh for output - File exists
> Getting exit code file...
> Creating script paths...
> Full command array for failed execution:
> [/usr/local/hadoop-3.2.0-SNAPSHOT/bin/container-executor, hbase, hbase, 1,
> application_1524242413029_0001, container_1524242413029_0001_01_000002,
> /tmp/hadoop-yarn/nm-local-dir/usercache/hbase/appcache/application_1524242413029_0001/container_1524242413029_0001_01_000002,
> /tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1524242413029_0001/container_1524242413029_0001_01_000002/launch_container.sh,
> /tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1524242413029_0001/container_1524242413029_0001_01_000002/container_1524242413029_0001_01_000002.tokens,
> /tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1524242413029_0001/container_1524242413029_0001_01_000002/container_1524242413029_0001_01_000002.pid,
> /tmp/hadoop-yarn/nm-local-dir,
> /usr/local/hadoop-3.2.0-SNAPSHOT/logs/userlogs, cgroups=none]
> 2018-04-20 16:50:16,641 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: Launch container failed. Exception:
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=33: Could not create copy file 3 /tmp/hadoop-yarn/nm-local-dir/usercache/hbase/appcache/application_1524242413029_0001/container_1524242413029_0001_01_000002/launch_container.sh
> Could not create local files and directories
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:180)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:118)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:141)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.handleLaunchForLaunchType(LinuxContainerExecutor.java:562)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:477)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:492)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:304)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:101)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: ExitCodeException exitCode=33: Could not create copy file 3 /tmp/hadoop-yarn/nm-local-dir/usercache/hbase/appcache/application_1524242413029_0001/container_1524242413029_0001_01_000002/launch_container.sh
> Could not create local files and directories
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:1009)
>     at org.apache.hadoop.util.Shell.run(Shell.java:902)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:152)
>     ... 11 more
> 2018-04-20 16:50:16,642 WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exit code from container container_1524242413029_0001_01_000002 is : 33
> 2018-04-20 16:50:16,642 WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exception from container-launch with container ID: container_1524242413029_0001_01_000002 and exit code: 33
> org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException: Launch container failed
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:124)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:141)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.handleLaunchForLaunchType(LinuxContainerExecutor.java:562)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:477)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:492)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:304)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:101)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2018-04-20 16:50:16,643 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch.
> 2018-04-20 16:50:16,643 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_1524242413029_0001_01_000002
> 2018-04-20 16:50:16,643 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 33
> 2018-04-20 16:50:16,643 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception message: Launch container failed
> 2018-04-20 16:50:16,643 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Shell error output: Could not create copy file 3 /tmp/hadoop-yarn/nm-local-dir/usercache/hbase/appcache/application_1524242413029_0001/container_1524242413029_0001_01_000002/launch_container.sh
> {code}
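
The "Can't open ... launch_container.sh for output - File exists" line in the log above is consistent with the launch script being copied into the container work directory with an exclusive-create open, which would fail once a previous launch has already left launch_container.sh there. The Python sketch below only illustrates that failure mode and one possible way around it. The helper names (copy_exclusive, copy_replacing, simulate_reinit), the temporary paths, and the "remove the stale copy first" approach are assumptions for illustration; this is not the container-executor source and not necessarily what YARN-8194.001.patch does.

```python
import errno
import os
import tempfile


def copy_exclusive(src: str, dest: str, mode: int = 0o700) -> None:
    """Copy src to dest, refusing to touch an existing dest (O_CREAT | O_EXCL)."""
    fd = os.open(dest, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
    with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
        out.write(inp.read())


def copy_replacing(src: str, dest: str, mode: int = 0o700) -> None:
    """Hypothetical reinit-safe variant: drop a stale dest, then create it exclusively."""
    try:
        os.unlink(dest)
    except FileNotFoundError:
        pass  # first launch: nothing to clean up
    copy_exclusive(src, dest, mode)


def simulate_reinit():
    """Launch once, then relaunch in the same work dir, as a reinitialization would."""
    with tempfile.TemporaryDirectory() as work_dir:
        src = os.path.join(work_dir, "private_launch_container.sh")  # stand-in for the nmPrivate copy
        dest = os.path.join(work_dir, "launch_container.sh")         # stand-in for the work-dir copy

        with open(src, "w") as f:
            f.write("#!/bin/bash\necho original\n")
        copy_exclusive(src, dest)  # initial launch succeeds

        with open(src, "w") as f:
            f.write("#!/bin/bash\necho upgraded\n")

        second_launch_errno = None
        try:
            copy_exclusive(src, dest)  # relaunch hits the leftover script
        except FileExistsError as exc:
            second_launch_errno = exc.errno

        copy_replacing(src, dest)  # relaunch after removing the stale copy
        with open(dest) as f:
            return second_launch_errno, f.read()


if __name__ == "__main__":
    err, script = simulate_reinit()
    print("second launch errno is EEXIST:", err == errno.EEXIST)
    print(script, end="")
```

In this sketch the second exclusive copy fails with EEXIST, which is the errno behind the "File exists" text, while the variant that removes the stale file first ends up with the upgraded script in place.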