[ https://issues.apache.org/jira/browse/YARN-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374436#comment-15374436 ]
Zhankun Tang commented on YARN-5360:
------------------------------------

The current LCE (DockerLinuxContainerRuntime) mounts /etc/passwd into the container, but this approach does not work in my testing. It is also invasive and can lead to user confusion and frustration. Here is the error log from my single-node cluster:
{panel}
2016-07-13 21:55:34,870 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 7. Privileged Execution Operation Output:
main : command provided 4
main : run as user is yarn
main : requested yarn user is yarn
Creating script paths...
Creating local dirs...
Getting exit code file...
Changing effective user to root...
Launching docker container...
Full command array for failed execution:
/home/yarn/code/apache_hadoop/hadoop/hadoop-dist/target/hadoop-2.8.0-SNAPSHOT/bin/container-executor, yarn, yarn, 4, application_1468316940186_0004, container_1468316940186_0004_01_000002, /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1468316940186_0004/container_1468316940186_0004_01_000002, /tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1468316940186_0004/container_1468316940186_0004_01_000002/launch_container.sh, /tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1468316940186_0004/container_1468316940186_0004_01_000002/container_1468316940186_0004_01_000002.tokens, /tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1468316940186_0004/container_1468316940186_0004_01_000002/container_1468316940186_0004_01_000002.pid, /tmp/hadoop-yarn/nm-local-dir, /home/yarn/code/apache_hadoop/hadoop/hadoop-dist/target/hadoop-2.8.0-SNAPSHOT/logs/userlogs, /tmp/hadoop-yarn/nm-docker-cmds/docker.container_1468316940186_0004_01_0000022899997369798306591.cmd, cgroups=none
2016-07-13 21:55:34,870 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime: Launch container failed.
Exception: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=7: docker: Error response from daemon: Cannot start container c75c395c6ed228144a383e9030d43915e263c7ca6d512e4a7cd25d5fbeffae0a: 9 System error: Unable to find user yarn.
Could not invoke docker /usr/bin/docker run --name=container_1468316940186_0004_01_000002 --user=yarn -d --workdir=/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1468316940186_0004/container_1468316940186_0004_01_000002 --net=host --cap-drop=ALL --cap-add=AUDIT_WRITE --cap-add=NET_RAW --cap-add=SETGID --cap-add=SETUID --cap-add=NET_BIND_SERVICE --cap-add=SETFCAP --cap-add=FSETID --cap-add=SETPCAP --cap-add=SYS_CHROOT --cap-add=CHOWN --cap-add=FOWNER --cap-add=MKNOD --cap-add=KILL --cap-add=DAC_OVERRIDE -v /etc/passwd:/etc/password:ro -v /tmp/hadoop-yarn/nm-local-dir:/tmp/hadoop-yarn/nm-local-dir -v /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1468316940186_0004/container_1468316940186_0004_01_000002:/tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1468316940186_0004/container_1468316940186_0004_01_000002 -v /home/yarn/code/apache_hadoop/hadoop/hadoop-dist/target/hadoop-2.8.0-SNAPSHOT/logs/userlogs:/home/yarn/code/apache_hadoop/hadoop/hadoop-dist/target/hadoop-2.8.0-SNAPSHOT/logs/userlogs centos bash /tmp/hadoop-yarn/nm-local-dir/usercache/yarn/appcache/application_1468316940186_0004/container_1468316940186_0004_01_000002/launch_container.sh.
{panel}
I copied the command and added options to mount "/etc/group" and "/etc/shadow" as well, but it still does not work.
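
For comparison, here is a minimal Java sketch of the approach this issue proposes: resolve the numeric UID on the host and pass it to "--user=" instead of the user name. The class and method names (DockerUserOption, lookupUid, userOption) are hypothetical and are not part of DockerLinuxContainerRuntime. Shelling out to "id -u" is only one assumed way to obtain the UID.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an existing YARN class.
final class DockerUserOption {

  /** Runs "id -u <userName>" on the host and returns the numeric UID as text. */
  static String lookupUid(String userName) throws IOException, InterruptedException {
    Process p = new ProcessBuilder("id", "-u", userName)
        .redirectErrorStream(true)   // fold stderr into stdout for the error message
        .start();
    String firstLine;
    try (BufferedReader r = new BufferedReader(
        new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
      firstLine = r.readLine();
    }
    if (p.waitFor() != 0 || firstLine == null) {
      throw new IOException("Cannot resolve UID for user " + userName + ": " + firstLine);
    }
    return firstLine.trim();
  }

  /** Builds the "--user=" option from a numeric UID and rejects anything else. */
  static String userOption(String uid) {
    if (uid == null || !uid.matches("\\d+")) {
      throw new IllegalArgumentException("Expected a numeric UID but got: " + uid);
    }
    return "--user=" + uid;
  }

  public static void main(String[] args) throws Exception {
    // Defaults to the current OS user when no user name is given.
    String user = args.length > 0 ? args[0] : System.getProperty("user.name");
    System.out.println(userOption(lookupUid(user)));
  }
}
```

In the failing command above, the "--user=yarn" argument would then become something like "--user=1001", assuming yarn has UID 1001 on the host.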
> Use UID instead of user name to build the Docker run command
> ------------------------------------------------------------
>
> Key: YARN-5360
> URL: https://issues.apache.org/jira/browse/YARN-5360
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: Zhankun Tang
> Assignee: Zhankun Tang
>
> LCE currently has *a dependency between the job-submitting user and the
> user in the Docker image*. For instance, in order to run the Docker
> container as the yarn user, we can set
> "yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user" to yarn
> and leave
> "yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users"
> at its default (true). LCE will then choose yarn (UID might be 1001) as the
> user that runs jobs.
> But because LCE mounts the generated launch_container.sh (owned by the
> running job user) into the Docker container and internally uses the
> "docker run --user=<run_as_user>" option to execute it, we also need to
> create a user in the Docker image with the *same user name* and the
> *same UID* as the running job user. Otherwise LCE fails to launch the
> container or reports that it is unable to find the user. This burdens the
> Docker image creator with a YARN dependency.
> Luckily this can be solved through Docker. As far as I know, since Docker
> v1.8 (or maybe earlier), the "--user=" option of the Docker run command
> accepts a UID, and *when a UID is passed, the user does not have to exist
> in the container*. So we should use the UID instead of the user name to
> construct the Docker run command, which eliminates the need to create the
> same user in the Docker image. This enables LCE to launch any Docker
> container safely regardless of which users exist in it.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)