Hello Ahmad,

The image being used is treated as untrusted (not privileged) based on the settings in container-executor.cfg. There you have set docker.privileged-containers.registries=local, but the image name variable in the job is "hadoop-ubuntu", which Docker resolves to "hadoop-ubuntu:latest" and which has no registry prefix. Based on that setting, YARN expects the image to be in the "local" namespace. Your docker inspect output shows the image already carries the local/hadoop-ubuntu:latest tag on the node where you ran it. Can you set YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=local/hadoop-ubuntu:latest and see if that resolves the issue?
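To illustrate why the unprefixed name is rejected, here is a rough sketch of the prefix comparison. It is only an approximation for explanation, not the actual container-executor code, and the helper names image_registry and is_trusted are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only; not part of Hadoop or Docker.

# Print the part of an image name before the first "/", or an empty
# line if the name contains no "/" at all.
image_registry() {
  case "$1" in
    */*) printf '%s\n' "${1%%/*}" ;;
    *)   printf '\n' ;;
  esac
}

# Succeed if the image's prefix appears in a comma-separated list, the
# same format used by docker.privileged-containers.registries.
is_trusted() {
  registry=$(image_registry "$1")
  [ -n "$registry" ] || return 1
  case ",$2," in
    *",$registry,"*) return 0 ;;
    *)               return 1 ;;
  esac
}

# The name from the job script has no prefix, so it does not match "local".
is_trusted "hadoop-ubuntu:latest" "local" && echo trusted || echo untrusted
# The prefixed name matches.
is_trusted "local/hadoop-ubuntu:latest" "local" && echo trusted || echo untrusted
```

The prefixed tag also has to exist on every NM host that may run the container. If a host only has the unprefixed tag, "docker tag hadoop-ubuntu:latest local/hadoop-ubuntu:latest" adds the prefixed name to the same image.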
Thanks,
-Shane

On Thu, Apr 19, 2018 at 4:59 PM, SeyyedAhmad Javadi <sjav...@cs.stonybrook.edu> wrote:

> Hi All,
>
> I am following the guide below to set up the Docker container runtime, but I
> am facing some errors that are non-trivial, at least at my level. Would you
> please comment if you have some idea about the root cause?
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/DockerContainers.html
>
> After the error log, I have provided the config files and Dockerfile, as well
> as the docker image inspect results (should it have null for Entrypoint
> and CMD?).
>
> I have three nodes, 1 RM and 2 NMs, and the default LCE works fine.
>
> ******************** submit job script
> vars="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-ubuntu,YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE=false,YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK=host"
>
> #vars="YARN_CONTAINER_RUNTIME_TYPE=default"
> hadoop jar /home/ubuntu/hadoop-3.1.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.0.jar pi -Dyarn.app.mapreduce.am.env=$vars -Dmapreduce.map.env=$vars -Dmapreduce.reduce.env=$vars 2 10
>
> ******************** AM Log in one of the nodes
> 2018-04-19 18:55:46,311 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for appattempt_1524178188987_0001_000001 (auth:SIMPLE)
> 2018-04-19 18:55:46,515 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1524178188987_0001_01_000001 by user ubuntu
> 2018-04-19 18:55:46,617 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Creating a new application reference for app application_1524178188987_0001
> 2018-04-19 18:55:46,634 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1524178188987_0001 transitioned from NEW to INITING
> 2018-04-19 18:55:46,634 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=ubuntu IP=130.245.127.176 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1524178188987_0001 CONTAINERID=container_1524178188987_0001_01_000001
> 2018-04-19 18:55:46,635 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Adding container_1524178188987_0001_01_000001 to application application_1524178188987_0001
> 2018-04-19 18:55:46,649 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1524178188987_0001 transitioned from INITING to RUNNING
> 2018-04-19 18:55:46,655 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1524178188987_0001_01_000001 transitioned from NEW to LOCALIZING
> 2018-04-19 18:55:46,655 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1524178188987_0001
> 2018-04-19 18:55:46,698 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1524178188987_0001_01_000001
> 2018-04-19 18:55:46,898 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file /tmp/hadoop-ubuntu/nm-local-dir/nmPrivate/container_1524178188987_0001_01_000001.tokens
> 2018-04-19 18:55:50,371 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1524178188987_0001_01_000001 transitioned from LOCALIZING to SCHEDULED
> 2018-04-19 18:55:50,374 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler: Starting container [container_1524178188987_0001_01_000001]
> 2018-04-19 18:55:50,479 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1524178188987_0001_01_000001 transitioned from SCHEDULED to RUNNING
> 2018-04-19 18:55:50,481 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_1524178188987_0001_01_000001
> 2018-04-19 18:55:51,842 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container container_1524178188987_0001_01_000001 succeeded
> 2018-04-19 18:55:51,844 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1524178188987_0001_01_000001 transitioned from RUNNING to EXITED_WITH_SUCCESS
> 2018-04-19 18:55:51,844 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1524178188987_0001_01_000001
> 2018-04-19 18:55:51,957 INFO org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Removing Docker container : container_1524178188987_0001_01_000001
> 2018-04-19 18:55:56,963 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Could not get pid for container_1524178188987_0001_01_000001. Waited for 5000 ms.
> 2018-04-19 18:55:56,963 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Unable to obtain pid, but docker container request detected. Attempting to reap container container_1524178188987_0001_01_000001
> 2018-04-19 18:55:59,395 INFO org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting absolute path : /tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1524178188987_0001/container_1524178188987_0001_01_000001
> 2018-04-19 18:55:59,395 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=ubuntu OPERATION=Container Finished - Succeeded TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1524178188987_0001 CONTAINERID=container_1524178188987_0001_01_000001
> 2018-04-19 18:55:59,403 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1524178188987_0001_01_000001 transitioned from EXITED_WITH_SUCCESS to DONE
> 2018-04-19 18:55:59,404 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Removing container_1524178188987_0001_01_000001 from application application_1524178188987_0001
> 2018-04-19 18:55:59,404 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for container_1524178188987_0001_01_000001
> 2018-04-19 18:55:59,405 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1524178188987_0001
> 2018-04-19 18:56:00,412 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed containers from NM context: [container_1524178188987_0001_01_000001]
> 2018-04-19 18:56:13,455 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1524178188987_0001 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP
> 2018-04-19 18:56:13,457 INFO org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting absolute path : /tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1524178188987_0001
> 2018-04-19 18:56:13,459 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_STOP for appId application_1524178188987_0001
> 2018-04-19 18:56:13,470 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1524178188987_0001 transitioned from APPLICATION_RESOURCES_CLEANINGUP to FINISHED
> 2018-04-19 18:56:13,470 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler: Scheduling Log Deletion for application: application_1524178188987_0001, with delay of 10800 seconds
>
> **********************
>
> yarn-site.xml: according to the above link
> container-executor.cfg:
>
> yarn.nodemanager.linux-container-executor.group=ubuntu
> min.user.id=0
> #feature.tc.enabled=1
> #feature.docker.enabled=1
> allowed.system.users=ubuntu
> # The configs below deal with settings for Docker
> [docker]
> module.enabled=true
> docker.privileged-containers.enabled=true
> docker.binary=/usr/bin/docker
> docker.allowed.capabilities=SYS_CHROOT,MKNOD,SETFCAP,SETPCAP,FSETID,CHOWN,AUDIT_WRITE,SETGID,NET_RAW,FOWNER,SETUID,DAC_OVERRIDE,KILL,NET_BIND_SERVICE
> #docker.allowed.devices=## comma separated list of devices that can be mounted into a container
> docker.allowed.networks=bridge,host,none
> docker.allowed.ro-mounts=/sys/fs/cgroup,/tmp/hadoop-ubuntu/nm-local-dir
> docker.privileged-containers.registries=local
> #docker.host-pid-namespace.enabled=false
> docker.allowed.rw-mounts=/home/ubuntu/hadoop-3.1.0,/home/ubuntu/hadoop-3.1.0/logs
>
>
> Dockerfile:
> FROM ubuntu:16.04
> #RUN rm /bin/sh && ln -s /bin/bash /bin/sh
> SHELL ["/bin/bash", "-c"]
>
> RUN apt-get update && \
>     apt-get upgrade -y && \
>     apt-get install -y software-properties-common && \
>     # apt-get install -y --no-install-recommends apt-utils && \
>     # apt-get install -y curl && \
>     add-apt-repository ppa:webupd8team/java -y && \
>     apt-get update && \
>     echo oracle-java7-installer shared/accepted-oracle-license-v1-1 select true | /usr/bin/debconf-set-selections && \
>     apt-get install -y oracle-java8-installer && \
>     # apt-get install -y ssh && \
>     # apt-get install -y rsync && \
>     apt-get install -y vim && \
>     apt-get clean
>
> ENV JAVA_HOME /usr/lib/jvm/java-8-oracle
> ENV PATH $PATH:$JAVA_HOME/bin
>
> # HADOOP
> ARG HADOOP_ARCHIVE=http://mirror.cc.columbia.edu/pub/software/apache/hadoop/common/hadoop-3.1.0/hadoop-3.1.0.tar.gz
>
> ENV HADOOP_HOME /usr/local/hadoop
> ENV HADOOP_COMMON_PATH /usr/local/hadoop
> ENV HADOOP_HDFS_HOME /usr/local/hadoop
> ENV HADOOP_MAPRED_HOME /usr/local/hadoop
> ENV HADOOP_YARN_HOME /usr/local/hadoop
> ENV HADOOP_CONF_DIR /usr/local/hadoop/etc/hadoop
>
> # download and extract hadoop, set JAVA_HOME in hadoop-env.sh, update path
> RUN wget $HADOOP_ARCHIVE && \
>     tar -xzf hadoop-3.1.0.tar.gz && \
>     mv hadoop-3.1.0 $HADOOP_HOME
>
> ADD rm-hadoop-config/* $HADOOP_HOME/etc/hadoop/
>
> ENV PATH $PATH:$HADOOP_COMMON_PATH/bin
>
> WORKDIR $HADOOP_COMMON_PATH
>
> # Declare user
> RUN groupadd -g 1000 ubuntu && \
>     useradd -r -u 1000 -g 1000 ubuntu
> USER ubuntu
>
>
> ~/hadoop-common$ docker inspect local/hadoop-ubuntu
> [
>     {
>         "Id": "sha256:d8335693084b5823675056b7d649b13d04a7e3c3b63688f83e9807405506b088",
>         "RepoTags": [
>             "hadoop-ubuntu:latest",
>             "local/hadoop-ubuntu:latest"
>         ],
>         "RepoDigests": [],
>         "Parent": "sha256:f9b92fa15eadd74b9e3712ee379b56bc30a73492db2fd4e7b7f4f1d74e5671f2",
>         "Comment": "",
>         "Created": "2018-04-19T15:16:52.903547395Z",
>         "Container": "4fcff6fa65639fe0e2a9379a335a401a8cde0c0e1eaaa2dd98d35ced402c29e3",
>         "ContainerConfig": {
>             "Hostname": "4fcff6fa6563",
>             "Domainname": "",
>             "User": "ubuntu",
>             "AttachStdin": false,
>             "AttachStdout": false,
>             "AttachStderr": false,
>             "Tty": false,
>             "OpenStdin": false,
>             "StdinOnce": false,
>             "Env": [
>                 "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/local/hadoop/bin",
>                 "JAVA_HOME=/usr/lib/jvm/java-8-oracle",
>                 "HADOOP_HOME=/usr/local/hadoop",
>                 "HADOOP_COMMON_PATH=/usr/local/hadoop",
>                 "HADOOP_HDFS_HOME=/usr/local/hadoop",
>                 "HADOOP_MAPRED_HOME=/usr/local/hadoop",
>                 "HADOOP_YARN_HOME=/usr/local/hadoop",
>                 "HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop"
>             ],
>             "Cmd": [
>                 "/bin/bash",
>                 "-c",
>                 "#(nop) ",
>                 "USER ubuntu"
>             ],
>             "ArgsEscaped": true,
>             "Image": "sha256:f9b92fa15eadd74b9e3712ee379b56bc30a73492db2fd4e7b7f4f1d74e5671f2",
>             "Volumes": null,
>             "WorkingDir": "/usr/local/hadoop",
>             "Entrypoint": null,
>             "OnBuild": null,
>             "Labels": {},
>             "Shell": [
>                 "/bin/bash",
>                 "-c"
>             ]
>         },
>         "DockerVersion": "18.03.0-ce",
>         "Author": "",
>         "Config": {
>             "Hostname": "",
>             "Domainname": "",
>             "User": "ubuntu",
>             "AttachStdin": false,
>             "AttachStdout": false,
>             "AttachStderr": false,
>             "Tty": false,
>             "OpenStdin": false,
>             "StdinOnce": false,
>             "Env": [
>                 "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/local/hadoop/bin",
>                 "JAVA_HOME=/usr/lib/jvm/java-8-oracle",
>                 "HADOOP_HOME=/usr/local/hadoop",
>                 "HADOOP_COMMON_PATH=/usr/local/hadoop",
>                 "HADOOP_HDFS_HOME=/usr/local/hadoop",
>                 "HADOOP_MAPRED_HOME=/usr/local/hadoop",
>                 "HADOOP_YARN_HOME=/usr/local/hadoop",
>                 "HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop"
>             ],
>             "Cmd": [
>                 "/bin/bash"
>             ],
>             "ArgsEscaped": true,
>             "Image": "sha256:f9b92fa15eadd74b9e3712ee379b56bc30a73492db2fd4e7b7f4f1d74e5671f2",
>             "Volumes": null,
>             "WorkingDir": "/usr/local/hadoop",
>             "Entrypoint": null,
>             "OnBuild": null,
>             "Labels": null,
>             "Shell": [
>                 "/bin/bash",
>                 "-c"
>             ]
>         },
>         "Architecture": "amd64",
>         "Os": "linux",
>         "Size": 2058935914,
>         "VirtualSize": 2058935914,
>         "GraphDriver": {
>             "Data": null,
>             "Name": "aufs"
>         },
>         "RootFS": {
>             "Type": "layers",
>             "Layers": [
>                 "sha256:fccbfa2912f0cd6b9d13f91f288f112a2b825f3f758a4443aacb45bfc108cc74",
>                 "sha256:e1a9a6284d0d24d8194ac84b372619e75cd35a46866b74925b7274c7056561e4",
>                 "sha256:ac7299292f8b2f710d3b911c6a4e02ae8f06792e39822e097f9c4e9c2672b32d",
>                 "sha256:a5e66470b2812e91798db36eb103c1f1e135bbe167e4b2ad5ba425b8db98ee8d",
>                 "sha256:a8de0e025d94b33db3542e1e8ce58829144b30c6cd1fff057eec55b1491933c3",
>                 "sha256:7e9a788452589001d42e7995dc0583bcca1e6f7780a301066ee0d6668aaf9c91",
>                 "sha256:65fac5b99df0506e1d204d9024264105ab2fe142ea970f0eba630669dc606055",
>                 "sha256:cd0da20f97700ab8e2ec37c464fcb8864cba86672aa96e7fc40e2572228895e2",
>                 "sha256:21094366bc298af4a7e83680cf9d732ce69cb92b11a3b20005b12c15bac3e486"
>             ]
>         },
>         "Metadata": {
>             "LastTagTime": "2018-04-19T17:20:41.054733026-04:00"
>         }
>     }
> ]
>
> Best,
> Ahmad