[ https://issues.apache.org/jira/browse/YARN-8181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinod Kumar Vavilapalli resolved YARN-8181.
-------------------------------------------
    Resolution: Invalid

[~sajavadi], please see http://hadoop.apache.org/mailing_lists.html. You can send emails to u...@hadoop.apache.org. You can subscribe to the list for other related discussions.

Resolving this for now.

> Docker container run_time
> -------------------------
>
>                 Key: YARN-8181
>                 URL: https://issues.apache.org/jira/browse/YARN-8181
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Seyyed Ahmad Javadi
>            Priority: Major
>
> Hi All,
> I want to use the Docker container runtime but could not solve the problem I
> am facing. I am following the guide below, and the NM log is as follows. I do
> not see any Docker containers being created. It works when I use the default LCE.
> Please also find how I submit a job at the end.
> Do you have any guide on how I can make the Docker run_time work?
> Could you please let me know how I can use the LCE binary to make sure my
> Docker setup is correct?
> I confirmed that "docker run" works fine. I really like this developing
> feature and would like to contribute to it. Many thanks in advance.
> [https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/DockerContainers.html]
> {code:java}
> NM LOG:
> ...
> 2018-04-19 11:49:24,568 INFO SecurityLogger.org.apache.hadoop.ipc.Server:
> Auth successful for appattempt_1524151293356_0005_000001 (auth:SIMPLE)
> 2018-04-19 11:49:24,580 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
> Start request for container_1524151293356_0005_01_000001 by user ubuntu
> 2018-04-19 11:49:24,584 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
> Creating a new application reference for app application_1524151293356_0005
> 2018-04-19 11:49:24,584 INFO
> org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=ubuntu
> IP=130.245.127.176 OPERATION=Start Container Request
> TARGET=ContainerManageImpl RESULT=SUCCESS
> APPID=application_1524151293356_0005
> CONTAINERID=container_1524151293356_0005_01_000001
> 2018-04-19 11:49:24,585 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl:
> Application application_1524151293356_0005 transitioned from NEW to INITING
> 2018-04-19 11:49:24,585 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl:
> Adding container_1524151293356_0005_01_000001 to application
> application_1524151293356_0005
> 2018-04-19 11:49:24,585 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl:
> Application application_1524151293356_0005 transitioned from INITING to
> RUNNING
> 2018-04-19 11:49:24,588 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl:
> Container container_1524151293356_0005_01_000001 transitioned from NEW to
> LOCALIZING
> 2018-04-19 11:49:24,588 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got
> event CONTAINER_INIT for appId application_1524151293356_0005
> 2018-04-19 11:49:24,589 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> Created localizer for container_1524151293356_0005_01_000001
> 2018-04-19 11:49:24,616 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> Writing credentials to the nmPrivate file
> /tmp/hadoop-ubuntu/nm-local-dir/nmPrivate/container_1524151293356_0005_01_000001.tokens
> 2018-04-19 11:49:28,090 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl:
> Container container_1524151293356_0005_01_000001 transitioned from
> LOCALIZING to SCHEDULED
> 2018-04-19 11:49:28,090 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler:
> Starting container [container_1524151293356_0005_01_000001]
> 2018-04-19 11:49:28,212 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl:
> Container container_1524151293356_0005_01_000001 transitioned from SCHEDULED
> to RUNNING
> 2018-04-19 11:49:28,212 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
> Starting resource-monitoring for container_1524151293356_0005_01_000001
> 2018-04-19 11:49:29,401 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
> Container container_1524151293356_0005_01_000001 succeeded
> 2018-04-19 11:49:29,401 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl:
> Container container_1524151293356_0005_01_000001 transitioned from RUNNING
> to EXITED_WITH_SUCCESS
> 2018-04-19 11:49:29,401 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
> Cleaning up container container_1524151293356_0005_01_000001
> 2018-04-19 11:49:29,520 INFO
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Removing
> Docker container : container_1524151293356_0005_01_000001
> 2018-04-19 11:49:34,517 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
> Could not get pid for container_1524151293356_0005_01_000001. Waited for
> 5000 ms.
> 2018-04-19 11:49:34,517 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch:
> Unable to obtain pid, but docker container request detected. Attempting to
> reap container container_1524151293356_0005_01_000001
> 2018-04-19 11:49:36,927 INFO
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting
> absolute path :
> /tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1524151293356_0005/container_1524151293356_0005_01_000001
> 2018-04-19 11:49:36,928 INFO
> org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=ubuntu
> OPERATION=Container Finished - Succeeded TARGET=ContainerImpl
> RESULT=SUCCESS APPID=application_1524151293356_0005
> CONTAINERID=container_1524151293356_0005_01_000001
> 2018-04-19 11:49:36,929 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl:
> Container container_1524151293356_0005_01_000001 transitioned from
> EXITED_WITH_SUCCESS to DONE
> 2018-04-19 11:49:36,938 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl:
> Removing container_1524151293356_0005_01_000001 from application
> application_1524151293356_0005
> 2018-04-19 11:49:36,938 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
> Stopping resource-monitoring for container_1524151293356_0005_01_000001
> 2018-04-19 11:49:36,938 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got
> event CONTAINER_STOP for appId application_1524151293356_0005
> 2018-04-19 11:49:37,941 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed
> completed containers from NM context: [container_1524151293356_0005_01_000001]
> 2018-04-19 11:49:50,966 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl:
> Application application_1524151293356_0005 transitioned from RUNNING to
> APPLICATION_RESOURCES_CLEANINGUP
> 2018-04-19 11:49:50,967 INFO
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting
> absolute path :
> /tmp/hadoop-ubuntu/nm-local-dir/usercache/ubuntu/appcache/application_1524151293356_0005
> 2018-04-19 11:49:50,967 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got
> event APPLICATION_STOP for appId application_1524151293356_0005
> 2018-04-19 11:49:50,967 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl:
> Application application_1524151293356_0005 transitioned from
> APPLICATION_RESOURCES_CLEANINGUP to FINISHED
> 2018-04-19 11:49:50,967 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler:
> Scheduling Log Deletion for application: application_1524151293356_0005,
> with delay of 10800 seconds
> {code}
> {code:java}
> vars="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-ubuntu:latest"
> #vars="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-ubuntu:latest,YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE=false,YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK=host"
> #vars="YARN_CONTAINER_RUNTIME_TYPE=default"
> hadoop jar
> /home/ubuntu/hadoop-3.1.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.0.jar
> pi -Dyarn.app.mapreduce.am.env=$vars -Dmapreduce.map.env=$vars
> -Dmapreduce.reduce.env=$vars 2 10
> {code}
> {code:java}
> Number of Maps = 2
> Samples per Map = 10
> Wrote input for Map #0
> Wrote input for Map #1
> Starting Job
> 2018-04-19 11:49:22,786 INFO client.RMProxy: Connecting to ResourceManager at
> bay1-vm1/130.245.127.176:8032
> 2018-04-19 11:49:23,435 INFO mapreduce.JobResourceUploader: Disabling Erasure
> Coding for path:
> /tmp/hadoop-yarn/staging/ubuntu/.staging/job_1524151293356_0005
> 2018-04-19 11:49:23,601 INFO input.FileInputFormat: Total input files to
> process : 2
> 2018-04-19 11:49:23,756 INFO mapreduce.JobSubmitter: number of splits:2
> 2018-04-19 11:49:23,824 INFO Configuration.deprecation:
> yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead,
> use yarn.system-metrics-publisher.enabled
> 2018-04-19 11:49:24,015 INFO mapreduce.JobSubmitter: Submitting tokens for
> job: job_1524151293356_0005
> 2018-04-19 11:49:24,017 INFO mapreduce.JobSubmitter: Executing with tokens: []
> 2018-04-19 11:49:24,262 INFO conf.Configuration: resource-types.xml not found
> 2018-04-19 11:49:24,262 INFO resource.ResourceUtils: Unable to find
> 'resource-types.xml'.
> 2018-04-19 11:49:24,350 INFO impl.YarnClientImpl: Submitted application
> application_1524151293356_0005
> 2018-04-19 11:49:24,398 INFO mapreduce.Job: The url to track the job:
> http://bay1-vm1:8088/proxy/application_1524151293356_0005/
> 2018-04-19 11:49:24,399 INFO mapreduce.Job: Running job:
> job_1524151293356_0005
> 2018-04-19 11:49:50,658 INFO mapreduce.Job: Job job_1524151293356_0005
> running in uber mode : false
> 2018-04-19 11:49:50,660 INFO mapreduce.Job: map 0% reduce 0%
> 2018-04-19 11:49:50,676 INFO mapreduce.Job: Job job_1524151293356_0005 failed
> with state FAILED due to: Application application_1524151293356_0005 failed 2
> times due to AM Container for appattempt_1524151293356_0005_000002 exited
> with exitCode: 0
> Failing this attempt.Diagnostics: For more detailed output, check the
> application tracking page:
> http://bay1-vm1:8088/cluster/app/application_1524151293356_0005 Then click on
> links to logs of each attempt.
> . Failing the application.
> 2018-04-19 11:49:50,702 INFO mapreduce.Job: Counters: 0
> Job job_1524151293356_0005 failed!
> runtime in seconds: 34
> {code}
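
For anyone reading an NM log like the one quoted above, it can help to pull out just the container state transitions. In this log the AM container goes NEW, LOCALIZING, SCHEDULED, RUNNING, EXITED_WITH_SUCCESS, DONE, spending roughly one second in RUNNING. The following is only a minimal sketch in Python, assuming the log text has been saved or pasted as a string; `container_transitions` is a hypothetical helper name and is not part of Hadoop.

```python
import re

# Matches NodeManager ContainerImpl messages of the form
# "Container container_<id> transitioned from <OLD> to <NEW>".
TRANSITION = re.compile(
    r"Container\s+(container_\w+)\s+transitioned\s+from\s+(\w+)\s+to\s+(\w+)"
)


def container_transitions(log_text):
    """Return (container_id, old_state, new_state) tuples in log order."""
    # Mail archives wrap the log and prefix lines with ">" quote markers,
    # so collapse quote markers and whitespace runs into single spaces first.
    flat = re.sub(r"\s*>\s*|\s+", " ", log_text)
    return TRANSITION.findall(flat)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python transitions.py < nm.log
    for cid, old, new in container_transitions(sys.stdin.read()):
        print(f"{cid}: {old} -> {new}")
```

Application-level lines ("Application application_... transitioned from ...") are deliberately not matched, so the output shows only what happened to each container.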