[ https://issues.apache.org/jira/browse/YARN-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448353#comment-16448353 ]
Eric Badger commented on YARN-6495:
-----------------------------------

I think all of the error checking is related. It shouldn't be a big patch regardless, but I think we should write {{write_pid_to_cgroup_as_root()}} to handle different errors, and to check the docker exit code when necessary.

> check docker container's exit code when writing to cgroup task files
> --------------------------------------------------------------------
>
>                 Key: YARN-6495
>                 URL: https://issues.apache.org/jira/browse/YARN-6495
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Jaeboo Jeong
>            Assignee: Jaeboo Jeong
>            Priority: Major
>         Attachments: YARN-6495.001.patch, YARN-6495.002.patch
>
>
> If I execute a simple command like date on a docker container, the
> application fails to complete successfully.
> For example:
> {code}
> $ yarn jar
> $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
> -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker -shell_command "date" -jar
> $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar
> -num_containers 1 -timeout 3600000
> …
> 17/04/12 00:16:40 INFO distributedshell.Client: Application did finished
> unsuccessfully. YarnState=FINISHED, DSFinalStatus=FAILED. Breaking monitoring
> loop
> 17/04/12 00:16:40 ERROR distributedshell.Client: Application failed to
> complete successfully
> {code}
> The error log looks like this:
> {code}
> ...
> Failed to write pid to file
> /cgroup_parent/cpu/hadoop-yarn/container_xxxx/tasks - No such process
> ...
> {code}
> When writing the pid to cgroup tasks, container-executor doesn’t check the
> docker container’s status.
> If the container finishes very quickly, we can’t write the pid to cgroup
> tasks, and that is not a problem.
> So container-executor needs to check the docker container’s exit code while
> writing the pid to cgroup tasks.
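
The idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from either attached patch: the names {{classify_pid_write_error()}}, {{write_pid_checked()}} and the {{pid_write_result}} enum are hypothetical, and the container's state is assumed to be passed in by the caller (obtained, for instance, from docker inspect) rather than queried here. The only behaviour shown is that a "No such process" (ESRCH) failure is tolerated when the container has already exited with code 0.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical outcome codes; not taken from container-executor. */
enum pid_write_result {
  PID_WRITE_OK = 0,             /* pid was written to the tasks file */
  PID_WRITE_CONTAINER_DONE = 1, /* process is gone, container exited cleanly */
  PID_WRITE_FAILED = -1         /* any other failure */
};

/*
 * Decide how to treat a failed write to a cgroup tasks file.
 *   write_errno      errno reported by the failed open/write
 *   container_exited nonzero if the container is known to have stopped
 *   exit_code        container exit code, meaningful only if it has stopped
 * ESRCH ("No such process") is tolerated only for a clean exit.
 */
static enum pid_write_result classify_pid_write_error(int write_errno,
                                                      int container_exited,
                                                      int exit_code) {
  if (write_errno == ESRCH && container_exited && exit_code == 0) {
    return PID_WRITE_CONTAINER_DONE;
  }
  return PID_WRITE_FAILED;
}

/*
 * Write pid to tasks_path and classify any error. The container state is
 * an input here (assumption); real code would have to look it up.
 */
static enum pid_write_result write_pid_checked(const char *tasks_path,
                                               long pid,
                                               int container_exited,
                                               int exit_code) {
  assert(tasks_path != NULL);

  int fd = open(tasks_path, O_WRONLY);
  if (fd < 0) {
    return classify_pid_write_error(errno, container_exited, exit_code);
  }

  char buf[32];
  int len = snprintf(buf, sizeof(buf), "%ld", pid);
  ssize_t written = write(fd, buf, (size_t)len);
  int saved_errno = errno; /* keep errno from write() across close() */
  close(fd);

  if (written < 0) {
    return classify_pid_write_error(saved_errno, container_exited, exit_code);
  }
  if (written != (ssize_t)len) {
    return PID_WRITE_FAILED; /* short write: errno is not meaningful */
  }
  return PID_WRITE_OK;
}
```

Keeping the classification separate from the write makes the policy easy to test without a real cgroup hierarchy, and leaves room to handle other errno values differently later.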