[ https://issues.apache.org/jira/browse/YARN-9074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16713167#comment-16713167 ]
Eric Yang edited comment on YARN-9074 at 12/7/18 6:28 PM:
----------------------------------------------------------

[~uranus]

{quote}we reap the container twice in ContainerCleanup{quote}

This was added by a developer who reported that Docker sometimes does not remove the container. That used to happen with older versions of Docker or with broken containers. I have no opinion on whether the removal should happen only once.

{quote}we can mv _docker rm_ code after _docker stop_ code{quote}

I am having trouble reconciling this statement with the previous conversation. Delayed removal is a feature that provides a grace period for debugging the container before it is cleaned up. If docker rm runs immediately after docker stop, the debug feature cannot take effect. However, I am interested to see whether there are any corner cases that the current cleanup logic does not cover well.
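To make the ordering under discussion concrete, here is a minimal sketch, not the NodeManager code. All names in it (DockerCleanupSketch, DockerCommands, RecordingDocker, debugDelayMs) are hypothetical and exist only for illustration. It assumes a single cleanup path that runs stop, falls back to kill only if the container is still running, and schedules rm after a grace period so the stopped container remains available for debugging.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: these names are illustrative and are not YARN classes.
class DockerCleanupSketch {

  // Stand-in for whatever actually issues docker commands on the NodeManager.
  interface DockerCommands {
    void stop(String containerId);
    boolean isRunning(String containerId);
    void kill(String containerId);
    void rm(String containerId);
  }

  // Runs stop, then kill if the container survived the stop, and finally
  // schedules rm after debugDelayMs so the container can still be inspected.
  static ScheduledFuture<?> cleanup(DockerCommands docker, String containerId,
      long debugDelayMs, ScheduledExecutorService scheduler) {
    docker.stop(containerId);
    if (docker.isRunning(containerId)) {
      docker.kill(containerId);
    }
    return scheduler.schedule(
        () -> docker.rm(containerId), debugDelayMs, TimeUnit.MILLISECONDS);
  }

  // Fake implementation that only records the order of the calls it receives.
  static class RecordingDocker implements DockerCommands {
    final List<String> calls = Collections.synchronizedList(new ArrayList<>());
    private final boolean survivesStop;

    RecordingDocker(boolean survivesStop) {
      this.survivesStop = survivesStop;
    }

    public void stop(String id) { calls.add("stop " + id); }
    public boolean isRunning(String id) { return survivesStop; }
    public void kill(String id) { calls.add("kill " + id); }
    public void rm(String id) { calls.add("rm " + id); }
  }

  public static void main(String[] args) throws Exception {
    ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    try {
      RecordingDocker docker = new RecordingDocker(true);
      ScheduledFuture<?> removal = cleanup(docker, "container_1", 100, scheduler);
      System.out.println("during grace period: " + docker.calls);
      removal.get(); // block until the delayed rm has run
      System.out.println("after grace period:  " + docker.calls);
    } finally {
      scheduler.shutdown();
    }
  }
}
```

With a delay of zero this collapses to an immediate rm after stop, which is the case where the debug window described above disappears.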
> Docker container rm command should be executed after stop
> ---------------------------------------------------------
>
>                 Key: YARN-9074
>                 URL: https://issues.apache.org/jira/browse/YARN-9074
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zhaohui Xin
>            Assignee: Zhaohui Xin
>            Priority: Major
>         Attachments: YARN-9074.001.patch, image-2018-12-01-11-36-12-448.png,
> image-2018-12-01-11-38-18-191.png
>
>
> {code:java}
> @Override
> public void transition(ContainerImpl container, ContainerEvent event) {
>   container.setIsReInitializing(false);
>   // Set exit code to 0 on success
>   container.exitCode = 0;
>   // TODO: Add containerWorkDir to the deletion service.
>   if (DockerLinuxContainerRuntime.isDockerContainerRequested(
>       container.daemonConf,
>       container.getLaunchContext().getEnvironment())) {
>     removeDockerContainer(container);
>   }
>   if (clCleanupRequired) {
>     container.dispatcher.getEventHandler().handle(
>         new ContainersLauncherEvent(container,
>             ContainersLauncherEventType.CLEANUP_CONTAINER));
>   }
>   container.cleanup();
> }{code}
> Currently, when a container finishes, the NM first executes "_docker rm xxx_" to
> remove it, and this task is placed in DeletionService. See YARN-5366 for details.
> Next, the NM executes the "_docker stop_" and "_docker kill_" commands. These two
> commands are wrapped in the ContainerCleanup thread and executed by
> ContainersLauncher. See YARN-7644 for details.
> As a result, the container's cleanup is split across two threads. I think we
> should refactor this code so that the entire docker container killing process
> takes place in the ContainerCleanup thread, with "_docker rm_" executed last.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org