[ 
https://issues.apache.org/jira/browse/YARN-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643801#comment-16643801
 ] 

Chandni Singh commented on YARN-7644:
-------------------------------------

Addressed [~jlowe]'s review comments in patch 3.
 * Made {{sleepDelayBeforeSigKill}} final.

 * Changed {{ContainerCleanup}} so that it no longer accesses the variables in 
{{ContainerLaunch}} directly; added accessor methods to {{ContainerLaunch}} 
instead. My preference is to keep {{ContainerCleanup}} outside 
{{ContainerLaunch}} because:
   *# {{ContainerLauncher}} needs to be able to access {{ContainerCleanup}} to 
create an instance of this task.
   *# {{ContainerLaunch}} is already quite big (approx. 2000 lines).

 * Did not change the access modifiers of {{pidFilePath}} and 
{{containerAlreadyLaunched}} in {{ContainerLaunch}}, because its subclasses 
({{ContainerRelaunch}}, {{RecoveredContainerLaunch}}, and 
{{RecoveredPausedContainerLaunch}}) access them directly.

 * Created https://issues.apache.org/jira/browse/YARN-8861 to rename the 
{{executorLock}} variable in {{ContainerLaunch}}.
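
To illustrate the separation described above, here is a minimal sketch in which a standalone cleanup task reads launch state only through accessor methods, while the fields stay visible to subclasses. {{LaunchSketch}}, {{CleanupSketch}}, the accessor names, and the field types are simplified stand-ins assumed for illustration; they are not the classes or signatures in the patch.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified stand-in for a launch class. All names and types are assumptions.
class LaunchSketch {
  // Left protected so that subclasses can keep reading these fields directly.
  protected final AtomicBoolean containerAlreadyLaunched = new AtomicBoolean(false);
  protected String pidFilePath;

  // Hypothetical accessors intended for the cleanup task.
  boolean isLaunched() {
    return containerAlreadyLaunched.get();
  }

  String getPidFilePath() {
    return pidFilePath;
  }

  // Records that the container process was started and where its pid file is.
  void markLaunched(String pidFile) {
    this.pidFilePath = pidFile;
    containerAlreadyLaunched.set(true);
  }
}

// Standalone cleanup task: a launcher can construct it from a launch instance
// without the cleanup logic living inside the (already large) launch class.
class CleanupSketch implements Runnable {
  private final LaunchSketch launch;
  private String result = "not run";

  CleanupSketch(LaunchSketch launch) {
    this.launch = launch;
  }

  @Override
  public void run() {
    // Only the accessors are used here, never the fields themselves.
    if (!launch.isLaunched()) {
      result = "nothing to clean up";
      return;
    }
    result = "would signal process recorded in " + launch.getPidFilePath();
  }

  String result() {
    return result;
  }
}
```

The sketch only reports what it would do; a real cleanup task would signal the process and remove the pid file.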



> NM gets backed up deleting docker containers
> --------------------------------------------
>
>                 Key: YARN-7644
>                 URL: https://issues.apache.org/jira/browse/YARN-7644
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Eric Badger
>            Assignee: Chandni Singh
>            Priority: Major
>              Labels: Docker
>         Attachments: YARN-7644.001.patch, YARN-7644.002.patch, 
> YARN-7644.003.patch
>
>
> We are sending a {{docker stop}} to the docker container with a timeout of 10 
> seconds when we shut down a container. If the container does not stop after 
> 10 seconds then we force kill it. However, the {{docker stop}} command is a 
> blocking call. So in cases where lots of containers don't go down with the 
> initial SIGTERM, we have to wait 10+ seconds for the {{docker stop}} to 
> return. This ties up the ContainerLaunch handler, so these kill events 
> back up. It also appears to back up new container launches. 
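
The back-up described in the issue can be reproduced in miniature. The sketch below compares a handler that runs a blocking stop inline with one that hands each stop to a separate thread pool. {{CleanupOffloadSketch}} and its methods are hypothetical, and {{blockingStop}} merely sleeps in place of a real {{docker stop}} call; this is an illustration of the general technique, not the change made in the patch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these names come from the YARN code base.
final class CleanupOffloadSketch {

  // Stand-in for a blocking stop command: sleeps for the given time.
  static void blockingStop(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  // Runs every stop on the calling (handler) thread.
  // Returns the milliseconds the handler was tied up.
  static long handleInline(int containers, long stopMillis) {
    long start = System.nanoTime();
    for (int i = 0; i < containers; i++) {
      blockingStop(stopMillis);
    }
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
  }

  // Queues every stop on a separate pool and returns immediately.
  // Returns the milliseconds the handler spent queuing the tasks.
  static long handleOffloaded(int containers, long stopMillis, ExecutorService pool) {
    long start = System.nanoTime();
    for (int i = 0; i < containers; i++) {
      pool.execute(() -> blockingStop(stopMillis));
    }
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
  }
}
```

With the inline variant the handler is occupied for roughly the number of containers times the stop duration, which is the pattern that delays later kill and launch events. With the offloaded variant the handler only pays the cost of queuing, at the price of needing a pool that is sized and shut down appropriately.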



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
