[ 
https://issues.apache.org/jira/browse/YARN-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626455#comment-16626455
 ] 

Chandni Singh edited comment on YARN-7644 at 9/24/18 9:14 PM:
--------------------------------------------------------------

For {{LAUNCH_CONTAINER}}, {{RELAUNCH_CONTAINER}}, {{RECOVER_CONTAINER}}, and 
{{RECOVER_PAUSED_CONTAINER}}, the {{ContainersLauncher}} service creates a task 
and submits it to the executor, so the work is performed in a non-blocking way:
{code:java}
containerLauncher.submit(launch);
{code}
However, for {{CLEANUP_CONTAINER}}, {{CLEANUP_CONTAINER_FOR_REINIT}}, 
{{SIGNAL_CONTAINER}}, {{PAUSE_CONTAINER}}, and {{RESUME_CONTAINER}}, the actions 
are performed in a blocking way on the event-handling thread:
{code:java}
launcher.cleanupContainer();
{code}
With this Jira, I can focus on making the {{CLEANUP_CONTAINER}} and 
{{CLEANUP_CONTAINER_FOR_REINIT}} events non-blocking.

It doesn't look like the caller ({{ContainerImpl}}) waits anywhere for 
{{cleanupContainer()}} to complete synchronously. The cleanup is triggered by 
dispatching {{ContainersLauncherEventType.CLEANUP_CONTAINER}} events.
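
To illustrate the difference, here is a minimal, self-contained sketch of the two styles. It is not the actual {{ContainersLauncher}} code: the class {{CleanupSketch}}, its methods, and the {{Thread.sleep}} that stands in for a slow cleanup are all hypothetical, and the thread pool is only an assumed stand-in for the executor that launches are submitted to.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names do not come from the Hadoop code base.
public class CleanupSketch {
    // Assumed stand-in for the executor that launch tasks are submitted to.
    private final ExecutorService containerLauncher = Executors.newCachedThreadPool();
    private final AtomicInteger cleaned = new AtomicInteger();

    // Simulates a slow cleanup (for example, waiting for a container to stop).
    private void cleanupContainer(long delayMillis) {
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        cleaned.incrementAndGet();
    }

    // Blocking style: the event-handling thread performs the cleanup itself,
    // so no other event is handled until the cleanup returns.
    public void handleCleanupBlocking(long delayMillis) {
        cleanupContainer(delayMillis);
    }

    // Non-blocking style: the handler only enqueues the cleanup and returns.
    public Future<?> handleCleanupNonBlocking(long delayMillis) {
        return containerLauncher.submit(() -> cleanupContainer(delayMillis));
    }

    public int cleanedCount() {
        return cleaned.get();
    }

    // Stops accepting work and waits briefly for submitted cleanups to finish.
    public void shutdown() throws InterruptedException {
        containerLauncher.shutdown();
        containerLauncher.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        CleanupSketch sketch = new CleanupSketch();
        long start = System.nanoTime();
        for (int i = 0; i < 5; i++) {
            sketch.handleCleanupNonBlocking(200);
        }
        long dispatchMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("dispatched 5 cleanups in ~" + dispatchMillis + " ms");
        sketch.shutdown();
        System.out.println("cleanups completed: " + sketch.cleanedCount());
    }
}
```

In the non-blocking variant the handler returns as soon as the task is queued, so a slow cleanup no longer delays the events behind it. That is only safe if, as noted above, the caller does not depend on the cleanup having finished when the handler returns.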

 

cc. [~ebadger] [~jlowe]



> NM gets backed up deleting docker containers
> --------------------------------------------
>
>                 Key: YARN-7644
>                 URL: https://issues.apache.org/jira/browse/YARN-7644
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Eric Badger
>            Assignee: Chandni Singh
>            Priority: Major
>              Labels: Docker
>
> We are sending a {{docker stop}} to the docker container with a timeout of 10 
> seconds when we shut down a container. If the container does not stop after 
> 10 seconds then we force kill it. However, the {{docker stop}} command is a 
> blocking call. So in cases where lots of containers don't go down with the 
> initial SIGTERM, we have to wait 10+ seconds for the {{docker stop}} to 
> return. This ties up the ContainerLaunch handler and so these kill events 
> back up. It also appears to be backing up new container launches.


