[ 
https://issues.apache.org/jira/browse/YARN-8706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596637#comment-16596637
 ] 

Chandni Singh commented on YARN-8706:
-------------------------------------

{quote}I am not entirely sure about globally identical killing mechanism for 
all container type, is a sane approach to brute force container shutdown.
{quote}
I am not sure what you mean. NM does a graceful shutdown for all types of 
containers. It first sends a {{SIGTERM}} and then after a grace period, sends 
{{SIGKILL}}. 
The {{SIGTERM}} for docker is handled by docker stop, which has the following 
problems:
1. The grace period can be specified only in whole seconds.
2. It bundles {{SIGKILL}} with the stop. Docker first sends the {{STOPSIGNAL}} to 
the root process and then, after the grace period, sends {{SIGKILL}} to the root 
process. This is not what NM wants from a stop, and docker stop gives no option 
to NOT send {{SIGKILL}}.
The change proposed by [~ebadger] will send only the {{STOPSIGNAL}}, which 
solves our problem.
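The sequence NM wants (stop signal first, {{SIGKILL}} only after its own delay) can be sketched roughly as below. This is only an illustration, not the NM code: {{TwoPhaseStop}} and {{SignalTarget}} are hypothetical names, and the liveness check before the kill is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseStop {
    /** Hypothetical stand-in for a container's root process. */
    public interface SignalTarget {
        void signal(String name);
        boolean isAlive();
    }

    /**
     * Sends the stop signal, waits up to delayMs for the target to exit,
     * then sends KILL only if the target is still alive.
     * Returns the signals that were sent, in order.
     */
    public static List<String> stop(SignalTarget target, String stopSignal, long delayMs) {
        List<String> sent = new ArrayList<>();
        target.signal(stopSignal);
        sent.add(stopSignal);

        long deadline = System.nanoTime() + delayMs * 1_000_000L;
        while (target.isAlive() && System.nanoTime() < deadline) {
            try {
                Thread.sleep(10); // poll in small steps so a quick exit is noticed early
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag and stop waiting
                break;
            }
        }

        if (target.isAlive()) {
            target.signal("KILL");
            sent.add("KILL");
        }
        return sent;
    }
}
```

Because the delay here is in milliseconds and the kill is a separate step, neither of the two docker stop limitations listed above applies to this shape.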
{quote}10 seconds default is probably more sensible to give the container a 
chance to shutdown gracefully without causing corruption to data.
{quote}
Why is this specific to docker containers? Other types of containers may be 
dealing with data too, and if the default grace period of 250 ms is too small, 
it can be changed with the config {{NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}}. 
Maybe this should be something that the application could specify as well, but 
that is a different discussion.
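As a rough illustration of the configurable delay, here is a minimal sketch that uses plain {{java.util.Properties}} instead of the Hadoop {{Configuration}} class. The class name {{SigkillDelay}} is hypothetical, and the property key is my assumption about what {{NM_SLEEP_DELAY_BEFORE_SIGKILL_MS}} resolves to. The second method shows what happens when a millisecond delay has to be handed to a tool that only accepts whole seconds.

```java
import java.util.Properties;

public class SigkillDelay {
    // Assumed property key behind NM_SLEEP_DELAY_BEFORE_SIGKILL_MS.
    static final String KEY = "yarn.nodemanager.sleep-delay-before-sigkill.ms";
    // Default grace period mentioned in this discussion.
    static final long DEFAULT_MS = 250L;

    /** Returns the configured delay in milliseconds, or the default when unset. */
    public static long delayMs(Properties conf) {
        String raw = conf.getProperty(KEY);
        return raw == null ? DEFAULT_MS : Long.parseLong(raw.trim());
    }

    /** Rounds a millisecond delay up to whole seconds, with a floor of 1 second. */
    public static long wholeSeconds(long delayMs) {
        return Math.max(1L, (delayMs + 999L) / 1000L);
    }
}
```

With the default of 250 ms, {{wholeSeconds}} yields 1, so a seconds-only grace period would be four times longer than what NM is configured to wait.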

> DelayedProcessKiller is executed for Docker containers even though docker 
> stop sends a KILL signal after the specified grace period
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-8706
>                 URL: https://issues.apache.org/jira/browse/YARN-8706
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Chandni Singh
>            Assignee: Chandni Singh
>            Priority: Major
>              Labels: docker
>
> {{DockerStopCommand}} adds a grace period of 10 seconds.
> 10 seconds is also the default grace time used by docker stop
>  [https://docs.docker.com/engine/reference/commandline/stop/]
> Documentation of the docker stop:
> {quote}the main process inside the container will receive {{SIGTERM}}, and 
> after a grace period, {{SIGKILL}}.
> {quote}
> There is a {{DelayedProcessKiller}} in {{ContainerExecutor}} which executes 
> for all containers after a delay when {{sleepDelayBeforeSigKill>0}}. By 
> default this is set to {{250 milliseconds}}, so it always gets executed, 
> irrespective of the container type.
>  
> For a docker container, {{docker stop}} takes care of sending a {{SIGKILL}} 
> after the grace period:
> - when sleepDelayBeforeSigKill > 10 seconds, there is no point in 
> executing DelayedProcessKiller
> - when sleepDelayBeforeSigKill < 1 second, the grace period should be 
> the smallest value, which is 1 second, because we force a kill after that 
> delay anyway (250 ms by default)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
