[
https://issues.apache.org/jira/browse/YARN-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560607#comment-14560607
]
Varun Vasudev commented on YARN-3678:
-------------------------------------
[~zhiguohong] thanks for the detailed explanation! When you say your fix
reduced the rate to nearly zero, do you know why the accidental kill continued
to happen?
> DelayedProcessKiller may kill other process other than container
> ----------------------------------------------------------------
>
> Key: YARN-3678
> URL: https://issues.apache.org/jira/browse/YARN-3678
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.6.0
> Reporter: gu-chi
> Priority: Critical
>
> Suppose a container finishes and then does its cleanup. The PID file still
> exists and will trigger signalContainer once, which kills the process whose
> PID is in the PID file. But because the container has already finished, that
> PID may have been reused by another process, which can cause serious issues.
> As far as I know, my NM was killed unexpectedly, and what I described could
> be the cause, even though it occurs rarely.
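
One way to picture a guard against the PID-reuse race described above is to
record the process start time at launch and compare it again just before the
delayed kill. The sketch below is only an illustration and is not the
NodeManager's actual code or the fix attached to this issue: PidGuard and its
methods are hypothetical names, and it assumes Java 9+ (ProcessHandle), which
is newer than the Java versions Hadoop 2.6.0 targets.

```java
import java.time.Instant;
import java.util.Optional;

// Hypothetical helper, not part of Hadoop: guards a delayed kill against PID reuse.
final class PidGuard {
    private PidGuard() {}

    // True only if the live process's start time equals the one recorded at launch.
    static boolean matches(Optional<Instant> actualStart, Instant recordedStart) {
        return actualStart.isPresent() && actualStart.get().equals(recordedStart);
    }

    // Looks up the start time of whichever process currently owns the PID, if any.
    static Optional<Instant> startTimeOf(long pid) {
        return ProcessHandle.of(pid).flatMap(h -> h.info().startInstant());
    }

    // Requests termination only if the PID still belongs to the original process.
    // Returns false, without signaling, when the process is gone, its start time
    // is unavailable, or the PID now belongs to a different process.
    static boolean killIfSame(long pid, Instant recordedStart) {
        Optional<ProcessHandle> handle = ProcessHandle.of(pid);
        if (!handle.isPresent()) {
            return false;
        }
        if (!matches(handle.get().info().startInstant(), recordedStart)) {
            return false;
        }
        return handle.get().destroy();
    }
}
```

A caller would store startTimeOf(pid) next to the PID when the container
starts, then call killIfSame(pid, recordedStart) from the delayed killer. A
small window remains between the check and the signal, so this narrows the
race rather than eliminating it.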
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)