[ https://issues.apache.org/jira/browse/YARN-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300927#comment-15300927 ]
Hudson commented on YARN-4459:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #9861 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9861/])
YARN-4459. container-executor should only kill process groups. (jlowe: rev 1ba31fe9e906dbd093afd4b254216601967a4a7b)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c

> container-executor should only kill process groups
> --------------------------------------------------
>
>                 Key: YARN-4459
>                 URL: https://issues.apache.org/jira/browse/YARN-4459
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Jun Gong
>            Assignee: Jun Gong
>         Attachments: YARN-4459.01.patch, YARN-4459.02.patch,
>                      YARN-4459.03.patch
>
>
> When 'signal_container_as_user' is called in container-executor, it first
> checks whether the process group exists; if it does not, it kills the
> process itself (if that process exists). This is not reasonable: if the
> process group does not exist, the corresponding container has already
> finished, so killing the process itself just kills the wrong process.
> We have seen this happen in our cluster many times. We used the same
> account to start the NM and to submit the app, and container-executor
> sometimes killed the NM (the wrongly killed process might just be a newly
> started thread that was the NM's child process).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)