[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9
[ https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517382#comment-16517382 ]

Hudson commented on HADOOP-15527:

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14450 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14450/])
HADOOP-15527. Improve delay check for stopping processes. (eyang: rev 2d87592fc6a56bfe77dd3c11953caea2b701c846)
* (delete) hadoop-common/src/test/scripts/process_with_sigterm_trap.sh
* (add) hadoop-common-project/hadoop-common/src/test/scripts/process_with_sigterm_trap.sh

> loop until TIMEOUT before sending kill -9
> -----------------------------------------
>
>                 Key: HADOOP-15527
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15527
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Vinod Kumar Vavilapalli
>            Priority: Major
>             Fix For: 3.2.0, 3.1.1
>
>         Attachments: HADOOP-15527.1.txt, HADOOP-15527.2.txt, HADOOP-15527.txt
>
>
> I'm seeing that sometimes daemons keep running for a little while even after
> "kill -9" from daemon-stop scripts.
> Debugging more, I see several instances of "ERROR: Unable to kill ${pid}".
> Saw this specifically with ResourceManager & NodeManager - {{yarn --daemon
> stop nodemanager}}. Though it is possible that other daemons may run into
> this too.
> Saw this on both Centos as well as Ubuntu.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
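The behaviour the issue title asks for (poll until a timeout before escalating to "kill -9", then poll again afterwards) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the HADOOP-15527 patch: the function names wait_for_exit and stop_with_timeout, the one-second poll interval, and the default timeout of 5 seconds are all hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not taken from the Hadoop scripts.

# Wait up to <timeout> seconds for <pid> to disappear.
# Returns 0 once the process is gone, 1 if it is still alive at the deadline.
wait_for_exit() {
  local pid=$1
  local timeout=$2
  local waited=0

  # kill -0 sends no signal; it only checks whether the pid can be signalled.
  while kill -0 "${pid}" 2>/dev/null; do
    if [ "${waited}" -ge "${timeout}" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Usage: stop_with_timeout <pid> [timeout_seconds]
# Returns 0 if the process is gone, 1 if it survived even SIGKILL.
stop_with_timeout() {
  local pid=$1
  local timeout=${2:-5}   # assumed default, not a Hadoop setting

  # Step 1: ask politely, then poll until the timeout instead of sleeping once.
  kill -TERM "${pid}" 2>/dev/null
  if wait_for_exit "${pid}" "${timeout}"; then
    return 0
  fi

  # Step 2: SIGTERM was ignored, so force the issue. Poll again, because the
  # process may linger for a moment even after SIGKILL has been sent.
  kill -9 "${pid}" 2>/dev/null
  if wait_for_exit "${pid}" "${timeout}"; then
    return 0
  fi

  echo "ERROR: Unable to kill ${pid}" >&2
  return 1
}
```

A caller would run something like `stop_with_timeout "$(cat /path/to/daemon.pid)" 10`, where the pid file path is a placeholder. Note that `kill -0` also fails when the caller lacks permission to signal the process, so this sketch assumes the daemon runs as the same user as the stop script.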
[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9
[ https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517359#comment-16517359 ]

Eric Yang commented on HADOOP-15527:

[~xiaochen] Good catch, sorry about the wrong location. This has been fixed in trunk and branch-3.1.
[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9
[ https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517350#comment-16517350 ]

Xiao Chen commented on HADOOP-15527:

Hi [~eyang],

It turns out the fix is not in the correct place. Could you re-fix it into hadoop-common-project?
{noformat}
$ ls -R hadoop-common
src

hadoop-common/src:
test

hadoop-common/src/test:
scripts

hadoop-common/src/test/scripts:
process_with_sigterm_trap.sh
{noformat}
[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9
[ https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516364#comment-16516364 ]

Vinod Kumar Vavilapalli commented on HADOOP-15527:

bq. FYI: this patch is a bit fragile due to some assumptions made about the environment.

[~aw], I did try to minimize problems like that to the best of my knowledge, but if you can point out the specific issues, I can fix them.
[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9
[ https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516344#comment-16516344 ]

Allen Wittenauer commented on HADOOP-15527:

FYI: this patch is a bit fragile due to some assumptions made about the environment.
[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9
[ https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516146#comment-16516146 ]

Akira Ajisaka commented on HADOOP-15527:

Thanks! This additional commit will probably fix the bats test failure.
[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9
[ https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516130#comment-16516130 ]

Eric Yang commented on HADOOP-15527:

[~ajisakaa] Thanks for catching the mistake. process_with_sigterm_trap.sh has been committed to trunk and branch-3.1.
[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9
[ https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516128#comment-16516128 ]

Hudson commented on HADOOP-15527:

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #14447 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14447/])
HADOOP-15527. Improve delay check for stopping processes. (eyang: rev 2c87ec5affefeb1dc794c4eaae685a4e544f1841)
* (add) hadoop-common/src/test/scripts/process_with_sigterm_trap.sh
[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9
[ https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516070#comment-16516070 ]

Akira Ajisaka commented on HADOOP-15527:

Hi [~eyang], process_with_sigterm_trap.sh is missing in the commit. Would you fix it?