[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9

2018-06-19 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517382#comment-16517382
 ] 

Hudson commented on HADOOP-15527:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14450 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14450/])
HADOOP-15527.  Improve delay check for stopping processes.   
(eyang: rev 2d87592fc6a56bfe77dd3c11953caea2b701c846)
* (delete) hadoop-common/src/test/scripts/process_with_sigterm_trap.sh
* (add) 
hadoop-common-project/hadoop-common/src/test/scripts/process_with_sigterm_trap.sh


> loop until TIMEOUT before sending kill -9
> -
>
> Key: HADOOP-15527
> URL: https://issues.apache.org/jira/browse/HADOOP-15527
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Vinod Kumar Vavilapalli
> Priority: Major
> Fix For: 3.2.0, 3.1.1
>
> Attachments: HADOOP-15527.1.txt, HADOOP-15527.2.txt, HADOOP-15527.txt
>
>
> I'm seeing that daemons sometimes keep running for a little while even after
> "kill -9" from the daemon-stop scripts.
> Debugging further, I see several instances of "ERROR: Unable to kill ${pid}".
> Saw this specifically with ResourceManager & NodeManager, e.g. {{yarn --daemon
> stop nodemanager}}, though other daemons may run into this too.
> Saw this on both CentOS and Ubuntu.
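
For readers following the thread, the idea in the issue title (poll until a timeout instead of checking once) can be sketched roughly as below. This is an illustration only, not the committed patch: the function names stop_process and wait_for_exit, the one-second poll interval, and the default timeout of 5 seconds are all assumptions. It also assumes the caller owns the target process, since "kill -0" fails for processes the caller cannot signal.

```shell
#!/usr/bin/env bash
# Sketch only: stop_process and wait_for_exit are hypothetical names,
# not the functions changed by the HADOOP-15527 patch.

# Poll once per second until the process is gone or the timeout expires.
# Returns 0 if the process has exited, 1 if it is still present.
wait_for_exit() {
  local pid=$1
  local timeout=$2
  local waited=0

  while kill -0 "${pid}" 2>/dev/null; do
    if [ "${waited}" -ge "${timeout}" ]; then
      return 1
    fi
    sleep 1
    waited=$(( waited + 1 ))
  done
  return 0
}

# Send SIGTERM and wait up to the timeout; if the process is still there,
# escalate to SIGKILL and wait again before reporting failure.
stop_process() {
  local pid=$1
  local timeout=${2:-5}

  kill "${pid}" 2>/dev/null
  if wait_for_exit "${pid}" "${timeout}"; then
    return 0
  fi

  kill -9 "${pid}" 2>/dev/null
  if wait_for_exit "${pid}" "${timeout}"; then
    return 0
  fi

  echo "ERROR: Unable to kill ${pid}" >&2
  return 1
}
```

With this shape, a call such as stop_process "${pid}" 10 only reports "Unable to kill" after the process has had the full timeout to disappear following "kill -9", rather than after a single immediate check.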



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9

2018-06-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517359#comment-16517359
 ] 

Eric Yang commented on HADOOP-15527:


[~xiaochen] Good catch, sorry about the wrong location.  This has been fixed in 
trunk and branch-3.1.




[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9

2018-06-19 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517350#comment-16517350
 ] 

Xiao Chen commented on HADOOP-15527:


Hi [~eyang],

It turns out the fix was committed in the wrong place. Could you re-commit it
under hadoop-common-project?
{noformat}
$ ls -R hadoop-common
src

hadoop-common/src:
test

hadoop-common/src/test:
scripts

hadoop-common/src/test/scripts:
process_with_sigterm_trap.sh
{noformat}




[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9

2018-06-18 Thread Vinod Kumar Vavilapalli (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516364#comment-16516364
 ] 

Vinod Kumar Vavilapalli commented on HADOOP-15527:
--

bq. FYI: this patch is a bit fragile due to some assumptions made about the 
environment.
[~aw], I did try to minimize problems like that to the best of my knowledge,
but if you can point out the specific issues, I can fix them.




[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9

2018-06-18 Thread Allen Wittenauer (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516344#comment-16516344
 ] 

Allen Wittenauer commented on HADOOP-15527:
---

FYI: this patch is a bit fragile due to some assumptions made about the 
environment.




[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9

2018-06-18 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516146#comment-16516146
 ] 

Akira Ajisaka commented on HADOOP-15527:


Thanks! This additional commit will probably fix the bats test failure.




[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9

2018-06-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516130#comment-16516130
 ] 

Eric Yang commented on HADOOP-15527:


[~ajisakaa] Thanks for catching the mistake.  process_with_sigterm_trap.sh has
been committed to trunk and branch-3.1.




[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9

2018-06-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516128#comment-16516128
 ] 

Hudson commented on HADOOP-15527:
-

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #14447 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14447/])
HADOOP-15527.  Improve delay check for stopping processes.   
(eyang: rev 2c87ec5affefeb1dc794c4eaae685a4e544f1841)
* (add) hadoop-common/src/test/scripts/process_with_sigterm_trap.sh





[jira] [Commented] (HADOOP-15527) loop until TIMEOUT before sending kill -9

2018-06-18 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516070#comment-16516070
 ] 

Akira Ajisaka commented on HADOOP-15527:


Hi [~eyang], process_with_sigterm_trap.sh is missing in the commit. Would you 
fix it?
