[jira] [Commented] (YARN-4759) Fix signal handling for docker containers

2017-09-08 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16158542#comment-16158542
 ] 

Shane Kumpf commented on YARN-4759:
---

Thanks for the follow-up, [~ebadger].

{quote}
I thought that you had decided that we didn't need to worry about this in your 
comment above?
{quote}

I'm actually saying the opposite. My initial thought was to let the user tell 
YARN the stop/kill signal when submitting the job. However, after more 
research I found STOPSIGNAL, which means YARN doesn't need to handle this 
explicitly: the user can define the necessary signal via the Dockerfile. This 
depends on using {{docker stop}}, though.
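
To make the STOPSIGNAL point concrete, here is a minimal sketch (not part of any patch on this issue) that reports which signal {{docker stop}} would send for an image built from a given single-stage Dockerfile. The function name and the sample Dockerfile contents are hypothetical, and the parsing is deliberately simplified: it ignores line continuations and multi-stage builds. It relies only on documented Docker behaviour, namely that {{docker stop}} sends SIGTERM unless the image defines STOPSIGNAL.

```python
import re

# Signal `docker stop` sends when the image defines no STOPSIGNAL.
DEFAULT_STOP_SIGNAL = "SIGTERM"


def stop_signal_from_dockerfile(dockerfile_text: str) -> str:
    """Return the signal `docker stop` would send for this Dockerfile.

    Simplified, hypothetical helper: the last STOPSIGNAL instruction wins,
    and comment lines are ignored because they do not start with the keyword.
    """
    signal = DEFAULT_STOP_SIGNAL
    for line in dockerfile_text.splitlines():
        match = re.match(r"^\s*STOPSIGNAL\s+(\S+)\s*$", line, flags=re.IGNORECASE)
        if match:
            signal = match.group(1)
    return signal


# Hypothetical Dockerfile for a service that shuts down cleanly on SIGQUIT.
example = """\
FROM centos:7
STOPSIGNAL SIGQUIT
CMD ["/usr/bin/my-service"]
"""

print(stop_signal_from_dockerfile(example))  # prints SIGQUIT
```

In practice the effective value can also be read from the image or container configuration with {{docker inspect}}, so parsing the Dockerfile is only for illustration.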

{quote}
How does docker stop solve the issue here? If the container doesn't exist yet, 
then docker stop will fail with "No such container" and stop trying. The 
documentation isn't very informative, but it doesn't appear to wait the grace 
period for the SIGKILL if it can't find the container in the first place.
{quote}

Sorry, I wasn't very clear before; I'm referring to a different situation. The 
container can exist, but the process inside the container may not be fully 
started and/or Docker may not yet have written the PID to the data structure 
used by {{docker inspect}}. We use {{docker run}}, which does a 
{{docker create}} and {{docker start}} behind the scenes. If the image doesn't 
exist locally, it is implicitly pulled during that time as well. You will often 
find that the Created and StartedAt times in {{docker inspect}} differ wildly 
due to additional background operations. I will concede that {{docker stop}} is 
less necessary here, as a container still in the Created state can be removed 
with {{docker rm}} (well, most of the time, but that's another discussion). 
However, the docker client is decoupled from YARN, so races can occur and 
containers can be leaked; {{docker stop}} may still be useful in case the 
container has transitioned to running while we attempt to obtain the PID, etc.
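
The cases described above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the logic in DockerLinuxContainerRuntime: the function name is hypothetical, the inputs stand for the {{.State.Status}} and {{.State.Pid}} fields reported by {{docker inspect}}, and the "running but no PID yet" branch models the race discussed in this comment rather than a documented Docker state.

```python
from typing import List


def choose_cleanup_command(container: str, status: str, pid: int) -> List[str]:
    """Pick a cleanup command from the container state (hypothetical helper).

    `status` and `pid` are assumed to come from the `.State.Status` and
    `.State.Pid` fields of `docker inspect`.
    """
    if status == "created":
        # Never started, so there is no process to signal: remove it.
        return ["docker", "rm", container]
    if status == "running" and pid > 0:
        # The PID is known, so PID-based signalling is possible.
        return ["kill", "-TERM", str(pid)]
    if status == "running":
        # Running, but the PID is not visible yet: let Docker deliver the
        # stop signal instead of guessing a PID.
        return ["docker", "stop", container]
    # Exited, dead, etc.: nothing left to signal.
    return []


print(choose_cleanup_command("container_01", "running", 0))
```

Falling back to {{docker stop}} in the third branch is what keeps a container from being leaked when it starts running between the inspect call and the signal.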

> Fix signal handling for docker containers
> -
>
> Key: YARN-4759
> URL: https://issues.apache.org/jira/browse/YARN-4759
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Shane Kumpf
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: YARN-4759.001.patch, YARN-4759.002.patch, 
> YARN-4759.003.patch
>
>
> The current signal handling (in the DockerContainerRuntime) needs to be 
> revisited for docker containers. For example, container reacquisition on NM 
> restart might not work, depending on which user the process in the container 
> runs as. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4759) Fix signal handling for docker containers

2017-09-07 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157776#comment-16157776
 ] 

Eric Badger commented on YARN-4759:
---

{quote}
There are two reasons why it would be good to continue using docker 
stop/docker kill. Dockerfile supports the STOPSIGNAL directive. If a particular 
signal is needed to gracefully stop the process, the user can define the signal 
via the Dockerfile that is sent when docker stop is called.
{quote}

I thought that you had decided that we didn't need to worry about this in your 
[comment|https://issues.apache.org/jira/browse/YARN-4759?focusedCommentId=15241924&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15241924]
 above? 

{quote}
During that time the PID may not be available via docker inspect, so pid 
based signalling breaks down and may leave "leaked" running containers. We need 
to use docker stop in that case
{quote}

How does {{docker stop}} solve the issue here? If the container doesn't exist 
yet, then docker stop will fail with "No such container" and stop trying. The 
documentation isn't very informative, but it doesn't appear to wait the grace 
period for the SIGKILL if it can't find the container in the first place. 




[jira] [Commented] (YARN-4759) Fix signal handling for docker containers

2017-09-05 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154031#comment-16154031
 ] 

Shane Kumpf commented on YARN-4759:
---

[~ebadger] - There are two reasons why it would be good to continue using 
{{docker stop}}/{{docker kill}}. Dockerfile supports the STOPSIGNAL directive. 
If a particular signal is needed to gracefully stop the process, the user can 
define the signal via the Dockerfile that is sent when {{docker stop}} is 
called. The second reason is very short-lived containers. In my 
experience, ~3 seconds is the lower bound for starting up a container. During 
that time the PID may not be available via {{docker inspect}}, so pid based 
signalling breaks down and may leave "leaked" running containers. We need to 
use {{docker stop}} in that case. I agree that we need to improve the exception 
handling and I will pursue that.




[jira] [Commented] (YARN-4759) Fix signal handling for docker containers

2017-09-01 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151070#comment-16151070
 ] 

Eric Badger commented on YARN-4759:
---

[~shaneku...@gmail.com], another question (hopefully you didn't answer this one 
already in another JIRA). Is it necessary for us to use {{docker 
stop}}/{{docker kill}} to send signals to the processes within the docker 
container, specifically during shutdown? If the docker containers have already 
exited, then the {{docker stop}} command will cause an exception because the 
command failed. In the non-docker signaling case, the container-executor will 
check for whether the process still exists before it sends the signal and will 
send a specific error code back that we can safely ignore (and then log in 
DEBUG) in the event that it doesn't exist. But since we exec the {{docker stop}} 
command, we will get the return code of whatever that command gives, since we 
lose control after the exec. In the case of a container that doesn't exist, 
{{docker stop}} returns 1. This exception is spamming the NM log for me. From 
what I understand, docker will always send the signal to PID 1 (until we assume 
Docker 1.13 support, which has the {{--init}} flag to start and reap all 
processes). But this is fine, because PID 1 for docker containers is {{bash 
-c}} and bash should forward that signal along to its child process, since they 
are in the same process group.

tl;dr Can we use the same signaling code in docker that we use in non-docker so 
that we can get rid of these benign exceptions in the NM log or is there a 
reason we need to use the {{docker stop}} and {{docker kill}} commands?

cc [~vvasudev]
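
One way to keep {{docker stop}} while silencing the benign failures is to classify the command's result before raising. The sketch below is hypothetical and is not taken from any patch here: the function names and logger name are made up, and matching on the "No such container" text of the error message is an assumption that may be brittle across Docker versions. The command runner is injectable so the logic can be exercised without a Docker daemon.

```python
import logging
import subprocess
from typing import Callable, Sequence, Tuple

LOG = logging.getLogger("docker-stop-sketch")

# A runner takes an argv list and returns (exit code, stderr text).
Runner = Callable[[Sequence[str]], Tuple[int, str]]


def run_command(argv: Sequence[str]) -> Tuple[int, str]:
    """Run argv and return (exit code, stderr text)."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, universal_newlines=True)
    return proc.returncode, proc.stderr


def stop_container(container: str, runner: Runner = run_command) -> bool:
    """Return True if the container was stopped or is already gone."""
    code, stderr = runner(["docker", "stop", container])
    if code == 0:
        return True
    if "No such container" in stderr:
        # Benign: the container was already removed. Log quietly, don't raise.
        LOG.debug("Container %s no longer exists; ignoring exit code %d",
                  container, code)
        return True
    raise RuntimeError("docker stop %s failed with exit code %d: %s"
                       % (container, code, stderr.strip()))


def fake_missing(argv: Sequence[str]) -> Tuple[int, str]:
    """Stand-in runner that mimics stopping a container that is gone."""
    return 1, "Error response from daemon: No such container: c1"


print(stop_container("c1", runner=fake_missing))  # prints True
```

Any other non-zero exit code still raises, so real failures remain visible in the NM log.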




[jira] [Commented] (YARN-4759) Fix signal handling for docker containers

2017-08-29 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145819#comment-16145819
 ] 

Eric Badger commented on YARN-4759:
---

Ah ok, I didn't read far enough, since the title/summary of that JIRA talks 
only about removal of containers. Thanks!




[jira] [Commented] (YARN-4759) Fix signal handling for docker containers

2017-08-29 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145803#comment-16145803
 ] 

Shane Kumpf commented on YARN-4759:
---

[~ebadger] - I do have a draft patch for YARN-5366 that will support what you 
are asking. I expect to get that moving forward again in the very near future 
as YARN-6623 wraps up.




[jira] [Commented] (YARN-4759) Fix signal handling for docker containers

2017-08-29 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145596#comment-16145596
 ] 

Eric Badger commented on YARN-4759:
---

[~shaneku...@gmail.com], is there a way in this approach to support sending 
signals to a running container (e.g. SIGQUIT for jstack/heap dump)? From the 
code, it looks like all signals are handled as either signal 0 or "other" with 
"other" stopping the container. In the case of a jstack/heap dump we don't want 
to stop the container, just send the signal. If there's no way currently, I can 
file a JIRA and work to support this functionality.
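
For illustration, the distinction being asked for could look like the mapping below. This is a hypothetical sketch, not the behaviour of the committed patch: the function name is made up, and the choice to route signal 0 to a {{docker inspect}} liveness check is an assumption. The {{docker kill --signal}} option itself is a documented Docker CLI flag that delivers a signal without stopping the container.

```python
from typing import List


def docker_command_for_signal(container: str, signal_name: str) -> List[str]:
    """Map a requested signal to a docker CLI invocation (hypothetical)."""
    name = signal_name.upper()
    if name.startswith("SIG"):
        name = name[3:]
    if name == "0":
        # Signal 0 is a liveness probe: ask Docker instead of signalling.
        return ["docker", "inspect", "--format", "{{.State.Running}}", container]
    if name == "TERM":
        # Graceful stop: docker sends the image's stop signal
        # (SIGTERM by default), then SIGKILL after the grace period.
        return ["docker", "stop", container]
    if name == "KILL":
        # `docker kill` sends SIGKILL by default.
        return ["docker", "kill", container]
    # Anything else (e.g. QUIT for a jstack-style thread dump) is delivered
    # as-is, leaving the container running.
    return ["docker", "kill", "--signal=" + name, container]


print(docker_command_for_signal("container_01", "SIGQUIT"))
```

With a mapping like this, only TERM and KILL end the container; every other signal is passed through to the container's main process.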




[jira] [Commented] (YARN-4759) Fix signal handling for docker containers

2016-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15377002#comment-15377002
 ] 

Hudson commented on YARN-4759:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10098 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10098/])
YARN-4759. Fix signal handling for docker containers. Contributed by (vvasudev: 
rev e5e558b0a34968eaffdd243ce605ef26346c5e85)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/docker/DockerStopCommandTest.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/TestDockerContainerRuntime.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/docker/DockerStopCommand.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c


> Fix signal handling for docker containers
> -
>
> Key: YARN-4759
> URL: https://issues.apache.org/jira/browse/YARN-4759
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Shane Kumpf
> Fix For: 2.9.0
>
> Attachments: YARN-4759.001.patch, YARN-4759.002.patch, 
> YARN-4759.003.patch
>
>
> The current signal handling (in the DockerContainerRuntime) needs to be 
> revisited for docker containers. For example, container reacquisition on NM 
> restart might not work, depending on which user the process in the container 
> runs as. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4759) Fix signal handling for docker containers

2016-07-14 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376959#comment-15376959
 ] 

Varun Vasudev commented on YARN-4759:
-

+1 for the latest patch. Committing this.

> Fix signal handling for docker containers
> -
>
> Key: YARN-4759
> URL: https://issues.apache.org/jira/browse/YARN-4759
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Sidharta Seethana
>Assignee: Shane Kumpf
> Attachments: YARN-4759.001.patch, YARN-4759.002.patch, 
> YARN-4759.003.patch
>
>
> The current signal handling (in the DockerContainerRuntime) needs to be 
> revisited for docker containers. For example, container reacquisition on NM 
> restart might not work, depending on which user the process in the container 
> runs as. 






[jira] [Commented] (YARN-4759) Fix signal handling for docker containers

2016-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376898#comment-15376898
 ] 

Hadoop QA commented on YARN-4759:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 6s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 45s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 3 new + 18 unchanged - 0 fixed = 21 total (was 18) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 6s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 40s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817941/YARN-4759.003.patch |
| JIRA Issue | YARN-4759 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc |
| uname | Linux 937eafa1114e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / be26c1b |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/12327/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12327/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12327/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (YARN-4759) Fix signal handling for docker containers

2016-07-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376863#comment-15376863
 ] 

Hadoop QA commented on YARN-4759:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 3s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 3 new + 18 unchanged - 0 fixed = 21 total (was 18) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 59s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 34s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestContainerManagerRegression |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12817935/YARN-4759.003.patch |
| JIRA Issue | YARN-4759 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc |
| uname | Linux d4a5d4345b79 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / be26c1b |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/12326/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/12326/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/12326/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
|  Test Results | 

[jira] [Commented] (YARN-4759) Fix signal handling for docker containers

2016-07-14 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376852#comment-15376852
 ] 

Shane Kumpf commented on YARN-4759:
---

The Jenkins slave failed again. I will reattach the patch once more.

{code}


  maven site verification: trunk

cd /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
mvn -Dmaven.repo.local=/home/jenkins/yetus-m2/hadoop-trunk-patch-1 -Ptest-patch clean site site:stage > /testptch/hadoop/patchprocess/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt 2>&1
Slave went offline during the build
ERROR: Connection was broken: java.io.IOException: Sorry, this connection is closed.
    at com.trilead.ssh2.transport.TransportManager.ensureConnected(TransportManager.java:587)
    at com.trilead.ssh2.transport.TransportManager.sendMessage(TransportManager.java:660)
    at com.trilead.ssh2.channel.Channel.freeupWindow(Channel.java:407)
    at com.trilead.ssh2.channel.Channel.freeupWindow(Channel.java:347)
    at com.trilead.ssh2.channel.ChannelManager.getChannelData(ChannelManager.java:943)
    at com.trilead.ssh2.channel.ChannelInputStream.read(ChannelInputStream.java:58)
    at com.trilead.ssh2.channel.ChannelInputStream.read(ChannelInputStream.java:79)
    at hudson.remoting.FlightRecorderInputStream.read(FlightRecorderInputStream.java:82)
    at hudson.remoting.ChunkedInputStream.readHeader(ChunkedInputStream.java:72)
    at hudson.remoting.ChunkedInputStream.readUntilBreak(ChunkedInputStream.java:103)
    at hudson.remoting.ChunkedCommandTransport.readBlock(ChunkedCommandTransport.java:39)
    at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
Caused by: java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
    at com.trilead.ssh2.crypto.cipher.CipherOutputStream.flush(CipherOutputStream.java:75)
    at com.trilead.ssh2.transport.TransportConnection.sendMessage(TransportConnection.java:193)
    at com.trilead.ssh2.transport.TransportConnection.sendMessage(TransportConnection.java:107)
    at com.trilead.ssh2.transport.TransportManager.sendMessage(TransportManager.java:677)
    at com.trilead.ssh2.transport.TransportManager$AsynchronousWorker.run(TransportManager.java:115)
{code}



[jira] [Commented] (YARN-4759) Fix signal handling for docker containers

2016-07-14 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376830#comment-15376830
 ] 

Shane Kumpf commented on YARN-4759:
---

It appears the Jenkins slave failed during the precommit job. I also see only 
the two previously called-out checkstyle issues when running locally, not the 
three shown above. Reattaching the same patch to rerun the Jenkins job.
