date:20190708



[ 
https://issues.apache.org/jira/browse/YARN-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880090#comment-16880090
 ] 

Hadoop QA commented on YARN-9608:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  7m 
29s{color} | {color:red} Docker failed to build yetus/hadoop:20ca677. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9608 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12973889/YARN-9608-branch-3.001.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24346/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> DecommissioningNodesWatcher should get lists of running applications on node 
> from RMNode.
> -
>
> Key: YARN-9608
> URL: https://issues.apache.org/jira/browse/YARN-9608
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Abhishek Modi
>Assignee: Abhishek Modi
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9608-branch-3.001.patch, YARN-9608.001.patch, 
> YARN-9608.002.patch
>
>
> At present, DecommissioningNodesWatcher tracks list of running applications 
> and triggers decommission of nodes when all the applications that ran on the 
> node completes. This Jira proposes to solve following problem:
>  # DecommissioningNodesWatcher skips tracking application containers on a 
> particular node before the node is in DECOMMISSIONING state. It only tracks 
> containers once the node is in DECOMMISSIONING state. This can lead to 
> shuffle data loss of apps whose containers ran on this node before it was 
> moved to decommissioning state.
>  # It is keeping track of running apps. We can leverage this directly from 
> RMNode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Created] (YARN-9667) Container-executor.c duplicates messages to stdout

2019-07-08 Thread Adam Antal (JIRA)

Adam Antal created YARN-9667:

Summary: Container-executor.c duplicates messages to stdout
Key: YARN-9667
URL: https://issues.apache.org/jira/browse/YARN-9667
Project: Hadoop YARN
Issue Type: Improvement
Components: nodemanager, yarn
Affects Versions: 3.2.0
Reporter: Adam Antal

When a container is killed by its AM we get a similar error message like this:
{noformat}
2019-06-30 12:09:04,412 WARN
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
Shell execution returned exit code: 143. Privileged Execution Operation Stderr:

Stdout: main : command provided 1
main : run as user is systest
main : requested yarn user is systest
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file
/yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_19/container_e84_1561921629886_0001_01_19.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...
{noformat}
In container-executor.c the fork point is right after the "Creating script
paths..." part, though in the Stdout log we can clearly see it has been written
there twice. After consulting with [~pbacsko] it seems like there's a missing
flush in container-executor.c before the fork and that causes the duplication.

I suggest to add a flush there so that it won't be duplicated: it's a bit
misleading that the child process writes out "Getting exit code file" and
"Creating script paths" even though it is clearly not doing that.

A more appealing solution could be to revisit the fprintf-fflush pairs in the
code and change them to a single call, so that the fflush calls would not be
forgotten accidentally. (It can cause problems in every place where it's used).

Note: this issue probably affects every occasion of fork(), not just the one
from {{launch_container_as_user}} in {{main.c}}.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-9667) Container-executor.c duplicates messages to stdout



[ 
https://issues.apache.org/jira/browse/YARN-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880230#comment-16880230
 ] 

Peter Bacsko commented on YARN-9667:


I suggest considering two things:

1) When setting ERRORFILE or LOGFILE, just invoke {{setbuf(ERRORFILE, NULL);}} 
and {{setbuf(LOGFILE, NULL);}}. Then writing the output will be written to the 
stream immediately without buffering.

2) We should use {{_exit()}} in the child process. After doing some research, 
this is the preferred method of exiting from a child process. There are some 
exceptions, but in general this seems to be the commonly used way of 
terminating a child. Eg. if {{execvp()}} fails then just bail out, avoid any 
sort of cleanup. This avoids flusing buffers twice, plus avoids calling 
functions registered with {{atexit()}} twice, which probably we don't have 
right now, but guarding against a potential bug is always a good thing.

> Container-executor.c duplicates messages to stdout
> --
>
> Key: YARN-9667
> URL: https://issues.apache.org/jira/browse/YARN-9667
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Major
>
> When a container is killed by its AM we get a similar error message like this:
> {noformat}
> 2019-06-30 12:09:04,412 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
>  Shell execution returned exit code: 143. Privileged Execution Operation 
> Stderr:
> Stdout: main : command provided 1
> main : run as user is systest
> main : requested yarn user is systest
> Getting exit code file...
> Creating script paths...
> Writing pid file...
> Writing to tmp file 
> /yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_19/container_e84_1561921629886_0001_01_19.pid.tmp
> Writing to cgroup task files...
> Creating local dirs...
> Launching container...
> Getting exit code file...
> Creating script paths...
> {noformat}
> In container-executor.c the fork point is right after the "Creating script 
> paths..." part, though in the Stdout log we can clearly see it has been 
> written there twice. After consulting with [~pbacsko] it seems like there's a 
> missing flush in container-executor.c before the fork and that causes the 
> duplication.
> I suggest to add a flush there so that it won't be duplicated: it's a bit 
> misleading that the child process writes out "Getting exit code file" and 
> "Creating script paths" even though it is clearly not doing that.
> A more appealing solution could be to revisit the fprintf-fflush pairs in the 
> code and change them to a single call, so that the fflush calls would not be 
> forgotten accidentally. (It can cause problems in every place where it's 
> used).
> Note: this issue probably affects every occasion of fork(), not just the one 
> from {{launch_container_as_user}} in {{main.c}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-9667) Container-executor.c duplicates messages to stdout



[ 
https://issues.apache.org/jira/browse/YARN-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880234#comment-16880234
 ] 

Peter Bacsko commented on YARN-9667:


cc [~aceric] what do you think? Not sure who should be involved to discuss this.

> Container-executor.c duplicates messages to stdout
> --
>
> Key: YARN-9667
> URL: https://issues.apache.org/jira/browse/YARN-9667
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Major
>
> When a container is killed by its AM we get a similar error message like this:
> {noformat}
> 2019-06-30 12:09:04,412 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
>  Shell execution returned exit code: 143. Privileged Execution Operation 
> Stderr:
> Stdout: main : command provided 1
> main : run as user is systest
> main : requested yarn user is systest
> Getting exit code file...
> Creating script paths...
> Writing pid file...
> Writing to tmp file 
> /yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_19/container_e84_1561921629886_0001_01_19.pid.tmp
> Writing to cgroup task files...
> Creating local dirs...
> Launching container...
> Getting exit code file...
> Creating script paths...
> {noformat}
> In container-executor.c the fork point is right after the "Creating script 
> paths..." part, though in the Stdout log we can clearly see it has been 
> written there twice. After consulting with [~pbacsko] it seems like there's a 
> missing flush in container-executor.c before the fork and that causes the 
> duplication.
> I suggest to add a flush there so that it won't be duplicated: it's a bit 
> misleading that the child process writes out "Getting exit code file" and 
> "Creating script paths" even though it is clearly not doing that.
> A more appealing solution could be to revisit the fprintf-fflush pairs in the 
> code and change them to a single call, so that the fflush calls would not be 
> forgotten accidentally. (It can cause problems in every place where it's 
> used).
> Note: this issue probably affects every occasion of fork(), not just the one 
> from {{launch_container_as_user}} in {{main.c}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (YARN-9667) Container-executor.c duplicates messages to stdout



[ 
https://issues.apache.org/jira/browse/YARN-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880230#comment-16880230
 ] 

Peter Bacsko edited comment on YARN-9667 at 7/8/19 11:35 AM:
-

I suggest considering two things:

1) When setting {{ERRORFILE}} and {{LOGFILE}}, just invoke {{setbuf(ERRORFILE, 
NULL);}} and {{setbuf(LOGFILE, NULL);}}. Then writing the output will be 
written to the stream immediately without buffering.

2) We should use {{_exit()}} in the child process. After doing some research, 
this is the preferred method of exiting from a child process. There are some 
exceptions, but in general this seems to be the commonly used way of 
terminating a child. Eg. if {{execvp()}} fails then just bail out, avoid any 
sort of cleanup. This avoids flusing buffers twice, plus avoids calling 
functions registered with {{atexit()}} twice, which probably we don't have 
right now, but guarding against a potential bug is always a good thing.


was (Author: pbacsko):
I suggest considering two things:

1) When setting ERRORFILE or LOGFILE, just invoke {{setbuf(ERRORFILE, NULL);}} 
and {{setbuf(LOGFILE, NULL);}}. Then writing the output will be written to the 
stream immediately without buffering.

2) We should use {{_exit()}} in the child process. After doing some research, 
this is the preferred method of exiting from a child process. There are some 
exceptions, but in general this seems to be the commonly used way of 
terminating a child. Eg. if {{execvp()}} fails then just bail out, avoid any 
sort of cleanup. This avoids flusing buffers twice, plus avoids calling 
functions registered with {{atexit()}} twice, which probably we don't have 
right now, but guarding against a potential bug is always a good thing.

> Container-executor.c duplicates messages to stdout
> --
>
> Key: YARN-9667
> URL: https://issues.apache.org/jira/browse/YARN-9667
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Major
>
> When a container is killed by its AM we get a similar error message like this:
> {noformat}
> 2019-06-30 12:09:04,412 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
>  Shell execution returned exit code: 143. Privileged Execution Operation 
> Stderr:
> Stdout: main : command provided 1
> main : run as user is systest
> main : requested yarn user is systest
> Getting exit code file...
> Creating script paths...
> Writing pid file...
> Writing to tmp file 
> /yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_19/container_e84_1561921629886_0001_01_19.pid.tmp
> Writing to cgroup task files...
> Creating local dirs...
> Launching container...
> Getting exit code file...
> Creating script paths...
> {noformat}
> In container-executor.c the fork point is right after the "Creating script 
> paths..." part, though in the Stdout log we can clearly see it has been 
> written there twice. After consulting with [~pbacsko] it seems like there's a 
> missing flush in container-executor.c before the fork and that causes the 
> duplication.
> I suggest to add a flush there so that it won't be duplicated: it's a bit 
> misleading that the child process writes out "Getting exit code file" and 
> "Creating script paths" even though it is clearly not doing that.
> A more appealing solution could be to revisit the fprintf-fflush pairs in the 
> code and change them to a single call, so that the fflush calls would not be 
> forgotten accidentally. (It can cause problems in every place where it's 
> used).
> Note: this issue probably affects every occasion of fork(), not just the one 
> from {{launch_container_as_user}} in {{main.c}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-9660) Enhance documentation of Docker on YARN support

2019-07-08 Thread Shane Kumpf (JIRA)



[ 
https://issues.apache.org/jira/browse/YARN-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880265#comment-16880265
 ] 

Shane Kumpf commented on YARN-9660:
---

Thanks for the improvements [~pbacsko]! lgtm +1

> Enhance documentation of Docker on YARN support
> ---
>
> Key: YARN-9660
> URL: https://issues.apache.org/jira/browse/YARN-9660
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation, nodemanager
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9660-001.patch
>
>
> Right now, using Docker on YARN has some hard requirements. If these 
> requirements are not met, then launching the containers will fail and and 
> error message will be printed. Depending on how familiar the user is with 
> Docker, it might or might not be easy for them to understand what went wrong 
> and how to fix the underlying problem.
> It would be important to explicitly document these requirements along with 
> the error messages.
> *#1: CGroups handler cannot be systemd*
> If docker deamon runs with systemd cgroups handler, we receive the following 
> error upon launching a container:
> {noformat}
> Container id: container_1561638268473_0006_01_02
> Exit code: 7
> Exception message: Launch container failed
> Shell error output: /usr/bin/docker-current: Error response from daemon: 
> cgroup-parent for systemd cgroup should be a valid slice named as "xxx.slice".
> See '/usr/bin/docker-current run --help'.
> Shell output: main : command provided 4
> main : run as user is johndoe
> main : requested yarn user is johndoe
> {noformat}
> Solution: switch to cgroupfs. Doing so can be OS-specific, but we can 
> document a {{systemcl}} example.
>  
> *#2: {{/bin/bash}} must be present on the {{$PATH}} inside the container*
> Some smaller images like "busybox" or "alpine" does not have {{/bin/bash}}. 
> It's because all commands under {{/bin}} are linked to {{/bin/busybox}} and 
> there's only {{/bin/sh}}.
> If we try to use these kind of images, we'll see the following error message:
> {noformat}
> Container id: container_1561638268473_0015_01_02
> Exit code: 7
> Exception message: Launch container failed
> Shell error output: /usr/bin/docker-current: Error response from daemon: oci 
> runtime error: container_linux.go:235: starting container process caused 
> "exec: \"bash\": executable file not found in $PATH".
> Shell output: main : command provided 4
> main : run as user is johndoe
> main : requested yarn user is johndoe
> {noformat}
>  
> *#3: {{find}} command must be available on the {{$PATH}}*
> It seems obvious that we have the {{find}} command, but even very popular 
> images like {{fedora}} requires that we install it separately.
> If we don't have {{find}} available, then {{launcher_container.sh}} fails 
> with:
> {noformat}
> [2019-07-01 03:51:25.053]Container exited with a non-zero exit code 127. 
> Error file: prelaunch.err.
> Last 4096 bytes of prelaunch.err :
> /tmp/hadoop-systest/nm-local-dir/usercache/systest/appcache/application_1561638268473_0017/container_1561638268473_0017_01_02/launch_container.sh:
>  line 44: find: command not found
> Last 4096 bytes of stderr.txt :
> [2019-07-01 03:51:25.053]Container exited with a non-zero exit code 127. 
> Error file: prelaunch.err.
> Last 4096 bytes of prelaunch.err :
> /tmp/hadoop-systest/nm-local-dir/usercache/systest/appcache/application_1561638268473_0017/container_1561638268473_0017_01_02/launch_container.sh:
>  line 44: find: command not found
> Last 4096 bytes of stderr.txt :
> {noformat}
> *#4 Add cmd-line example of how to tag local images*
> This is actually documented under "Privileged Container Security 
> Consideration", but an one-liner would be helpful. I had trouble running a 
> local docker image and tagging it appropriately. Just an example like 
> {{docker tag local_ubuntu local/ubuntu:latest}} is already very informative.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-9667) Container-executor.c duplicates messages to stdout



[ 
https://issues.apache.org/jira/browse/YARN-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880280#comment-16880280
 ] 

Peter Bacsko commented on YARN-9667:


cc [~shaneku...@gmail.com] also

> Container-executor.c duplicates messages to stdout
> --
>
> Key: YARN-9667
> URL: https://issues.apache.org/jira/browse/YARN-9667
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Major
>
> When a container is killed by its AM we get a similar error message like this:
> {noformat}
> 2019-06-30 12:09:04,412 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
>  Shell execution returned exit code: 143. Privileged Execution Operation 
> Stderr:
> Stdout: main : command provided 1
> main : run as user is systest
> main : requested yarn user is systest
> Getting exit code file...
> Creating script paths...
> Writing pid file...
> Writing to tmp file 
> /yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_19/container_e84_1561921629886_0001_01_19.pid.tmp
> Writing to cgroup task files...
> Creating local dirs...
> Launching container...
> Getting exit code file...
> Creating script paths...
> {noformat}
> In container-executor.c the fork point is right after the "Creating script 
> paths..." part, though in the Stdout log we can clearly see it has been 
> written there twice. After consulting with [~pbacsko] it seems like there's a 
> missing flush in container-executor.c before the fork and that causes the 
> duplication.
> I suggest to add a flush there so that it won't be duplicated: it's a bit 
> misleading that the child process writes out "Getting exit code file" and 
> "Creating script paths" even though it is clearly not doing that.
> A more appealing solution could be to revisit the fprintf-fflush pairs in the 
> code and change them to a single call, so that the fflush calls would not be 
> forgotten accidentally. (It can cause problems in every place where it's 
> used).
> Note: this issue probably affects every occasion of fork(), not just the one 
> from {{launch_container_as_user}} in {{main.c}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-9667) Container-executor.c duplicates messages to stdout

2019-07-08 Thread Eric Yang (JIRA)



[ 
https://issues.apache.org/jira/browse/YARN-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880544#comment-16880544
 ] 

Eric Yang commented on YARN-9667:
-

[~pbacsko] There is duplicated Eric Yang in jira that is a different person.  
Please make sure that you use the correct user name.  The proposed solution is 
sounding.  I think making flush before fork is good.  We don't want to flush on 
every log entry, otherwise, it reduces performance.  Thanks

> Container-executor.c duplicates messages to stdout
> --
>
> Key: YARN-9667
> URL: https://issues.apache.org/jira/browse/YARN-9667
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Major
>
> When a container is killed by its AM we get a similar error message like this:
> {noformat}
> 2019-06-30 12:09:04,412 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
>  Shell execution returned exit code: 143. Privileged Execution Operation 
> Stderr:
> Stdout: main : command provided 1
> main : run as user is systest
> main : requested yarn user is systest
> Getting exit code file...
> Creating script paths...
> Writing pid file...
> Writing to tmp file 
> /yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_19/container_e84_1561921629886_0001_01_19.pid.tmp
> Writing to cgroup task files...
> Creating local dirs...
> Launching container...
> Getting exit code file...
> Creating script paths...
> {noformat}
> In container-executor.c the fork point is right after the "Creating script 
> paths..." part, though in the Stdout log we can clearly see it has been 
> written there twice. After consulting with [~pbacsko] it seems like there's a 
> missing flush in container-executor.c before the fork and that causes the 
> duplication.
> I suggest to add a flush there so that it won't be duplicated: it's a bit 
> misleading that the child process writes out "Getting exit code file" and 
> "Creating script paths" even though it is clearly not doing that.
> A more appealing solution could be to revisit the fprintf-fflush pairs in the 
> code and change them to a single call, so that the fflush calls would not be 
> forgotten accidentally. (It can cause problems in every place where it's 
> used).
> Note: this issue probably affects every occasion of fork(), not just the one 
> from {{launch_container_as_user}} in {{main.c}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-9667) Container-executor.c duplicates messages to stdout



[ 
https://issues.apache.org/jira/browse/YARN-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880552#comment-16880552
 ] 

Peter Bacsko commented on YARN-9667:


Oh, thanks for saying - I didn't see that it's not you.

Anyway, I recommended #1 because already have a bunch of {{fprintf}}/{{fflush}} 
pair scattered around the code. So it's repetitive code that we should get rid 
of. I don't know whether we flush at all {{fprintf}} but we do it quite often. 
I was under the impression that it's deliberate - and if so, we might as well 
disable stdio buffering completely. But it's something that we can discuss 
online.

> Container-executor.c duplicates messages to stdout
> --
>
> Key: YARN-9667
> URL: https://issues.apache.org/jira/browse/YARN-9667
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Major
>
> When a container is killed by its AM we get a similar error message like this:
> {noformat}
> 2019-06-30 12:09:04,412 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
>  Shell execution returned exit code: 143. Privileged Execution Operation 
> Stderr:
> Stdout: main : command provided 1
> main : run as user is systest
> main : requested yarn user is systest
> Getting exit code file...
> Creating script paths...
> Writing pid file...
> Writing to tmp file 
> /yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_19/container_e84_1561921629886_0001_01_19.pid.tmp
> Writing to cgroup task files...
> Creating local dirs...
> Launching container...
> Getting exit code file...
> Creating script paths...
> {noformat}
> In container-executor.c the fork point is right after the "Creating script 
> paths..." part, though in the Stdout log we can clearly see it has been 
> written there twice. After consulting with [~pbacsko] it seems like there's a 
> missing flush in container-executor.c before the fork and that causes the 
> duplication.
> I suggest to add a flush there so that it won't be duplicated: it's a bit 
> misleading that the child process writes out "Getting exit code file" and 
> "Creating script paths" even though it is clearly not doing that.
> A more appealing solution could be to revisit the fprintf-fflush pairs in the 
> code and change them to a single call, so that the fflush calls would not be 
> forgotten accidentally. (It can cause problems in every place where it's 
> used).
> Note: this issue probably affects every occasion of fork(), not just the one 
> from {{launch_container_as_user}} in {{main.c}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-9667) Container-executor.c duplicates messages to stdout



[ 
https://issues.apache.org/jira/browse/YARN-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880582#comment-16880582
 ] 

Peter Bacsko commented on YARN-9667:


I just found that this is probably even better: {{setvbuf(ERRORFILE, NULL, 
_IOLBF, BUFSIZ); }}. This makes stderr line buffered, which is probably what we 
want.

> Container-executor.c duplicates messages to stdout
> --
>
> Key: YARN-9667
> URL: https://issues.apache.org/jira/browse/YARN-9667
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Major
>
> When a container is killed by its AM we get a similar error message like this:
> {noformat}
> 2019-06-30 12:09:04,412 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
>  Shell execution returned exit code: 143. Privileged Execution Operation 
> Stderr:
> Stdout: main : command provided 1
> main : run as user is systest
> main : requested yarn user is systest
> Getting exit code file...
> Creating script paths...
> Writing pid file...
> Writing to tmp file 
> /yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_19/container_e84_1561921629886_0001_01_19.pid.tmp
> Writing to cgroup task files...
> Creating local dirs...
> Launching container...
> Getting exit code file...
> Creating script paths...
> {noformat}
> In container-executor.c the fork point is right after the "Creating script 
> paths..." part, though in the Stdout log we can clearly see it has been 
> written there twice. After consulting with [~pbacsko] it seems like there's a 
> missing flush in container-executor.c before the fork and that causes the 
> duplication.
> I suggest to add a flush there so that it won't be duplicated: it's a bit 
> misleading that the child process writes out "Getting exit code file" and 
> "Creating script paths" even though it is clearly not doing that.
> A more appealing solution could be to revisit the fprintf-fflush pairs in the 
> code and change them to a single call, so that the fflush calls would not be 
> forgotten accidentally. (It can cause problems in every place where it's 
> used).
> Note: this issue probably affects every occasion of fork(), not just the one 
> from {{launch_container_as_user}} in {{main.c}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (YARN-9667) Container-executor.c duplicates messages to stdout



[ 
https://issues.apache.org/jira/browse/YARN-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880582#comment-16880582
 ] 

Peter Bacsko edited comment on YARN-9667 at 7/8/19 5:50 PM:


I just found that this is probably even better: {{setvbuf(ERRORFILE, NULL, 
_IOLBF, BUFSIZ);}}. This makes stderr line buffered, which is probably what we 
want.


was (Author: pbacsko):
I just found that this is probably even better: {{setvbuf(ERRORFILE, NULL, 
_IOLBF, BUFSIZ); }}. This makes stderr line buffered, which is probably what we 
want.

> Container-executor.c duplicates messages to stdout
> --
>
> Key: YARN-9667
> URL: https://issues.apache.org/jira/browse/YARN-9667
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Major
>
> When a container is killed by its AM we get a similar error message like this:
> {noformat}
> 2019-06-30 12:09:04,412 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
>  Shell execution returned exit code: 143. Privileged Execution Operation 
> Stderr:
> Stdout: main : command provided 1
> main : run as user is systest
> main : requested yarn user is systest
> Getting exit code file...
> Creating script paths...
> Writing pid file...
> Writing to tmp file 
> /yarn/nm/nmPrivate/application_1561921629886_0001/container_e84_1561921629886_0001_01_19/container_e84_1561921629886_0001_01_19.pid.tmp
> Writing to cgroup task files...
> Creating local dirs...
> Launching container...
> Getting exit code file...
> Creating script paths...
> {noformat}
> In container-executor.c the fork point is right after the "Creating script 
> paths..." part, though in the Stdout log we can clearly see it has been 
> written there twice. After consulting with [~pbacsko] it seems like there's a 
> missing flush in container-executor.c before the fork and that causes the 
> duplication.
> I suggest to add a flush there so that it won't be duplicated: it's a bit 
> misleading that the child process writes out "Getting exit code file" and 
> "Creating script paths" even though it is clearly not doing that.
> A more appealing solution could be to revisit the fprintf-fflush pairs in the 
> code and change them to a single call, so that the fflush calls would not be 
> forgotten accidentally. (It can cause problems in every place where it's 
> used).
> Note: this issue probably affects every occasion of fork(), not just the one 
> from {{launch_container_as_user}} in {{main.c}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Created] (YARN-9668) UGI conf doesn't read user overridden configurations on RM and NM startup

Jonathan Hung created YARN-9668:
---

 Summary: UGI conf doesn't read user overridden configurations on 
RM and NM startup
 Key: YARN-9668
 URL: https://issues.apache.org/jira/browse/YARN-9668
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jonathan Hung
Assignee: Jonathan Hung


Similar to HADOOP-15150. Configs overridden thru e.g. -D or -conf are not 
passed to the UGI conf on RM or NM startup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Updated] (YARN-9668) UGI conf doesn't read user overridden configurations on RM and NM startup



 [ 
https://issues.apache.org/jira/browse/YARN-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-9668:

Attachment: YARN-9668.001.patch

> UGI conf doesn't read user overridden configurations on RM and NM startup
> -
>
> Key: YARN-9668
> URL: https://issues.apache.org/jira/browse/YARN-9668
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9668.001.patch
>
>
> Similar to HADOOP-15150. Configs overridden thru e.g. -D or -conf are not 
> passed to the UGI conf on RM or NM startup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Updated] (YARN-9668) UGI conf doesn't read user overridden configurations on RM and NM startup



 [ 
https://issues.apache.org/jira/browse/YARN-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-9668:

Affects Version/s: 2.10.0

> UGI conf doesn't read user overridden configurations on RM and NM startup
> -
>
> Key: YARN-9668
> URL: https://issues.apache.org/jira/browse/YARN-9668
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9668.001.patch
>
>
> Similar to HADOOP-15150. Configs overridden thru e.g. -D or -conf are not 
> passed to the UGI conf on RM or NM startup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-6492) Generate queue metrics for each partition

2019-07-08 Thread Tak Lon (Stephen) Wu (JIRA)



[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880605#comment-16880605
 ] 

Tak Lon (Stephen) Wu commented on YARN-6492:


Hey guys, is this Jira still pending in progress? in fact, I tried without and 
with this latest commit on 3.1.0 with partitions/labels, some metrics e.g. 
PendingMB PendingVCores PendingContainers will be non-zero after an (wordcount) 
application has been successfully running ([positive without this 
change|https://drive.google.com/file/d/1ScktBI2-HibT9beOL7TGJkta7efV1cXD/view?usp=sharing],
 and [mixed of negative and positive with this 
change|https://drive.google.com/file/d/1B4dqVtKcZV7568DEJUG4pRsB8sWe-n4y/view?usp=sharing])
 

I'm looking into it but want to know if anyone is seeing the same problem, and 
hopefully if this Jira continues with a right solution, above problem could be 
fixed 

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object which captures metrics either in default 
> partition or across all partitions. (After YARN-6467 it will be in default 
> partition)
> But having the partition metrics would be very useful.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-6492) Generate queue metrics for each partition



[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880613#comment-16880613
 ] 

Hadoop QA commented on YARN-6492:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  9s{color} 
| {color:red} YARN-6492 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-6492 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24348/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object which captures metrics either in default 
> partition or across all partitions. (After YARN-6467 it will be in default 
> partition)
> But having the partition metrics would be very useful.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-9668) UGI conf doesn't read user overridden configurations on RM and NM startup

2019-07-08 Thread Haibo Chen (JIRA)



[ 
https://issues.apache.org/jira/browse/YARN-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880618#comment-16880618
 ] 

Haibo Chen commented on YARN-9668:
--

Thanks for the patch, [~jhung]. I have a minor comment.

Can we use expectedException instead?  nodemanager.stop() can be wrapped in a 
finally block as such. 

> UGI conf doesn't read user overridden configurations on RM and NM startup
> -
>
> Key: YARN-9668
> URL: https://issues.apache.org/jira/browse/YARN-9668
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9668.001.patch
>
>
> Similar to HADOOP-15150. Configs overridden thru e.g. -D or -conf are not 
> passed to the UGI conf on RM or NM startup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Updated] (YARN-9668) UGI conf doesn't read user overridden configurations on RM and NM startup



 [ 
https://issues.apache.org/jira/browse/YARN-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-9668:

Attachment: YARN-9668.002.patch

> UGI conf doesn't read user overridden configurations on RM and NM startup
> -
>
> Key: YARN-9668
> URL: https://issues.apache.org/jira/browse/YARN-9668
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9668.001.patch, YARN-9668.002.patch
>
>
> Similar to HADOOP-15150. Configs overridden thru e.g. -D or -conf are not 
> passed to the UGI conf on RM or NM startup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-9668) UGI conf doesn't read user overridden configurations on RM and NM startup



[ 
https://issues.apache.org/jira/browse/YARN-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880643#comment-16880643
 ] 

Jonathan Hung commented on YARN-9668:
-

Thanks [~haibochen] makes sense, attached 002 to address this.

> UGI conf doesn't read user overridden configurations on RM and NM startup
> -
>
> Key: YARN-9668
> URL: https://issues.apache.org/jira/browse/YARN-9668
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9668.001.patch, YARN-9668.002.patch
>
>
> Similar to HADOOP-15150. Configs overridden thru e.g. -D or -conf are not 
> passed to the UGI conf on RM or NM startup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-9668) UGI conf doesn't read user overridden configurations on RM and NM startup

2019-07-08 Thread Haibo Chen (JIRA)



[ 
https://issues.apache.org/jira/browse/YARN-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880717#comment-16880717
 ] 

Haibo Chen commented on YARN-9668:
--

+1 pending Jenkins.

> UGI conf doesn't read user overridden configurations on RM and NM startup
> -
>
> Key: YARN-9668
> URL: https://issues.apache.org/jira/browse/YARN-9668
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9668.001.patch, YARN-9668.002.patch
>
>
> Similar to HADOOP-15150. Configs overridden thru e.g. -D or -conf are not 
> passed to the UGI conf on RM or NM startup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-9668) UGI conf doesn't read user overridden configurations on RM and NM startup



[ 
https://issues.apache.org/jira/browse/YARN-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880719#comment-16880719
 ] 

Hadoop QA commented on YARN-9668:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 35s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 
43s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 43s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}179m  2s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA |
|   | hadoop.yarn.server.resourcemanager.TestRMStoreCommands |
|   | hadoop.yarn.server.resourcemanager.TestRMRestart |
|   | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9668 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12973971/YARN-9668.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 008c51d721f9 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchp

[jira] [Commented] (YARN-9668) UGI conf doesn't read user overridden configurations on RM and NM startup



[ 
https://issues.apache.org/jira/browse/YARN-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880733#comment-16880733
 ] 

Hadoop QA commented on YARN-9668:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 34s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  4s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 
12s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 44s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}160m 17s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9668 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12973973/YARN-9668.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 801fbe5b8c06 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / de6b7bc |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apa

[jira] [Commented] (YARN-9668) UGI conf doesn't read user overridden configurations on RM and NM startup



[ 
https://issues.apache.org/jira/browse/YARN-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880825#comment-16880825
 ] 

Hadoop QA commented on YARN-9668:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 37s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 
18s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 51s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}164m 57s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption |
|   | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9668 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12973973/YARN-9668.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cb09bb301852 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/pro

[jira] [Updated] (YARN-9668) UGI conf doesn't read user overridden configurations on RM and NM startup



 [ 
https://issues.apache.org/jira/browse/YARN-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-9668:

Attachment: YARN-9668.003.patch

> UGI conf doesn't read user overridden configurations on RM and NM startup
> -
>
> Key: YARN-9668
> URL: https://issues.apache.org/jira/browse/YARN-9668
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9668.001.patch, YARN-9668.002.patch, 
> YARN-9668.003.patch
>
>
> Similar to HADOOP-15150. Configs overridden thru e.g. -D or -conf are not 
> passed to the UGI conf on RM or NM startup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-9668) UGI conf doesn't read user overridden configurations on RM and NM startup