[jira] [Commented] (YARN-8377) Javadoc build failed in hadoop-yarn-server-nodemanager

2018-05-29 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494727#comment-16494727
 ] 

genericqa commented on YARN-8377:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 34s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 41s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 34m 23s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 84m 44s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8377 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925673/YARN-8377.1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cea1d9d432ca 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5f6769f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20891/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20891/testReport/ |
| Max. process+thread count | 410 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn

[jira] [Commented] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-05-29 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494726#comment-16494726
 ] 

genericqa commented on YARN-8375:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 36s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 39s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m  0s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 13s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.linux.resources.TestCGroupElasticMemoryController
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8375 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925674/YARN-8375.000.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux d7f737df3b8b 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5f6769f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20892/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20892/testReport/ |
| Max. process+thread count | 408 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-proj

[jira] [Commented] (YARN-8372) ApplicationAttemptNotFoundException should be handled correctly by Distributed Shell App Master

2018-05-29 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494690#comment-16494690
 ] 

Rohith Sharma K S commented on YARN-8372:
-

The DS app master should handle the shutdown request properly and decide whether 
to clean up based on an attempt-number check. The current behavior cleans up all 
the running containers!
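For illustration, here is a minimal, hedged sketch of what such a check could look 
like. The names {{keepContainersAcrossAttempts}}, {{isLastAttempt()}} and 
{{cleanupRunningContainers()}} are assumed for the example and are not from an 
actual patch; only the {{done}} flag exists in the current DS AM.
{code:java}
// Hedged sketch only: one way the DS AM's onShutdownRequest() could decide
// whether to clean up. keepContainersAcrossAttempts, isLastAttempt() and
// cleanupRunningContainers() are invented names for this illustration.
@Override
public void onShutdownRequest() {
  if (keepContainersAcrossAttempts && !isLastAttempt()) {
    // A later attempt can reclaim the still-running containers, so just stop
    // the callback loop without releasing them.
    done = true;
    return;
  }
  // Last attempt (or keep-containers disabled): release everything as before.
  cleanupRunningContainers();
  done = true;
}
{code}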

> ApplicationAttemptNotFoundException should be handled correctly by 
> Distributed Shell App Master
> ---
>
> Key: YARN-8372
> URL: https://issues.apache.org/jira/browse/YARN-8372
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Reporter: Charan Hebri
>Priority: Major
>
> {noformat}
> try {
>   response = client.allocate(progress);
> } catch (ApplicationAttemptNotFoundException e) {
> handler.onShutdownRequest();
> LOG.info("Shutdown requested. Stopping callback.");
> return;{noformat}
> is a code snippet from AMRMClientAsyncImpl. The corresponding 
> onShutdownRequest call for the Distributed Shell App master,
> {noformat}
> @Override
> public void onShutdownRequest() {
>   done = true;
> }{noformat}
> Due to the above code, the current behavior is that whenever an application 
> attempt fails due to an NM restart (the NM where the DS AM is running), an 
> ApplicationAttemptNotFoundException is thrown and all containers for that 
> attempt, including the ones running on other NMs, are killed by the AM and 
> marked as COMPLETE. The subsequent attempt then spawns new containers just 
> like a fresh attempt. This behavior is different from a MapReduce application, 
> where the containers are not killed.
> cc [~rohithsharma]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-05-29 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494684#comment-16494684
 ] 

Miklos Szegedi commented on YARN-8375:
--

Submitting a patch with timeouts to narrow down the failing test case. It does 
not repro locally.
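Not the attached patch, just a sketch of the kind of per-test timeout that helps 
attribute a surefire-level hang to a specific test case (class and test names 
below are illustrative):
{code:java}
import org.junit.Test;

// Hedged sketch: a JUnit 4 per-test timeout makes a hanging case fail on its
// own instead of timing out the whole surefire fork.
public class ExampleTimeoutTest {
  @Test(timeout = 20000)
  public void testDoesNotHang() throws Exception {
    // ... exercise the controller under test ...
  }
}
{code}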

> TestCGroupElasticMemoryController fails surefire build
> --
>
> Key: YARN-8375
> URL: https://issues.apache.org/jira/browse/YARN-8375
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Jason Lowe
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: YARN-8375.000.patch
>
>
> hadoop-yarn-server-nodemanager precommit builds have been failing unit tests 
> recently because TestCGroupElasticMemoryController is either exiting or 
> timing out.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-05-29 Thread Miklos Szegedi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-8375:
-
Attachment: YARN-8375.000.patch

> TestCGroupElasticMemoryController fails surefire build
> --
>
> Key: YARN-8375
> URL: https://issues.apache.org/jira/browse/YARN-8375
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Jason Lowe
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: YARN-8375.000.patch
>
>
> hadoop-yarn-server-nodemanager precommit builds have been failing unit tests 
> recently because TestCGroupElasticMemoryController is either exiting or 
> timing out.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8369) Javadoc build failed due to "bad use of '>'"

2018-05-29 Thread Takanobu Asanuma (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494678#comment-16494678
 ] 

Takanobu Asanuma commented on YARN-8369:


Sorry, there is one more instance of the same error. Filed it as YARN-8377.

> Javadoc build failed due to "bad use of '>'"
> 
>
> Key: YARN-8369
> URL: https://issues.apache.org/jira/browse/YARN-8369
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, docs
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8369.1.patch, YARN-8369.2.patch
>
>
> {noformat}
> $ mvn javadoc:javadoc --projects 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common
> ...
> [ERROR] 
> /hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceCalculator.java:263:
>  error: bad use of '>'
> [ERROR]* included) has a >0 value.
> [ERROR]  ^
> [ERROR] 
> /hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceCalculator.java:266:
>  error: bad use of '>'
> [ERROR]* @return returns true if any resource is >0
> [ERROR]  ^
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8377) Javadoc build failed in hadoop-yarn-server-nodemanager

2018-05-29 Thread Takanobu Asanuma (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494672#comment-16494672
 ] 

Takanobu Asanuma commented on YARN-8377:


I'm sorry that I didn't find the error in YARN-8369.

Uploaded the 1st patch. I've just confirmed that {{mvn clean package 
-Pdist,native -Dtar -DskipTests}} succeeded with the patch.
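For reference, the usual fix for this class of javadoc error is to escape the 
{{>}} character; a hedged before/after illustration follows (the actual patch may 
word it differently):
{code:java}
// Before: the raw '>' is rejected by the javadoc tool.
//   * When failuresValidityInterval is > 0, it also removes time entries from
// After: escaped with {@literal}; &gt; would work as well.
//   * When failuresValidityInterval is {@literal >} 0, it also removes time entries from
{code}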

> Javadoc build failed in hadoop-yarn-server-nodemanager
> --
>
> Key: YARN-8377
> URL: https://issues.apache.org/jira/browse/YARN-8377
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, docs
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Critical
> Attachments: YARN-8377.1.patch
>
>
> This is the same cause as YARN-8369.
> {code}
> [ERROR] 
> /hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/SlidingWindowRetryPolicy.java:88:
>  error: bad use of '>'
> [ERROR]* When failuresValidityInterval is > 0, it also removes time 
> entries from
> [ERROR]   ^
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8377) Javadoc build failed in hadoop-yarn-server-nodemanager

2018-05-29 Thread Takanobu Asanuma (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated YARN-8377:
---
Attachment: YARN-8377.1.patch

> Javadoc build failed in hadoop-yarn-server-nodemanager
> --
>
> Key: YARN-8377
> URL: https://issues.apache.org/jira/browse/YARN-8377
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, docs
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Critical
> Attachments: YARN-8377.1.patch
>
>
> This is the same cause as YARN-8369.
> {code}
> [ERROR] 
> /hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/SlidingWindowRetryPolicy.java:88:
>  error: bad use of '>'
> [ERROR]* When failuresValidityInterval is > 0, it also removes time 
> entries from
> [ERROR]   ^
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8377) Javadoc build failed in hadoop-yarn-server-nodemanager

2018-05-29 Thread Takanobu Asanuma (JIRA)
Takanobu Asanuma created YARN-8377:
--

 Summary: Javadoc build failed in hadoop-yarn-server-nodemanager
 Key: YARN-8377
 URL: https://issues.apache.org/jira/browse/YARN-8377
 Project: Hadoop YARN
  Issue Type: Bug
  Components: build, docs
Reporter: Takanobu Asanuma
Assignee: Takanobu Asanuma


This is the same cause as YARN-8369.

{code}
[ERROR] 
/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/SlidingWindowRetryPolicy.java:88:
 error: bad use of '>'
[ERROR]* When failuresValidityInterval is > 0, it also removes time entries 
from
[ERROR]   ^
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8320) [Umbrella] Support CPU isolation for latency-sensitive (LS) service

2018-05-29 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494656#comment-16494656
 ] 

Miklos Szegedi edited comment on YARN-8320 at 5/30/18 4:23 AM:
---

Thank you [~cheersyang] for the detailed response. The only thing that you are 
missing, I think, is that cpu and cpuset are not the same resource in cgroups. 
They are actually two dimensions of the CPU space. cpu,cpuacct controls in 
general how much time is allocated (one dimension) and cpuset controls how many 
physical devices are allocated (the second dimension). cpu,cpuacct is a 
compressible, flexible resource: more will almost always proportionally reduce 
the runtime if CPU bound. cpuset is not flexible; it depends on the thread 
factor of the container.

Just to use your example above:
{code:java}
I have a NM with capacity:
  memory: 10gb
  vcore: 10
  cpus: 10 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

Request with just vcore number (the container runs a single process and single 
thread !):
  memory: 1gb
  vcore: 5

After allocation, my NM capacity updates to 
  memory: 9gb
  vcore: 5
  cpus: 5 (0, 1, 2, 3, 4) WRONG(!) The process is single threaded, 4 cores are 
wasted.

Request with both vcore number and cpus:
  memory: 1gb
  vcore(cputime): 5
  cpuset: 1

After allocation, my NM capacity updates to 
  memory: 9gb
  vcore(cputime): 5
  cpus: 9 (0, 1, 2, 3, 4, 5, 6, 7, 8) GOOD The process is single threaded.
{code}
I understand that you would like to simplify the configuration. However, as you 
can see in the example, the situation above will never be solvable by YARN 
anymore. This is because of backward compatibility, if the current design is 
chosen.

That being said, if you still would like to follow the simplified path, please 
go ahead; I just wanted to elaborate on my concerns.
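To make the two dimensions concrete, here is a hedged, stand-alone illustration 
using raw cgroup file writes; the paths and values are made up for the example 
and this is not YARN's actual cgroups handling code or configuration:
{code:java}
import java.nio.file.Files;
import java.nio.file.Paths;

// Hedged illustration only: the two cgroup "dimensions" discussed above,
// expressed as raw cgroup file writes (paths and values are invented).
public class CpuVsCpusetExample {
  public static void main(String[] args) throws Exception {
    String cg = "/sys/fs/cgroup";
    // cpu,cpuacct: how much CPU *time* the container gets (compressible).
    // A 500ms quota per 100ms period is roughly 5 cores' worth of time.
    Files.write(Paths.get(cg, "cpu/hadoop-yarn/container_1/cpu.cfs_period_us"),
        "100000".getBytes());
    Files.write(Paths.get(cg, "cpu/hadoop-yarn/container_1/cpu.cfs_quota_us"),
        "500000".getBytes());
    // cpuset: *which* physical CPUs the container may run on (not compressible).
    // A single-threaded process only needs one core pinned to it.
    Files.write(Paths.get(cg, "cpuset/hadoop-yarn/container_1/cpuset.cpus"),
        "0".getBytes());
  }
}
{code}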


was (Author: miklos.szeg...@cloudera.com):
Thank you [~cheersyang] for the detailed response. The only thing that you are 
missing, I think, is that cpu and cpuset are not the same resource in cgroups. 
They are actually two dimensions of the CPU space. cpu,cpuacct controls in 
general how much time is allocated (one dimension) and cpuset controls how many 
physical devices are allocated (the second dimension). cpu,cpuacct is a 
compressible, flexible resource: more will almost always proportionally reduce 
the runtime if CPU bound. cpuset is not flexible; it depends on the thread 
factor of the container.

Just to use your example above:
{code:java}
I have a NM with capacity:
  memory: 10gb
  vcore: 10
  cpus: 10 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

Request with just vcore number (the container runs a single process and single 
thread !):
  memory: 1gb
  vcore: 5

After allocation, my NM capacity updates to 
  memory: 9gb
  vcore: 5
  cpus: 5 (0, 1, 2, 3, 4) WRONG(!) The process is single threaded, 4 cores are 
wasted.

Request with both vcore number and cpus:
  memory: 1gb
  vcore(cputime): 5
  cpuset: 1

After allocation, my NM capacity updates to 
  memory: 9gb
  vcore(cputime): 5
  cpus: 9 (0, 1, 2, 3, 4, 5, 6, 7, 8) GOOD The process is single threaded.
{code}
I understand that you would like to simplify the configuration. However, as you 
can see in the example above, the situation will never be solvable by YARN 
anymore. This is because of backward compatibility, if the current design is 
chosen.

That being said, if you still would like to follow the simplified path, please 
go ahead; I just wanted to elaborate on my concerns.

> [Umbrella] Support CPU isolation for latency-sensitive (LS) service
> ---
>
> Key: YARN-8320
> URL: https://issues.apache.org/jira/browse/YARN-8320
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Jiandan Yang 
>Priority: Major
> Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf, 
> CPU-isolation-for-latency-sensitive-services-v2.pdf, YARN-8320.001.patch
>
>
> Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and 
> “cpu.shares” to isolate cpu resource. However,
>  * Linux Completely Fair Scheduling (CFS) is a throughput-oriented scheduler 
> with no support for differentiated latency.
>  * Request latency of services running in containers may fluctuate frequently 
> when all containers share cpus, which latency-sensitive services cannot afford 
> in our production environment.
> So we need more fine-grained cpu isolation.
> Here we propose a solution using cgroup cpuset to bind containers to 
> different processors; this is inspired by the isolation technique in the [Borg 
> system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, 

[jira] [Commented] (YARN-8320) [Umbrella] Support CPU isolation for latency-sensitive (LS) service

2018-05-29 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494656#comment-16494656
 ] 

Miklos Szegedi commented on YARN-8320:
--

Thank you [~cheersyang] for the detailed response. The only thing that you are 
missing, I think, is that cpu and cpuset are not the same resource in cgroups. 
They are actually two dimensions of the CPU space. cpu,cpuacct controls in 
general how much time is allocated (one dimension) and cpuset controls how many 
physical devices are allocated (the second dimension). cpu,cpuacct is a 
compressible, flexible resource: more will almost always proportionally reduce 
the runtime if CPU bound. cpuset is not flexible; it depends on the thread 
factor of the container.

Just to use your example above:
{code:java}
I have a NM with capacity:
  memory: 10gb
  vcore: 10
  cpus: 10 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

Request with just vcore number (the container runs a single process and single 
thread !):
  memory: 1gb
  vcore: 5

After allocation, my NM capacity updates to 
  memory: 9gb
  vcore: 5
  cpus: 5 (0, 1, 2, 3, 4) WRONG(!) The process is single threaded, 4 cores are 
wasted.

Request with both vcore number and cpus:
  memory: 1gb
  vcore(cputime): 5
  cpuset: 1

After allocation, my NM capacity updates to 
  memory: 9gb
  vcore(cputime): 5
  cpus: 9 (0, 1, 2, 3, 4, 5, 6, 7, 8) GOOD The process is single threaded.
{code}
I understand that you would like to simplify the configuration. However, as you 
can see in the example above, the situation will never be solvable by YARN 
anymore. This is because of backward compatibility, if the current design is 
chosen.

That being said, if you still would like to follow the simplified path, please 
go ahead; I just wanted to elaborate on my concerns.

> [Umbrella] Support CPU isolation for latency-sensitive (LS) service
> ---
>
> Key: YARN-8320
> URL: https://issues.apache.org/jira/browse/YARN-8320
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Jiandan Yang 
>Priority: Major
> Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf, 
> CPU-isolation-for-latency-sensitive-services-v2.pdf, YARN-8320.001.patch
>
>
> Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and 
> “cpu.shares” to isolate cpu resource. However,
>  * Linux Completely Fair Scheduling (CFS) is a throughput-oriented scheduler 
> with no support for differentiated latency.
>  * Request latency of services running in containers may fluctuate frequently 
> when all containers share cpus, which latency-sensitive services cannot afford 
> in our production environment.
> So we need more fine-grained cpu isolation.
> Here we propose a solution using cgroup cpuset to bind containers to 
> different processors; this is inspired by the isolation technique in the [Borg 
> system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH

2018-05-29 Thread Miklos Szegedi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-8373:
-
Labels: newbie  (was: )

> RM  Received RMFatalEvent of type CRITICAL_THREAD_CRASH
> ---
>
> Key: YARN-8373
> URL: https://issues.apache.org/jira/browse/YARN-8373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Girish Bhat
>Priority: Major
>  Labels: newbie
>
>  
>  
> {noformat}
> sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 
> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 
> 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on 
> 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum 
> 0a76a9a32a5257331741f8d5932f183 This command was run using 
> /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
> This is for version 2.9.0 
>  
> {noformat}
> 2018-05-25 05:53:12,742 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received 
> RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai
> rSchedulerContinuousScheduling, that exited unexpectedly: 
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,743 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down 
> the resource manager.
> 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: a critical thread, FairSchedulerContinuousScheduling, that exited 
> unexpectedly: java.lang.IllegalArgumentException: Comparison method violates 
> its general contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,772 ERROR 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
>  ExpiredTokenRemover received java.lang.InterruptedException: sleep 
> interrupted{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH

2018-05-29 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494646#comment-16494646
 ] 

Miklos Szegedi commented on YARN-8373:
--

This example samples the unallocated resource just once, so it should avoid the 
issue above. It is still not fully consistent though.
{code:java}
public Collection<N> sortedNodeList(Comparator<Resource> comparator) {
  SortedMap<Resource, N> map = new TreeMap<>(comparator);
  readLock.lock();
  try {
    for (N node : nodes.values()) {
      map.put(node.getUnallocatedResource(), node);
    }
  } finally {
    readLock.unlock();
  }
  return map.values();
}
{code}

> RM  Received RMFatalEvent of type CRITICAL_THREAD_CRASH
> ---
>
> Key: YARN-8373
> URL: https://issues.apache.org/jira/browse/YARN-8373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Girish Bhat
>Priority: Major
>  Labels: newbie
>
>  
>  
> {noformat}
> sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 
> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 
> 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on 
> 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum 
> 0a76a9a32a5257331741f8d5932f183 This command was run using 
> /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
> This is for version 2.9.0 
>  
> {noformat}
> 2018-05-25 05:53:12,742 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received 
> RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai
> rSchedulerContinuousScheduling, that exited unexpectedly: 
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,743 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down 
> the resource manager.
> 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: a critical thread, FairSchedulerContinuousScheduling, that exited 
> unexpectedly: java.lang.IllegalArgumentException: Comparison method violates 
> its general contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,772 ERROR 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
>  ExpiredTokenRemover received java.lang.InterruptedException: sleep 
> interrupted{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH

2018-05-29 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494637#comment-16494637
 ] 

Miklos Szegedi commented on YARN-8373:
--

It seems that SchedulerNode.unallocatedResource can change while we are sorting.
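A hedged, self-contained sketch of that failure mode (a simplified Node class, not 
the actual SchedulerNode): when another thread keeps changing the value the 
comparator reads, the sort can detect an inconsistent ordering and throw the same 
"Comparison method violates its general contract!" error.
{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hedged illustration only: mutating the field a comparator reads while the
// sort is running may intermittently reproduce the TimSort contract violation.
public class MutatingSortExample {
  static class Node {
    final AtomicLong unallocated;
    Node(long v) { unallocated = new AtomicLong(v); }
  }

  public static void main(String[] args) throws Exception {
    List<Node> nodes = new ArrayList<>();
    for (int i = 0; i < 100_000; i++) {
      nodes.add(new Node(i));
    }
    Node[] mutatedConcurrently = nodes.toArray(new Node[0]);
    // Simulates allocations changing the unallocated resources mid-sort.
    Thread allocator = new Thread(() -> {
      while (!Thread.currentThread().isInterrupted()) {
        for (Node n : mutatedConcurrently) {
          n.unallocated.decrementAndGet();
        }
      }
    });
    allocator.start();
    try {
      // May throw "Comparison method violates its general contract!",
      // mirroring the FairSchedulerContinuousScheduling crash above.
      nodes.sort(Comparator.comparingLong(n -> n.unallocated.get()));
    } finally {
      allocator.interrupt();
      allocator.join();
    }
  }
}
{code}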

> RM  Received RMFatalEvent of type CRITICAL_THREAD_CRASH
> ---
>
> Key: YARN-8373
> URL: https://issues.apache.org/jira/browse/YARN-8373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Girish Bhat
>Priority: Major
>
>  
>  
> {noformat}
> sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 
> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 
> 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on 
> 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum 
> 0a76a9a32a5257331741f8d5932f183 This command was run using 
> /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
> This is for version 2.9.0 
>  
> {noformat}
> 2018-05-25 05:53:12,742 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received 
> RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai
> rSchedulerContinuousScheduling, that exited unexpectedly: 
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,743 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down 
> the resource manager.
> 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: a critical thread, FairSchedulerContinuousScheduling, that exited 
> unexpectedly: java.lang.IllegalArgumentException: Comparison method violates 
> its general contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,772 ERROR 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
>  ExpiredTokenRemover received java.lang.InterruptedException: sleep 
> interrupted{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-29 Thread Miklos Szegedi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494614#comment-16494614
 ] 

Miklos Szegedi commented on YARN-6677:
--

Indeed. How about a comparator with a condition? (A sketch follows below.)
 # It would list opportunistic containers first to kill, then guaranteed ones.
 # If the container is in the opportunistic group, it would list the ones with a 
later launch time first, regardless of bursting, to preserve 
pre-oversubscription behavior.
 # If the container is in the guaranteed group, it would list the ones bursting 
first and then the ones with a later launch time.
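A hedged sketch of such a comparator, assuming an invented {{ContainerCandidate}} 
type with {{isOpportunistic()}}, {{isBursting()}} and {{getLaunchTime()}} 
accessors (these names are illustrative, not from the NM code or the attached 
patch):
{code:java}
import java.util.Comparator;

// Hedged sketch only; ContainerCandidate and its accessors are invented.
interface ContainerCandidate {
  boolean isOpportunistic();
  boolean isBursting();
  long getLaunchTime();
}

class PreemptionOrder {
  // Containers that sort earlier are preempted first.
  static final Comparator<ContainerCandidate> ORDER = (a, b) -> {
    // 1. Opportunistic containers are preempted before guaranteed ones.
    int byType = Boolean.compare(!a.isOpportunistic(), !b.isOpportunistic());
    if (byType != 0) {
      return byType;
    }
    // 3. Among guaranteed containers, bursting ones are preempted first;
    //    for opportunistic containers, bursting is ignored (rule 2).
    if (!a.isOpportunistic()) {
      int byBursting = Boolean.compare(!a.isBursting(), !b.isBursting());
      if (byBursting != 0) {
        return byBursting;
      }
    }
    // 2./3. Within a group, the container launched later is preempted first.
    return Long.compare(b.getLaunchTime(), a.getLaunchTime());
  };
}
{code}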

 

 

> Preempt opportunistic containers when root container cgroup goes over memory 
> limit
> --
>
> Key: YARN-6677
> URL: https://issues.apache.org/jira/browse/YARN-6677
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: YARN-6677.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8369) Javadoc build failed due to "bad use of '>'"

2018-05-29 Thread Takanobu Asanuma (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494441#comment-16494441
 ] 

Takanobu Asanuma commented on YARN-8369:


Thanks for committing it, [~leftnoteasy]!

> Javadoc build failed due to "bad use of '>'"
> 
>
> Key: YARN-8369
> URL: https://issues.apache.org/jira/browse/YARN-8369
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, docs
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8369.1.patch, YARN-8369.2.patch
>
>
> {noformat}
> $ mvn javadoc:javadoc --projects 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common
> ...
> [ERROR] 
> /hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceCalculator.java:263:
>  error: bad use of '>'
> [ERROR]* included) has a >0 value.
> [ERROR]  ^
> [ERROR] 
> /hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceCalculator.java:266:
>  error: bad use of '>'
> [ERROR]* @return returns true if any resource is >0
> [ERROR]  ^
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8343) YARN should have ability to run images only from a whitelist docker registries

2018-05-29 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494370#comment-16494370
 ] 

Eric Yang commented on YARN-8343:
-

I opened a duplicate JIRA by accident. The other JIRA has a better description 
of the problem to solve. This JIRA can be closed as a duplicate.

> YARN should have ability to run images only from a whitelist docker registries
> --
>
> Key: YARN-8343
> URL: https://issues.apache.org/jira/browse/YARN-8343
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Priority: Critical
>  Labels: Docker
>
> This is a superset of docker.privileged-containers.registries; the admin can 
> specify a whitelist, and all images from non-whitelisted registries will be 
> rejected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8068) Application Priority field causes NPE in app timeline publish when Hadoop 2.7 based clients to 2.8+

2018-05-29 Thread Vinod Kumar Vavilapalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-8068:
--
Issue Type: Bug  (was: Sub-task)
Parent: (was: YARN-8347)

> Application Priority field causes NPE in app timeline publish when Hadoop 2.7 
> based clients to 2.8+
> ---
>
> Key: YARN-8068
> URL: https://issues.apache.org/jira/browse/YARN-8068
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.8.3
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Blocker
> Fix For: 3.1.0, 2.10.0, 2.9.2, 3.0.3
>
> Attachments: YARN-8068.001.patch
>
>
> {{TimelineServiceV1Publisher.appCreated}} will cause an NPE, as we use it like 
> below:
> {code:java}
> entityInfo.put(ApplicationMetricsConstants.APPLICATION_PRIORITY_INFO, 
> app.getApplicationPriority().getPriority());{code}
> We have to handle this case during recovery.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8346) Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"

2018-05-29 Thread Vinod Kumar Vavilapalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-8346:
--
Issue Type: Bug  (was: Sub-task)
Parent: (was: YARN-8347)

> Upgrading to 3.1 kills running containers with error "Opportunistic container 
> queue is full"
> 
>
> Key: YARN-8346
> URL: https://issues.apache.org/jira/browse/YARN-8346
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.2
>Reporter: Rohith Sharma K S
>Assignee: Jason Lowe
>Priority: Blocker
> Fix For: 3.1.0, 2.10.0, 3.2.0, 2.9.2, 3.0.3
>
> Attachments: YARN-8346.001.patch
>
>
> It is observed that, during a rolling upgrade from the 2.8.4 to the 3.1 release, 
> all the running containers are killed and a second attempt is launched for that 
> application. The diagnostic message is "Opportunistic container queue is 
> full", which is the reason the containers are killed. 
> In the NM log, I see the logs below after a container is recovered.
> {noformat}
> 2018-05-23 17:18:50,655 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler:
>  Opportunistic container [container_e06_1527075664705_0001_01_01] will 
> not be queued at the NMsince max queue length [0] has been reached
> {noformat}
> The following steps were executed for the rolling upgrade:
> # Install a 2.8.4 cluster and launch an MR job with the distributed cache enabled.
> # Stop the 2.8.4 RM. Start the 3.1.0 RM with the same configuration.
> # Stop the 2.8.4 NMs batch by batch. Start the 3.1.0 NMs batch by batch. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8347) [Umbrella] Upgrade efforts to Hadoop 3.x

2018-05-29 Thread Vinod Kumar Vavilapalli (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494356#comment-16494356
 ] 

Vinod Kumar Vavilapalli commented on YARN-8347:
---

Thanks for filing this, [~sunilg]!

I think this is better tracked as a Hadoop common jira so that we can cover all 
relevant topics in HDFS, MapReduce as well. Moving this to HADOOP Common.

Let's also link all the JIRAs that we find along the way - I'll link some myself.

We should eventually have a wiki page and site docs documenting the starting 
version (2.7.x, 2.8.x, etc.) and the destination version (3.0.x, 3.1.x, etc.), 
what tests were done, what issues we discovered, what API changes users should 
make, configuration & script migration by admins, etc.

> [Umbrella] Upgrade efforts to Hadoop 3.x
> 
>
> Key: YARN-8347
> URL: https://issues.apache.org/jira/browse/YARN-8347
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sunil Govindan
>Priority: Major
>
> This is an umbrella ticket to manage all similar efforts to close gaps for 
> upgrade efforts to 3.x.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8362) Number of remaining retries are updated twice after a container failure in NM

2018-05-29 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494344#comment-16494344
 ] 

Hudson commented on YARN-8362:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14312 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14312/])
YARN-8362.  Bugfix logic in container retries in node manager.   
(eyang: rev 135941e00d762a417c3b4cc524cdc59b0d1810b1)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestSlidingWindowRetryPolicy.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/SlidingWindowRetryPolicy.java


> Number of remaining retries are updated twice after a container failure in NM 
> --
>
> Key: YARN-8362
> URL: https://issues.apache.org/jira/browse/YARN-8362
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8362.001.patch, YARN-8362.002.patch
>
>
> With YARN-5015, {{shouldRetry(int errorCode)}} in {{ContainerImpl}} also 
> updated some fields in the retry context: remaining retries and restart times.
> This method is also called directly from outside the ContainerImpl class, in 
> {{ContainerLaunch.setContainerCompletedStatus}}. This causes the following 
> problems:
>  # remainingRetries is updated more than once after a failure. If 
> {{maxRetries = 1}}, then a retry will not be triggered because of multiple 
> calls to {{shouldRetry(int errorCode)}}.
>  # Writes to {{retryContext}} should be protected and performed only while the 
> write lock is held.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8308) Yarn service app fails due to issues with Renew Token

2018-05-29 Thread Gour Saha (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha updated YARN-8308:

Target Version/s: 3.1.1

> Yarn service app fails due to issues with Renew Token
> -
>
> Key: YARN-8308
> URL: https://issues.apache.org/jira/browse/YARN-8308
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Gour Saha
>Priority: Major
>
> Run a YARN service application beyond 
> dfs.namenode.delegation.token.max-lifetime. 
> The YARN service application then fails with the error below. 
> {code}
> 2018-05-15 23:14:35,652 [main] WARN  ipc.Client - Exception encountered while 
> connecting to the server : 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
> 2018-05-15 23:14:35,654 [main] INFO  service.AbstractService - Service 
> Service Master failed in state INITED
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:883)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1654)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1569)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1566)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1581)
>   at 
> org.apache.hadoop.yarn.service.utils.JsonSerDeser.load(JsonSerDeser.java:182)
>   at 
> org.apache.hadoop.yarn.service.utils.ServiceApiUtil.loadServiceFrom(ServiceApiUtil.java:337)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.loadApplicationJson(ServiceMaster.java:242)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.serviceInit(ServiceMaster.java:91)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:316)
> 2018-05-15 23:14:35,659 [main] INFO  service.ServiceMaster - Stopping app 
> master
> 2018-05-15 23:14:35,660 [main] ERROR service.ServiceMaster - Error starting 
> service master
> org.apache.hadoop.service.ServiceStateException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018

[jira] [Resolved] (YARN-8309) Diagnostic message for yarn service app failure due token renewal should be improved

2018-05-29 Thread Gour Saha (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha resolved YARN-8309.
-
Resolution: Won't Do

> Diagnostic message for yarn service app failure due token renewal should be 
> improved
> 
>
> Key: YARN-8309
> URL: https://issues.apache.org/jira/browse/YARN-8309
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Priority: Major
>
> When a YARN service application failed due to a token renewal issue, the 
> diagnostic message was unclear. 
> {code:java}
> Application application_1526413043392_0002 failed 20 times due to AM 
> Container for appattempt_1526413043392_0002_20 exited with exitCode: 1 
> Failing this attempt.Diagnostics: [2018-05-15 23:15:28.779]Exception from 
> container-launch. Container id: container_e04_1526413043392_0002_20_01 
> Exit code: 1 Exception message: Launch container failed Shell output: main : 
> command provided 1 main : run as user is hbase main : requested yarn user is 
> hbase Getting exit code file... Creating script paths... Writing pid file... 
> Writing to tmp file 
> /grid/0/hadoop/yarn/local/nmPrivate/application_1526413043392_0002/container_e04_1526413043392_0002_20_01/container_e04_1526413043392_0002_20_01.pid.tmp
>  Writing to cgroup task files... Creating local dirs... Launching 
> container... Getting exit code file... Creating script paths... [2018-05-15 
> 23:15:28.806]Container exited with a non-zero exit code 1. Error file: 
> prelaunch.err. Last 4096 bytes of prelaunch.err : [2018-05-15 
> 23:15:28.807]Container exited with a non-zero exit code 1. Error file: 
> prelaunch.err. Last 4096 bytes of prelaunch.err : For more detailed output, 
> check the application tracking page: 
> https://xxx:8090/cluster/app/application_1526413043392_0002 Then click on 
> links to logs of each attempt. . Failing the application.{code}
> Here, the diagnostic message should be improved to specify that the AM is 
> failing due to token renewal issues.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8309) Diagnostic message for yarn service app failure due token renewal should be improved

2018-05-29 Thread Gour Saha (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494326#comment-16494326
 ] 

Gour Saha commented on YARN-8309:
-

Once a fix for YARN-8308 is provided, this diagnostics-message fix won't be 
required. In fact, from a code perspective, at the phase where the token issue 
occurs, ATSv2 publisher initialization and RM registration cannot be done, so 
the AM technically cannot enhance the diagnostics message.

> Diagnostic message for yarn service app failure due token renewal should be 
> improved
> 
>
> Key: YARN-8309
> URL: https://issues.apache.org/jira/browse/YARN-8309
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Priority: Major
>
> When a YARN service application failed due to a token renewal issue, the 
> diagnostic message was unclear. 
> {code:java}
> Application application_1526413043392_0002 failed 20 times due to AM 
> Container for appattempt_1526413043392_0002_20 exited with exitCode: 1 
> Failing this attempt.Diagnostics: [2018-05-15 23:15:28.779]Exception from 
> container-launch. Container id: container_e04_1526413043392_0002_20_01 
> Exit code: 1 Exception message: Launch container failed Shell output: main : 
> command provided 1 main : run as user is hbase main : requested yarn user is 
> hbase Getting exit code file... Creating script paths... Writing pid file... 
> Writing to tmp file 
> /grid/0/hadoop/yarn/local/nmPrivate/application_1526413043392_0002/container_e04_1526413043392_0002_20_01/container_e04_1526413043392_0002_20_01.pid.tmp
>  Writing to cgroup task files... Creating local dirs... Launching 
> container... Getting exit code file... Creating script paths... [2018-05-15 
> 23:15:28.806]Container exited with a non-zero exit code 1. Error file: 
> prelaunch.err. Last 4096 bytes of prelaunch.err : [2018-05-15 
> 23:15:28.807]Container exited with a non-zero exit code 1. Error file: 
> prelaunch.err. Last 4096 bytes of prelaunch.err : For more detailed output, 
> check the application tracking page: 
> https://xxx:8090/cluster/app/application_1526413043392_0002 Then click on 
> links to logs of each attempt. . Failing the application.{code}
> Here, the diagnostic message should be improved to specify that the AM is 
> failing due to token renewal issues.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8308) Yarn service app fails due to issues with Renew Token

2018-05-29 Thread Gour Saha (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gour Saha reassigned YARN-8308:
---

Assignee: Gour Saha

> Yarn service app fails due to issues with Renew Token
> -
>
> Key: YARN-8308
> URL: https://issues.apache.org/jira/browse/YARN-8308
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Gour Saha
>Priority: Major
>
> Run a YARN service application beyond 
> dfs.namenode.delegation.token.max-lifetime. 
> The YARN service application then fails with the error below (a configuration 
> sketch for reproducing the expiry follows the log). 
> {code}
> 2018-05-15 23:14:35,652 [main] WARN  ipc.Client - Exception encountered while 
> connecting to the server : 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
> 2018-05-15 23:14:35,654 [main] INFO  service.AbstractService - Service 
> Service Master failed in state INITED
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:883)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1654)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1569)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1566)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1581)
>   at 
> org.apache.hadoop.yarn.service.utils.JsonSerDeser.load(JsonSerDeser.java:182)
>   at 
> org.apache.hadoop.yarn.service.utils.ServiceApiUtil.loadServiceFrom(ServiceApiUtil.java:337)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.loadApplicationJson(ServiceMaster.java:242)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.serviceInit(ServiceMaster.java:91)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:316)
> 2018-05-15 23:14:35,659 [main] INFO  service.ServiceMaster - Stopping app 
> master
> 2018-05-15 23:14:35,660 [main] ERROR service.ServiceMaster - Error starting 
> service master
> org.apache.hadoop.service.ServiceStateException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2
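
For context, the expiry above is governed by the HDFS delegation token lifetime
settings. A minimal hdfs-site.xml sketch for reproducing the failure quickly by
shortening those lifetimes; the values are illustrative only, not
recommendations:

{code:xml}
<!-- hdfs-site.xml: shorten delegation token lifetimes so a long-running
     YARN service outlives its HDFS delegation token (illustrative values) -->
<property>
  <name>dfs.namenode.delegation.token.renew-interval</name>
  <value>300000</value><!-- 5 minutes, in milliseconds -->
</property>
<property>
  <name>dfs.namenode.delegation.token.max-lifetime</name>
  <value>1800000</value><!-- 30 minutes, in milliseconds -->
</property>
{code}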

[jira] [Commented] (YARN-8308) Yarn service app fails due to issues with Renew Token

2018-05-29 Thread Gour Saha (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494324#comment-16494324
 ] 

Gour Saha commented on YARN-8308:
-

Will provide a patch for this issue.

> Yarn service app fails due to issues with Renew Token
> -
>
> Key: YARN-8308
> URL: https://issues.apache.org/jira/browse/YARN-8308
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Yesha Vora
>Assignee: Gour Saha
>Priority: Major
>
> Run a YARN service application beyond 
> dfs.namenode.delegation.token.max-lifetime. 
> The YARN service application then fails with the error below. 
> {code}
> 2018-05-15 23:14:35,652 [main] WARN  ipc.Client - Exception encountered while 
> connecting to the server : 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
> 2018-05-15 23:14:35,654 [main] INFO  service.AbstractService - Service 
> Service Master failed in state INITED
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 
> sequenceNumber=7, masterKeyId=8) is expired, current time: 2018-05-15 
> 23:14:35,651+ expected renewal time: 2018-05-15 23:09:59,164+
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:883)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>   at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1654)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1569)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1566)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1581)
>   at 
> org.apache.hadoop.yarn.service.utils.JsonSerDeser.load(JsonSerDeser.java:182)
>   at 
> org.apache.hadoop.yarn.service.utils.ServiceApiUtil.loadServiceFrom(ServiceApiUtil.java:337)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.loadApplicationJson(ServiceMaster.java:242)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.serviceInit(ServiceMaster.java:91)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:316)
> 2018-05-15 23:14:35,659 [main] INFO  service.ServiceMaster - Stopping app 
> master
> 2018-05-15 23:14:35,660 [main] ERROR service.ServiceMaster - Error starting 
> service master
> org.apache.hadoop.service.ServiceStateException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for hbase: HDFS_DELEGATION_TOKEN owner=hbase, renewer=yarn, 
> realUser=rm/x...@example.com, issueDate=1526423999164, maxDate=1526425799164, 

[jira] [Commented] (YARN-8329) Docker client configuration can still be set incorrectly

2018-05-29 Thread Shane Kumpf (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494267#comment-16494267
 ] 

Shane Kumpf commented on YARN-8329:
---

Thanks for the review and commit [~jlowe]!

> Docker client configuration can still be set incorrectly
> 
>
> Key: YARN-8329
> URL: https://issues.apache.org/jira/browse/YARN-8329
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
>  Labels: Docker
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8329.001.patch, YARN-8329.002.patch
>
>
> YARN-7996 implemented a fix to avoid writing an empty Docker client 
> configuration file, but there are still cases where the {{docker --config}} 
> argument is set in error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8329) Docker client configuration can still be set incorrectly

2018-05-29 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494223#comment-16494223
 ] 

Hudson commented on YARN-8329:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14311 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14311/])
YARN-8329. Docker client configuration can still be set incorrectly. (jlowe: 
rev 4827e9a9085b306bc379cb6e0b1fe4b92326edcd)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/DockerClientConfigHandler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/security/TestDockerClientConfigHandler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java


> Docker client configuration can still be set incorrectly
> 
>
> Key: YARN-8329
> URL: https://issues.apache.org/jira/browse/YARN-8329
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
>  Labels: Docker
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8329.001.patch, YARN-8329.002.patch
>
>
> YARN-7996 implemented a fix to avoid writing an empty Docker client 
> configuration file, but there are still cases where the {{docker --config}} 
> argument is set in error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8359) Exclude containermanager.linux test classes on Windows

2018-05-29 Thread JIRA


[ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494221#comment-16494221
 ] 

Íñigo Goiri commented on YARN-8359:
---

Thanks [~giovanni.fumarola] for checking.
The {{TestContainerManager}} fails randomly on the second run of  
[^YARN-8359.002.patch] but in any case, I think this is good to go.
[~jlowe], do you mind committing?

> Exclude containermanager.linux test classes on Windows
> --
>
> Key: YARN-8359
> URL: https://issues.apache.org/jira/browse/YARN-8359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-8359.001.patch, YARN-8359.002.patch
>
>
> Some of the tests in containermanager.linux.runtime fail on Windows with the 
> error message
>  *'posix:permissions' not supported as initial attribute*
> We use PosixFilePermission, which works only on operating systems that are 
> POSIX-compatible:
> A file attribute view that provides a view of the file attributes commonly 
> associated with files on file systems used by operating systems that 
> implement the Portable Operating System Interface (POSIX) family of 
> standards. Operating systems that implement the POSIX family of standards 
> commonly use file systems that have a file owner, group-owner, and related 
> access permissions. Windows unfortunately doesn't support POSIX file 
> permissions, which is why the code doesn't work there.
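
As a minimal, self-contained illustration of the failure mode (not the actual
test code), the sketch below requests POSIX permissions as an initial file
attribute; it succeeds on POSIX file systems and throws the exception above on
Windows:

{code:java}
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileAttribute;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class PosixAttributeDemo {
  public static void main(String[] args) throws Exception {
    Set<PosixFilePermission> perms = PosixFilePermissions.fromString("rwx------");
    FileAttribute<Set<PosixFilePermission>> attr =
        PosixFilePermissions.asFileAttribute(perms);
    // On Linux/macOS this creates the directory with mode 700.
    // On Windows it throws UnsupportedOperationException:
    // 'posix:permissions' not supported as initial attribute
    Path dir = Files.createTempDirectory("posix-demo", attr);
    System.out.println("Created " + dir);
  }
}
{code}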



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8329) Docker client configuration can still be set incorrectly

2018-05-29 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494133#comment-16494133
 ] 

Jason Lowe commented on YARN-8329:
--

Thanks for updating the patch!  The unit test failure is unrelated, see 
YARN-8375.

+1 for the latest patch.  Committing this.

> Docker client configuration can still be set incorrectly
> 
>
> Key: YARN-8329
> URL: https://issues.apache.org/jira/browse/YARN-8329
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8329.001.patch, YARN-8329.002.patch
>
>
> YARN-7996 implemented a fix to avoid writing an empty Docker client 
> configuration file, but there are still cases where the {{docker --config}} 
> argument is set in error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8376) Separate white list for docker.trusted.registries and docker.privileged-container.registries

2018-05-29 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8376:

Issue Type: Sub-task  (was: Improvement)
Parent: YARN-3611

> Separate white list for docker.trusted.registries and 
> docker.privileged-container.registries
> 
>
> Key: YARN-8376
> URL: https://issues.apache.org/jira/browse/YARN-8376
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Priority: Major
>  Labels: docker
>
> In the ideal world, it would be possible to have separate white lists for 
> docker registry depending on the security requirement for each type of docker 
> images:
> 1. Registries from which we can run non-privileged containers without mounts
> 2. Registries from which we can run non-privileged containers with mounts
> 3. Registries from which we can run privileged or non-privileged containers 
> with mounts
> In the current implementation, there are only type 1 and type 2 or 3. It 
> would be nice to define a separate white list to differentiate between 2 
> and 3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-29 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494123#comment-16494123
 ] 

Eric Badger commented on YARN-8342:
---

{quote}This is a progressive improvement that can be enhanced to further lock 
down the privileged registry when the demand arises. I opened YARN-8376 to track 
the separation of white lists to avoid confusion. At this time, we will label 
types 2 and 3 as docker.trusted.registries. In YARN-8376, we can label type 2 as 
docker.trusted.registries, and type 3 as docker.privileged-container.registries.
{quote}
That sounds good to me. +1 for this approach

> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: Docker
> Attachments: YARN-8342.001.patch
>
>
> While testing the Docker feature, I found that if a container comes from a 
> non-privileged docker registry, the specified launch command will be ignored. 
> The container will succeed without any log, which is very confusing to end 
> users. This behavior is also inconsistent with containers from privileged 
> docker registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8359) Exclude containermanager.linux test classes on Windows

2018-05-29 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494121#comment-16494121
 ] 

genericqa commented on YARN-8359:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
39m 23s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 40s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 35m 17s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 91m 40s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.TestContainerManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8359 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925598/YARN-8359.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  xml  |
| uname | Linux aba12c9ac0da 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9502b47 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20890/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20890/testReport/ |
| Max. process+thread count | 335 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20890/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Exclude containermanager.linux test classes on

[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-29 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494120#comment-16494120
 ] 

Eric Yang commented on YARN-8342:
-

[~ebadger] {quote}
You have high confidence in everything in this registry and therefore are 
willing to let these images be run as privileged. With a single list for 
registries (with mounts), I believe this use case would be impossible.{quote}

I agree this is a possible area for improvement.

{quote}
I agree with the launch_command change. As for the registries label change, it 
would be nice to have a plan in place for how we're going to tackle this to 
make it less confusing. However, I'm also ok making that a separate change in a 
different JIRA.
{quote}

This is a progressive improvement that can be enhanced to further lock down 
the privileged registry when the demand arises.  I opened YARN-8376 to track the 
separation of white lists to avoid confusion.  At this time, we will label 
types 2 and 3 as docker.trusted.registries.  In YARN-8376, we can label type 2 
as docker.trusted.registries, and type 3 as 
docker.privileged-container.registries.

> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: Docker
> Attachments: YARN-8342.001.patch
>
>
> While testing the Docker feature, I found that if a container comes from a 
> non-privileged docker registry, the specified launch command will be ignored. 
> The container will succeed without any log, which is very confusing to end 
> users. This behavior is also inconsistent with containers from privileged 
> docker registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8324) Fix TestPrivilegedOperationExecutor.testExecutorPath on Windows

2018-05-29 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494122#comment-16494122
 ] 

genericqa commented on YARN-8324:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 37s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 35s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 34m 19s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 91m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8324 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925597/YARN-8324.v3.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux a3d35796f64a 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 
19:09:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9502b47 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20889/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20889/testReport/ |
| Max. process+thread count | 333 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20889/console

[jira] [Commented] (YARN-8324) Fix TestPrivilegedOperationExecutor.testExecutorPath on Windows

2018-05-29 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494116#comment-16494116
 ] 

genericqa commented on YARN-8324:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 27s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 33m 57s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 92m 49s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8324 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925597/YARN-8324.v3.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 85bce4def567 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9502b47 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20887/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20887/testReport/ |
| Max. process+thread count | 335 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20887/console

[jira] [Created] (YARN-8376) Separate white list for docker.trusted.registries and docker.privileged-container.registries

2018-05-29 Thread Eric Yang (JIRA)
Eric Yang created YARN-8376:
---

 Summary: Separate white list for docker.trusted.registries and 
docker.privileged-container.registries
 Key: YARN-8376
 URL: https://issues.apache.org/jira/browse/YARN-8376
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Eric Yang


In the ideal world, it would be possible to have separate white lists for 
docker registry depending on the security requirement for each type of docker 
images:

1. Registries from which we can run non-privileged containers without mounts
2. Registries from which we can run non-privileged containers with mounts
3. Registries from which we can run privileged or non-privileged containers 
with mounts

In the current implementation, there are only type 1 and type 2 or 3. It would 
be nice to define a separate white list to differentiate between 2 and 3 (see 
the sketch below).
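
To make the distinction concrete, here is a minimal container-executor.cfg
sketch of the current single-list behavior and the proposed split; the
docker.privileged-container.registries key is the proposal tracked by this
JIRA, not an existing setting, and the registry names are placeholders:

{code}
[docker]
  module.enabled=true
  docker.privileged-containers.enabled=true
  # Today: one list covers both "mounts allowed" and "privileged allowed"
  # (types 2 and 3 above)
  docker.trusted.registries=registry.example.com,team-registry.example.com

  # Proposed: keep type 2 in docker.trusted.registries and move type 3 to a
  # separate list (hypothetical key, subject to this JIRA)
  # docker.privileged-container.registries=team-registry.example.com
{code}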



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8359) Exclude containermanager.linux test classes on Windows

2018-05-29 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494105#comment-16494105
 ] 

genericqa commented on YARN-8359:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
39m 20s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 33m 59s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 89m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8359 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925598/YARN-8359.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  xml  |
| uname | Linux e019420dd2b4 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9502b47 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20888/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20888/testReport/ |
| Max. process+thread count | 336 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20888/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Exclude containermanager.linux test classes on Windows
> --
>
> Key: YARN-8359
> URL: 

[jira] [Commented] (YARN-7953) [GQ] Data structures for federation global queues calculations

2018-05-29 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494102#comment-16494102
 ] 

genericqa commented on YARN-7953:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 36s{color} 
| {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 1 new + 22 unchanged - 0 fixed = 23 total (was 22) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 26s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 7 new + 0 unchanged - 0 fixed = 7 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 34s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
12s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 generated 5 new + 0 unchanged - 0 fixed = 5 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
24s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 1 new + 4 unchanged - 0 fixed = 5 total (was 4) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 41s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}119m 50s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
|  |  Possible null pointer dereference of f in 
org.apache.hadoop.yarn.server.resourcemanager.federation.globalqueues.FedQueue.recursiveChildrenByName(FedQueue,
 String)  Dereferenced at FedQueue.java:f in 
org.apache.hadoop.yarn.server.resourcemanager.federation.globalqueues.FedQueue.recursiveChildrenByName(FedQueue,
 String)  Dereferenced at FedQueue.java:[line 379] |
|  |  Nullcheck of FedQueue.children at line 162 of value previously 
dereferenced in 
org.apache.hadoop.yarn.server.resourcemanager.federatio

[jira] [Commented] (YARN-8359) Exclude containermanager.linux test classes on Windows

2018-05-29 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494077#comment-16494077
 ] 

Giovanni Matteo Fumarola commented on YARN-8359:


Yes, the entire block containermanager.linux is skipped.

The test execution goes from launcher to localizer.

[INFO] Running 
org.apache.hadoop.yarn.server.nodemanager.containermanager.*launcher*.TestContainersLauncher
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.601 s 
- in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainersLauncher
[INFO] Running 
org.apache.hadoop.yarn.server.nodemanager.containermanager.*localizer*.sharedcache.TestSharedCacheUploader
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.785 s 
- in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.sharedcache.TestSharedCacheUploader

> Exclude containermanager.linux test classes on Windows
> --
>
> Key: YARN-8359
> URL: https://issues.apache.org/jira/browse/YARN-8359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-8359.001.patch, YARN-8359.002.patch
>
>
> Some of the tests in containermanager.linux.runtime fail on Windows with the 
> error message
>  *'posix:permissions' not supported as initial attribute*
> We use PosixFilePermission, which works only on operating systems that are 
> POSIX-compatible:
> A file attribute view that provides a view of the file attributes commonly 
> associated with files on file systems used by operating systems that 
> implement the Portable Operating System Interface (POSIX) family of 
> standards. Operating systems that implement the POSIX family of standards 
> commonly use file systems that have a file owner, group-owner, and related 
> access permissions. Windows unfortunately doesn't support POSIX file 
> permissions, which is why the code doesn't work there.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8370) Some nodemanager tests fail on Windows due to improper path/file separator

2018-05-29 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493997#comment-16493997
 ] 

Giovanni Matteo Fumarola commented on YARN-8370:


Let's hold off on this one. YARN-8359 disables the whole test set for Windows.

> Some nodemanager tests fail on Windows due to improper path/file separator
> --
>
> Key: YARN-8370
> URL: https://issues.apache.org/jira/browse/YARN-8370
> Project: Hadoop YARN
>  Issue Type: Test
>Reporter: Anbang Hu
>Assignee: Anbang Hu
>Priority: Minor
>  Labels: Windows
> Attachments: YARN-8370.000.patch
>
>
> The following nodemanager tests fail on Windows due to a path/file separator 
> issue (a generic illustration follows the list):
> * 
> [TestPrivilegedOperationExecutor#testExecutorPath|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged/TestPrivilegedOperationExecutor/testExecutorPath/]
> * 
> [TestLocalDirsHandlerService#testGetFullDirs|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager/TestLocalDirsHandlerService/testGetFullDirs/]
> * 
> [TestAppLogAggregatorImpl|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation/TestAppLogAggregatorImpl/]
> * 
> [TestCGroupsHandlerImpl#testCGroupOperations|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources/TestCGroupsHandlerImpl/testCGroupOperations/]
> * 
> [TestCGroupsHandlerImpl#testMtabParsing|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources/TestCGroupsHandlerImpl/testMtabParsing/]
> * 
> [TestCGroupsHandlerImpl#testCGroupPaths|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources/TestCGroupsHandlerImpl/testCGroupPaths/]
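
As a generic illustration of this class of failure (not the actual test code),
hard-coding "/" in expected paths breaks string comparisons on Windows, whereas
building paths through java.nio keeps them portable:

{code:java}
import java.io.File;
import java.nio.file.Paths;

public class SeparatorDemo {
  public static void main(String[] args) {
    // A hard-coded separator matches expectations on Linux but not on Windows,
    // where File.separator is "\" and Paths.get() joins segments with it.
    String hardCoded = "target" + "/" + "test-dir";
    String portable = Paths.get("target", "test-dir").toString();
    System.out.println("hard-coded: " + hardCoded);
    System.out.println("portable:   " + portable);
    System.out.println("separator on this OS: " + File.separator);
  }
}
{code}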



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-29 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493990#comment-16493990
 ] 

Eric Badger commented on YARN-8342:
---

bq. Sudo users can easily change the configuration to allow the untrusted 
registry to become trusted. It would be very difficult to prevent sudo users 
from untrusted registries. This is a procedure problem rather than coding 
problem.
Certainly. If the user is a sudo user then they would be able to change the 
configuration. Again, my idea behind this is to decrease the potential attack 
surface, and more to give a user an easy way to run an untrusted image with a 
smaller scope. If they truly wanted to run the image full-bore, then they 
absolutely could. For example, I might want to try out some random image from 
dockerhub that I don't trust at all. Therefore, I don't want to run it in 
the same way in which I run images from my trusted internal registry with 
trusted and audited images. 

bq. Let's make sure we agree on the required code fix. If
docker.privileged-containers.enabled is disabled and users put images in
docker.trusted.registries, the images in docker.trusted.registries behave like
type 2. When docker.privileged-containers.enabled is enabled and users put
images in docker.trusted.registries, the images behave like type 3. Registries
not described in trusted registries are type 1 regardless of the
docker.privileged-containers.enabled setting. Hence, renaming
docker.privileged-containers.registries to docker.trusted.registries can
address the confusion.
Yes, for a single registry this would work. However, the use case is when you
have multiple registries that you would like to treat differently. If you have
a single docker.trusted.registries list, then either all or none of the
registries you add to it support privileged containers. However, there could be
a separation between the two. For example, there could be a company-wide
registry that is "trusted", but not trusted enough to run its images as
privileged, and a team-specific registry that is trusted and heavily audited by
your team. You have high confidence in everything in that registry and are
therefore willing to let those images run as privileged. With a single list for
registries (with mounts), I believe this use case would be impossible.

bq. This JIRA is going to tweak type 1 to allow launch_command to be supplied
and change the docker.privileged-containers.registries label. Do we agree these
are the right safety valves and changes that are going to happen?
I agree with the launch_command change. As for the registries label change, it
would be nice to have a plan in place for how we're going to tackle it to make
it less confusing. However, I'm also okay with making that a separate change in
a different JIRA.

bq. Eric Badger Btw, docker.privileged-containers.enabled is a 
container-executor.cfg property, not a user supplied setting. Is this how it is 
enforced on your side?
Yes, privileged containers as a whole are enabled/disabled based on a
container-executor.cfg property. However, requesting them for a particular
container is still up to the user.
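
For readers following the thread, here is a rough illustration of the
container-executor.cfg knobs being discussed. Treat the section layout and the
docker.trusted.registries key as assumptions: the rename was still under
discussion at this point, and the values are made up for illustration only.

{code}
[docker]
  # illustrative values only
  docker.privileged-containers.enabled=false
  # proposed rename of docker.privileged-containers.registries
  docker.trusted.registries=registry.internal.example.com
{code}

With a layout like this, a sudo user could of course still edit the file, which
is the procedural point made above.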

> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: Docker
> Attachments: YARN-8342.001.patch
>
>
> During testing of the Docker feature, I found that if a container comes from a
> non-privileged docker registry, the specified launch command will be ignored.
> The container will succeed without any log, which is very confusing to end
> users. This behavior is also inconsistent with containers from privileged
> docker registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-05-29 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen reassigned YARN-8375:


Assignee: Miklos Szegedi

> TestCGroupElasticMemoryController fails surefire build
> --
>
> Key: YARN-8375
> URL: https://issues.apache.org/jira/browse/YARN-8375
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Jason Lowe
>Assignee: Miklos Szegedi
>Priority: Major
>
> hadoop-yarn-server-nodemanager precommit builds have been failing unit tests 
> recently because TestCGroupElasticMemoryController is either exiting or 
> timing out.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-05-29 Thread Haibo Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493985#comment-16493985
 ] 

Haibo Chen commented on YARN-8375:
--

Thanks [~jlowe] for reporting. We'll take a look at this.

> TestCGroupElasticMemoryController fails surefire build
> --
>
> Key: YARN-8375
> URL: https://issues.apache.org/jira/browse/YARN-8375
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Jason Lowe
>Priority: Major
>
> hadoop-yarn-server-nodemanager precommit builds have been failing unit tests 
> recently because TestCGroupElasticMemoryController is either exiting or 
> timing out.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8359) Exclude containermanager.linux test classes on Windows

2018-05-29 Thread JIRA


[ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493973#comment-16493973
 ] 

Íñigo Goiri commented on YARN-8359:
---

The Yetus run for [^YARN-8359.001.patch] seems right.
[~giovanni.fumarola], did you confirm that this test now indeed gets ignored 
for Windows?

> Exclude containermanager.linux test classes on Windows
> --
>
> Key: YARN-8359
> URL: https://issues.apache.org/jira/browse/YARN-8359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-8359.001.patch, YARN-8359.002.patch
>
>
> Some of the tests in containermanager.linux.runtime fail on Windows due to
> *Error Message*
>  *'posix:permissions' not supported as initial attribute*
> We use PosixFilePermission, which can be used only with operating systems
> that are compatible with POSIX:
> A file attribute view that provides a view of the file attributes commonly
> associated with files on file systems used by operating systems that
> implement the Portable Operating System Interface (POSIX) family of
> standards. Operating systems that implement the POSIX family of standards
> commonly use file systems that have a file owner, group-owner, and related
> access permissions. Windows unfortunately doesn't support POSIX file systems,
> which is why this code doesn't work.
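
As a side note for anyone reproducing this locally, here is a minimal,
self-contained sketch of the failure mode described above; the class and file
names are made up for illustration and are not part of the YARN test code.

{code:java}
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileAttribute;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class PosixAttrDemo {
  public static void main(String[] args) throws Exception {
    Set<PosixFilePermission> perms = PosixFilePermissions.fromString("rwx------");
    FileAttribute<Set<PosixFilePermission>> attr =
        PosixFilePermissions.asFileAttribute(perms);
    try {
      // Works on POSIX file systems; on Windows/NTFS this throws
      // UnsupportedOperationException: 'posix:permissions' not supported as initial attribute
      Path p = Files.createTempFile("demo", ".tmp", attr);
      System.out.println("Created " + p);
    } catch (UnsupportedOperationException e) {
      System.out.println("POSIX attributes not supported here: " + e.getMessage());
    }
  }
}
{code}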



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8324) Fix TestPrivilegedOperationExecutor.testExecutorPath on Windows

2018-05-29 Thread JIRA


[ 
https://issues.apache.org/jira/browse/YARN-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493969#comment-16493969
 ] 

Íñigo Goiri commented on YARN-8324:
---

Let's hold on this one as YARN-8359 may disable the whole test set for Windows.

> Fix TestPrivilegedOperationExecutor.testExecutorPath on Windows
> ---
>
> Key: YARN-8324
> URL: https://issues.apache.org/jira/browse/YARN-8324
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8324.v1.patch, YARN-8324.v2.patch, 
> YARN-8324.v3.patch, image-2018-05-18-14-42-56-314.png, 
> image-2018-05-18-14-43-09-321.png
>
>
> Fix TestPrivilegedOperationExecutor.testExecutorPath on Windows because of 
> the path format.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-29 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493963#comment-16493963
 ] 

Jason Lowe commented on YARN-4599:
--

TestCGroupElasticMemoryController is failing precommit builds recently, see 
YARN-8375. [~haibochen] [~miklos.szeg...@cloudera.com] can you take a look?

> Set OOM control for memory cgroups
> --
>
> Key: YARN-4599
> URL: https://issues.apache.org/jira/browse/YARN-4599
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Karthik Kambatla
>Assignee: Miklos Szegedi
>Priority: Major
>  Labels: oct16-medium
> Fix For: 3.2.0
>
> Attachments: Elastic Memory Control in YARN.pdf, YARN-4599.000.patch, 
> YARN-4599.001.patch, YARN-4599.002.patch, YARN-4599.003.patch, 
> YARN-4599.004.patch, YARN-4599.005.patch, YARN-4599.006.patch, 
> YARN-4599.007.patch, YARN-4599.008.patch, YARN-4599.009.patch, 
> YARN-4599.010.patch, YARN-4599.011.patch, YARN-4599.012.patch, 
> YARN-4599.013.patch, YARN-4599.014.patch, YARN-4599.015.patch, 
> YARN-4599.016.patch, YARN-4599.sandflee.patch, yarn-4599-not-so-useful.patch
>
>
> YARN-1856 adds memory cgroups enforcing support. We should also explicitly 
> set OOM control so that containers are not killed as soon as they go over 
> their usage. Today, one could set the swappiness to control this, but 
> clusters with swap turned off exist.
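
For context, here is a hedged sketch of the cgroup v1 knob the description
refers to. The cgroup path below is hypothetical and not an actual YARN
constant, and this is not the project's implementation.

{code:java}
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OomControlSketch {
  public static void main(String[] args) throws Exception {
    // Writing "1" to memory.oom_control disables the kernel OOM killer for the
    // cgroup: tasks are paused rather than killed when the limit is hit, so the
    // NodeManager can decide which container to preempt.
    Path oomControl = Paths.get(
        "/sys/fs/cgroup/memory/hadoop-yarn/container_e01_000001/memory.oom_control");
    Files.write(oomControl, "1".getBytes(StandardCharsets.UTF_8));
  }
}
{code}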



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-05-29 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493962#comment-16493962
 ] 

Jason Lowe commented on YARN-8375:
--

Example precommit log at 
https://builds.apache.org/job/PreCommit-YARN-Build/20884/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt

Relevant excerpts from that log (note the lack of test results from 
TestCGroupElasticMemoryController):
{noformat}
[INFO] Running 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.TestCompareResourceCalculators
[WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.084 
s - in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.TestCompareResourceCalculators
[INFO] Running 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.TestCGroupElasticMemoryController
[INFO] Running 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.TestFpgaResourceHandler
[INFO] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.82 s - 
in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.TestFpgaResourceHandler
[...]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 33:58 min
[INFO] Finished at: 2018-05-29T17:30:40+00:00
[INFO] Final Memory: 23M/653M
[INFO] 
[WARNING] The requested profile "parallel-tests" could not be activated because 
it does not exist.
[WARNING] The requested profile "yarn-ui" could not be activated because it 
does not exist.
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
project hadoop-yarn-server-nodemanager: There was a timeout or other error in 
the fork -> [Help 1]
{noformat}
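
To reproduce this locally, the failing test can be run on its own in the
nodemanager module with standard Maven flags (a sketch; adjust to your
checkout):

{noformat}
mvn test -Dtest=TestCGroupElasticMemoryController \
  -pl hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
{noformat}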


> TestCGroupElasticMemoryController fails surefire build
> --
>
> Key: YARN-8375
> URL: https://issues.apache.org/jira/browse/YARN-8375
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Jason Lowe
>Priority: Major
>
> hadoop-yarn-server-nodemanager precommit builds have been failing unit tests 
> recently because TestCGroupElasticMemoryController is either exiting or 
> timing out.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4781) Support intra-queue preemption for fairness ordering policy.

2018-05-29 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493957#comment-16493957
 ] 

genericqa commented on YARN-4781:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 24m 
55s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
30s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 21s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 8 new + 61 unchanged - 0 fixed = 69 total (was 61) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 66m 
21s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}112m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:f667ef1 |
| JIRA Issue | YARN-4781 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925568/YARN-4781.005.branch-2.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux a9eea3eb4e55 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / 09fbbff |
| maven | version: Apache Maven 3.3.9 
(bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) |
| Default Java | 1.7.0_171 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/20885/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20885/testReport/ |
| Max. process+thread count | 809 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20885/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Support intra-queue preemption for fairness ordering policy.
> --

[jira] [Created] (YARN-8375) TestCGroupElasticMemoryController fails surefire build

2018-05-29 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-8375:


 Summary: TestCGroupElasticMemoryController fails surefire build
 Key: YARN-8375
 URL: https://issues.apache.org/jira/browse/YARN-8375
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.2.0
Reporter: Jason Lowe


hadoop-yarn-server-nodemanager precommit builds have been failing unit tests 
recently because TestCGroupElasticMemoryController is either exiting or timing 
out.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8359) Exclude containermanager.linux test classes on Windows

2018-05-29 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493956#comment-16493956
 ] 

Giovanni Matteo Fumarola commented on YARN-8359:


Thanks [~jlowe] for taking care of it.

+1 on v2.

> Exclude containermanager.linux test classes on Windows
> --
>
> Key: YARN-8359
> URL: https://issues.apache.org/jira/browse/YARN-8359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-8359.001.patch, YARN-8359.002.patch
>
>
> Some of the tests in containermanager.linux.runtime fail on Windows due to
> *Error Message*
>  *'posix:permissions' not supported as initial attribute*
> We use PosixFilePermission, which can be used only with operating systems
> that are compatible with POSIX:
> A file attribute view that provides a view of the file attributes commonly
> associated with files on file systems used by operating systems that
> implement the Portable Operating System Interface (POSIX) family of
> standards. Operating systems that implement the POSIX family of standards
> commonly use file systems that have a file owner, group-owner, and related
> access permissions. Windows unfortunately doesn't support POSIX file systems,
> which is why this code doesn't work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8359) Exclude containermanager.linux test classes on Windows

2018-05-29 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493941#comment-16493941
 ] 

Jason Lowe commented on YARN-8359:
--

The unit test failure is unrelated.  I attached a new patch that fixes the tab
whitespace issue.


> Exclude containermanager.linux test classes on Windows
> --
>
> Key: YARN-8359
> URL: https://issues.apache.org/jira/browse/YARN-8359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-8359.001.patch, YARN-8359.002.patch
>
>
> Some of the tests in containermanager.linux.runtime fail on Windows due to
> *Error Message*
>  *'posix:permissions' not supported as initial attribute*
> We use PosixFilePermission, which can be used only with operating systems
> that are compatible with POSIX:
> A file attribute view that provides a view of the file attributes commonly
> associated with files on file systems used by operating systems that
> implement the Portable Operating System Interface (POSIX) family of
> standards. Operating systems that implement the POSIX family of standards
> commonly use file systems that have a file owner, group-owner, and related
> access permissions. Windows unfortunately doesn't support POSIX file systems,
> which is why this code doesn't work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8359) Exclude containermanager.linux test classes on Windows

2018-05-29 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-8359:
-
Attachment: YARN-8359.002.patch

> Exclude containermanager.linux test classes on Windows
> --
>
> Key: YARN-8359
> URL: https://issues.apache.org/jira/browse/YARN-8359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-8359.001.patch, YARN-8359.002.patch
>
>
> Some of the tests in containermanager.linux.runtime fail on Windows due to
> *Error Message*
>  *'posix:permissions' not supported as initial attribute*
> We use PosixFilePermission, which can be used only with operating systems
> that are compatible with POSIX:
> A file attribute view that provides a view of the file attributes commonly
> associated with files on file systems used by operating systems that
> implement the Portable Operating System Interface (POSIX) family of
> standards. Operating systems that implement the POSIX family of standards
> commonly use file systems that have a file owner, group-owner, and related
> access permissions. Windows unfortunately doesn't support POSIX file systems,
> which is why this code doesn't work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8324) Fix TestPrivilegedOperationExecutor.testExecutorPath on Windows

2018-05-29 Thread Giovanni Matteo Fumarola (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-8324:
---
Attachment: YARN-8324.v3.patch

> Fix TestPrivilegedOperationExecutor.testExecutorPath on Windows
> ---
>
> Key: YARN-8324
> URL: https://issues.apache.org/jira/browse/YARN-8324
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8324.v1.patch, YARN-8324.v2.patch, 
> YARN-8324.v3.patch, image-2018-05-18-14-42-56-314.png, 
> image-2018-05-18-14-43-09-321.png
>
>
> Fix TestPrivilegedOperationExecutor.testExecutorPath on Windows because of 
> the path format.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-29 Thread Haibo Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493926#comment-16493926
 ] 

Haibo Chen commented on YARN-6677:
--

Thanks [~miklos.szeg...@cloudera.com] for the comment.
{quote}Could you just update DefaultOOMHandler with your code?
{quote}
This is doable, but it will change the current DefaultOOMHandler behavior
drastically. Specifically, OPPORTUNISTIC containers would be killed regardless
of whether their individual usage is over their limit. We can also preserve the
existing behavior for GUARANTEED containers without the two-level sort
comparator. Let me know if that's okay with you.
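
To make the "two-level sort" above concrete, here is a small, self-contained
sketch. This is not the actual DefaultOOMHandler code; the ContainerCandidate
type and its fields are made up for illustration.

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class OomVictimOrderSketch {

  static final class ContainerCandidate {
    final String id;
    final boolean guaranteed;  // GUARANTEED vs. OPPORTUNISTIC execution type
    final long usageBytes;
    final long limitBytes;

    ContainerCandidate(String id, boolean guaranteed, long usageBytes, long limitBytes) {
      this.id = id;
      this.guaranteed = guaranteed;
      this.usageBytes = usageBytes;
      this.limitBytes = limitBytes;
    }
  }

  public static void main(String[] args) {
    List<ContainerCandidate> candidates = new ArrayList<>(Arrays.asList(
        new ContainerCandidate("c1", true, 900, 1000),     // guaranteed, under limit
        new ContainerCandidate("c2", false, 500, 1000),    // opportunistic, under limit
        new ContainerCandidate("c3", false, 1500, 1000))); // opportunistic, over limit

    // Level 1: opportunistic containers are considered before guaranteed ones.
    // Level 2: within each class, the container furthest over its limit comes first.
    candidates.sort(
        Comparator.comparing((ContainerCandidate c) -> c.guaranteed)
            .thenComparing(Comparator.comparingLong(
                (ContainerCandidate c) -> c.usageBytes - c.limitBytes).reversed()));

    candidates.forEach(c -> System.out.println(c.id));  // prints c3, c2, c1
  }
}
{code}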

> Preempt opportunistic containers when root container cgroup goes over memory 
> limit
> --
>
> Key: YARN-6677
> URL: https://issues.apache.org/jira/browse/YARN-6677
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: YARN-6677.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8359) Exclude containermanager.linux test classes on Windows

2018-05-29 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493903#comment-16493903
 ] 

genericqa commented on YARN-8359:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
38m 37s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 3 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 41s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 34m 10s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 89m 32s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8359 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925214/YARN-8359.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  xml  |
| uname | Linux 02e7d6b1d746 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 438ef49 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/20884/artifact/out/whitespace-tabs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/20884/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20884/testReport/ |
| Max. process+thread count | 336 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20884/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Exclude containermanager.linux test classes on Windows
> -

[jira] [Assigned] (YARN-7953) [GQ] Data structures for federation global queues calculations

2018-05-29 Thread Subru Krishnan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan reassigned YARN-7953:


Assignee: Abhishek Modi  (was: Carlo Curino)

> [GQ] Data structures for federation global queues calculations
> --
>
> Key: YARN-7953
> URL: https://issues.apache.org/jira/browse/YARN-7953
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Carlo Curino
>Assignee: Abhishek Modi
>Priority: Major
> Attachments: YARN-7953.v1.patch
>
>
> This Jira tracks data structures and helper classes used by the core 
> algorithms of YARN-7402 umbrella Jira (currently YARN-7403, and YARN-7834).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8370) Some nodemanager tests fail on Windows due to improper path/file separator

2018-05-29 Thread Anbang Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493888#comment-16493888
 ] 

Anbang Hu commented on YARN-8370:
-

Thanks [~miklos.szeg...@cloudera.com] and [~elgoiri] for the suggestion. It 
does make sense to disable these tests on Windows. Let's leave this JIRA and 
see how [YARN-8359|https://issues.apache.org/jira/browse/YARN-8359] goes.

> Some nodemanager tests fail on Windows due to improper path/file separator
> --
>
> Key: YARN-8370
> URL: https://issues.apache.org/jira/browse/YARN-8370
> Project: Hadoop YARN
>  Issue Type: Test
>Reporter: Anbang Hu
>Assignee: Anbang Hu
>Priority: Minor
>  Labels: Windows
> Attachments: YARN-8370.000.patch
>
>
> The following nodemanager tests fail on Windows due to a path/file separator issue:
> * 
> [TestPrivilegedOperationExecutor#testExecutorPath|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged/TestPrivilegedOperationExecutor/testExecutorPath/]
> * 
> [TestLocalDirsHandlerService#testGetFullDirs|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager/TestLocalDirsHandlerService/testGetFullDirs/]
> * 
> [TestAppLogAggregatorImpl|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation/TestAppLogAggregatorImpl/]
> * 
> [TestCGroupsHandlerImpl#testCGroupOperations|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources/TestCGroupsHandlerImpl/testCGroupOperations/]
> * 
> [TestCGroupsHandlerImpl#testMtabParsing|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources/TestCGroupsHandlerImpl/testMtabParsing/]
> * 
> [TestCGroupsHandlerImpl#testCGroupPaths|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources/TestCGroupsHandlerImpl/testCGroupPaths/]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8365) Revisit the record type used by Registry DNS for upstream resolution

2018-05-29 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493884#comment-16493884
 ] 

Eric Yang commented on YARN-8365:
-

Root record can still be fetched through ANY or SOA if upstream supports them:

{code}
dig @registry.dns.host . ANY
{code}

This is less aggressive than the default behavior.  Hence, the change should 
have negligible effect.

> Revisit the record type used by Registry DNS for upstream resolution
> 
>
> Key: YARN-8365
> URL: https://issues.apache.org/jira/browse/YARN-8365
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
> Attachments: YARN-8365.001.patch
>
>
> YARN-7326 leveraged the ANY record type for upstream resolution, but some 
> implementations [don't support 
> ANY|https://tools.ietf.org/html/draft-ietf-dnsop-refuse-any-06] due to the 
> potential for abuse, namely Cloudflare. Docker Hub leverages Cloudflare for 
> image distribution, so when Registry DNS is used as the sole resolver, docker 
> image downloads are failing. 
> {code:java}
> [root@host ~]# docker run -u root -it centos bash
> Unable to find image 'centos:latest' locally
> latest: Pulling from library/centos
> 469cfcc7a4b3: Already exists
> docker: error pulling image configuration: Get 
> https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/e9/e934aafc22064b7322c0250f1e32e5ce93b2d19b356f4537f5864bd102e8531f/data?verify=1527265495-nG8jk%2Bya9qrdPVlXRKGMnOhSnV0%3D:
>  dial tcp: lookup production.cloudflare.docker.com on registry.dns.host:53: 
> no such host.
> {code}
> {code:java}
> [root@host~]# nslookup production.cloudflare.docker.com registry.dns.host
> Server:   registry.dns.host
> Address:  registry.dns.host#53
> Non-authoritative answer:
> production.cloudflare.docker.com  hinfo = "ANY obsoleted" "See 
> draft-ietf-dnsop-refuse-any"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8359) Exclude containermanager.linux test classes on Windows

2018-05-29 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-8359:
-
Summary: Exclude containermanager.linux test classes on Windows  (was: 
Disable containermanager.linux.runtime.TEST to run on Windows)

I updated the JIRA summary.  As for the Yetus run, I only recently moved this 
to Patch Available.  I had uploaded the patch as a proof-of-concept, but it 
looks like we can use it directly.  So the QA bot should be coming along soon.

> Exclude containermanager.linux test classes on Windows
> --
>
> Key: YARN-8359
> URL: https://issues.apache.org/jira/browse/YARN-8359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-8359.001.patch
>
>
> Some of the tests in containermanager.linux.runtime fail on Windows due to
> *Error Message*
>  *'posix:permissions' not supported as initial attribute*
> We use PosixFilePermission, which can be used only with operating systems
> that are compatible with POSIX:
> A file attribute view that provides a view of the file attributes commonly
> associated with files on file systems used by operating systems that
> implement the Portable Operating System Interface (POSIX) family of
> standards. Operating systems that implement the POSIX family of standards
> commonly use file systems that have a file owner, group-owner, and related
> access permissions. Windows unfortunately doesn't support POSIX file systems,
> which is why this code doesn't work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8365) Revisit the record type used by Registry DNS for upstream resolution

2018-05-29 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493870#comment-16493870
 ] 

Eric Yang commented on YARN-8365:
-

[~shaneku...@gmail.com] Thanks for the patch.  This change will break the
recursion fix that was put in with YARN-7326 based on Allen's comments, but I
think it is fine to harden DNS to prevent [reflection
attacks|https://www.us-cert.gov/ncas/alerts/TA13-088A].  I will commit this
tomorrow if there are no objections.

> Revisit the record type used by Registry DNS for upstream resolution
> 
>
> Key: YARN-8365
> URL: https://issues.apache.org/jira/browse/YARN-8365
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
>Priority: Major
> Attachments: YARN-8365.001.patch
>
>
> YARN-7326 leveraged the ANY record type for upstream resolution, but some 
> implementations [don't support 
> ANY|https://tools.ietf.org/html/draft-ietf-dnsop-refuse-any-06] due to the 
> potential for abuse, namely Cloudflare. Docker Hub leverages Cloudflare for 
> image distribution, so when Registry DNS is used as the sole resolver, docker 
> image downloads are failing. 
> {code:java}
> [root@host ~]# docker run -u root -it centos bash
> Unable to find image 'centos:latest' locally
> latest: Pulling from library/centos
> 469cfcc7a4b3: Already exists
> docker: error pulling image configuration: Get 
> https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/e9/e934aafc22064b7322c0250f1e32e5ce93b2d19b356f4537f5864bd102e8531f/data?verify=1527265495-nG8jk%2Bya9qrdPVlXRKGMnOhSnV0%3D:
>  dial tcp: lookup production.cloudflare.docker.com on registry.dns.host:53: 
> no such host.
> {code}
> {code:java}
> [root@host~]# nslookup production.cloudflare.docker.com registry.dns.host
> Server:   registry.dns.host
> Address:  registry.dns.host#53
> Non-authoritative answer:
> production.cloudflare.docker.com  hinfo = "ANY obsoleted" "See 
> draft-ietf-dnsop-refuse-any"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8369) Javadoc build failed due to "bad use of '>'"

2018-05-29 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493843#comment-16493843
 ] 

Hudson commented on YARN-8369:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14308 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14308/])
YARN-8369. Javadoc build failed due to 'bad use of >'. (Takanobu Asanuma 
(wangda: rev 17aa40f669f197d43387d67dc00040d14cd00948)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/CapacitySchedulerPreemptionUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceCalculator.java


> Javadoc build failed due to "bad use of '>'"
> 
>
> Key: YARN-8369
> URL: https://issues.apache.org/jira/browse/YARN-8369
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, docs
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8369.1.patch, YARN-8369.2.patch
>
>
> {noformat}
> $ mvn javadoc:javadoc --projects 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common
> ...
> [ERROR] 
> /hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceCalculator.java:263:
>  error: bad use of '>'
> [ERROR]* included) has a >0 value.
> [ERROR]  ^
> [ERROR] 
> /hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceCalculator.java:266:
>  error: bad use of '>'
> [ERROR]* @return returns true if any resource is >0
> [ERROR]  ^
> {noformat}
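
For what it's worth, here is a minimal illustration of the kind of Javadoc
escaping that avoids this doclint error. The class below is illustrative only,
not the actual ResourceCalculator code.

{code:java}
/** Illustrative only. */
public class JavadocEscapeDemo {
  /**
   * Returns true if any value is {@literal >} 0 (equivalently written as &gt; 0).
   * Writing a bare '>' in the comment is what newer doclints reject as
   * "bad use of '{@literal >}'".
   */
  public boolean anyAboveZero(long[] values) {
    for (long v : values) {
      if (v > 0) {
        return true;
      }
    }
    return false;
  }
}
{code}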



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8339) Service AM should localize static/archive resource types to container working directory instead of 'resources'

2018-05-29 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493842#comment-16493842
 ] 

Hudson commented on YARN-8339:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14308 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14308/])
YARN-8339. Service AM should localize static/archive resource types to (wangda: 
rev 3061bfcde53210d2032df3814243498b27a997b7)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/provider/TestProviderUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/provider/ProviderUtils.java


> Service AM should localize static/archive resource types to container working 
> directory instead of 'resources' 
> ---
>
> Key: YARN-8339
> URL: https://issues.apache.org/jira/browse/YARN-8339
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8339.1.patch
>
>
> This is to address one of the review comments posted by [~wangda] in 
> YARN-8079 at 
> https://issues.apache.org/jira/browse/YARN-8079?focusedCommentId=16482065&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16482065



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-29 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493824#comment-16493824
 ] 

Eric Yang commented on YARN-8342:
-

[~ebadger] Btw, docker.privileged-containers.enabled is a 
container-executor.cfg property, not a user supplied setting.  Is this how it 
is enforced on your side?

> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: Docker
> Attachments: YARN-8342.001.patch
>
>
> During testing of the Docker feature, I found that if a container comes from a
> non-privileged docker registry, the specified launch command will be ignored.
> The container will succeed without any log, which is very confusing to end
> users. This behavior is also inconsistent with containers from privileged
> docker registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8369) Javadoc build failed due to "bad use of '>'"

2018-05-29 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8369:
-
Priority: Critical  (was: Major)

> Javadoc build failed due to "bad use of '>'"
> 
>
> Key: YARN-8369
> URL: https://issues.apache.org/jira/browse/YARN-8369
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, docs
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8369.1.patch, YARN-8369.2.patch
>
>
> {noformat}
> $ mvn javadoc:javadoc --projects 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common
> ...
> [ERROR] 
> /hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceCalculator.java:263:
>  error: bad use of '>'
> [ERROR]* included) has a >0 value.
> [ERROR]  ^
> [ERROR] 
> /hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceCalculator.java:266:
>  error: bad use of '>'
> [ERROR]* @return returns true if any resource is >0
> [ERROR]  ^
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8338) TimelineService V1.5 doesn't come up after HADOOP-15406

2018-05-29 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493812#comment-16493812
 ] 

Hudson commented on YARN-8338:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14307 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14307/])
YARN-8338. TimelineService V1.5 doesn't come up after HADOOP-15406. (jlowe: rev 
31ab960f4f931df273481927b897388895d803ba)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/pom.xml
* (edit) hadoop-project/pom.xml


> TimelineService V1.5 doesn't come up after HADOOP-15406
> ---
>
> Key: YARN-8338
> URL: https://issues.apache.org/jira/browse/YARN-8338
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Critical
> Fix For: 3.2.0, 3.1.1, 3.0.3
>
> Attachments: YARN-8338.txt
>
>
> TimelineService V1.5 fails with the following:
> {code}
> java.lang.NoClassDefFoundError: org/objenesis/Objenesis
>   at 
> org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.(RollingLevelDBTimelineStore.java:174)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8359) Disable containermanager.linux.runtime.TEST to run on Windows

2018-05-29 Thread JIRA


[ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493814#comment-16493814
 ] 

Íñigo Goiri commented on YARN-8359:
---

I would like to get a Yetus run though.
Re-upload?

> Disable containermanager.linux.runtime.TEST to run on Windows
> -
>
> Key: YARN-8359
> URL: https://issues.apache.org/jira/browse/YARN-8359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-8359.001.patch
>
>
> Some of the tests in containermanager.linux.runtime fail on Windows due to
> *Error Message*
>  *'posix:permissions' not supported as initial attribute*
> We use PosixFilePermission, which can be used only with operating systems
> that are compatible with POSIX:
> A file attribute view that provides a view of the file attributes commonly
> associated with files on file systems used by operating systems that
> implement the Portable Operating System Interface (POSIX) family of
> standards. Operating systems that implement the POSIX family of standards
> commonly use file systems that have a file owner, group-owner, and related
> access permissions. Windows unfortunately doesn't support POSIX file systems,
> which is why this code doesn't work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8359) Disable containermanager.linux.runtime.TEST to run on Windows

2018-05-29 Thread JIRA


[ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493810#comment-16493810
 ] 

Íñigo Goiri commented on YARN-8359:
---

Thanks [~jlowe].
+1 on  [^YARN-8359.001.patch].
We should change the title of the JIRA as it does not disable only the runtime 
but the whole 
{{org.apache.hadoop.yarn.server.nodemanager.containermanager.linux}}.

For the record, the [current Windows 
daily|https://builds.apache.org/job/hadoop-trunk-win/481/testReport/] build has 
almost 30 errors caused by this class.

> Disable containermanager.linux.runtime.TEST to run on Windows
> -
>
> Key: YARN-8359
> URL: https://issues.apache.org/jira/browse/YARN-8359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-8359.001.patch
>
>
> Some of the tests in containermanager.linux.runtime fail on Windows due to
> *Error Message*
>  *'posix:permissions' not supported as initial attribute*
> We use PosixFilePermission, which can be used only with operating systems
> that are compatible with POSIX:
> A file attribute view that provides a view of the file attributes commonly
> associated with files on file systems used by operating systems that
> implement the Portable Operating System Interface (POSIX) family of
> standards. Operating systems that implement the POSIX family of standards
> commonly use file systems that have a file owner, group-owner, and related
> access permissions. Windows unfortunately doesn't support POSIX file systems,
> which is why this code doesn't work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-29 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493804#comment-16493804
 ] 

Eric Yang commented on YARN-8342:
-

[~ebadger] {quote}
Depending on the registry it comes from, yes. If the registry is a black box 
and operated by some 3rd party, then you might not want that image to be run 
with mounts at all.{quote}

Sudo users can easily change the configuration to allow the untrusted registry
to become trusted.  It would be very difficult to prevent sudo users from using
untrusted registries.  This is a procedural problem rather than a coding
problem.

{quote}
Since I don't ever want to run a privileged container, it seems prudent to not 
allow users to run them instead of trusting that users won't run them.
{quote}

Let's make sure we agree on the required code fix.  If
docker.privileged-containers.enabled is disabled and users put images in
docker.trusted.registries, the images in docker.trusted.registries behave like
type 2.  When docker.privileged-containers.enabled is enabled and users put
images in docker.trusted.registries, the images behave like type 3.
Registries not described in trusted registries are type 1 regardless of the
docker.privileged-containers.enabled setting.  Hence, renaming
docker.privileged-containers.registries to docker.trusted.registries can
address the confusion.

This JIRA is going to tweak type 1 to allow launch_command to be supplied and
change the docker.privileged-containers.registries label.  Do we agree these
are the right safety valves and changes that are going to happen?

> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: Docker
> Attachments: YARN-8342.001.patch
>
>
> During testing of the Docker feature, I found that if a container comes from a
> non-privileged docker registry, the specified launch command will be ignored.
> The container will succeed without any log, which is very confusing to end
> users. This behavior is also inconsistent with containers from privileged
> docker registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8374) Upgrade objenesis dependency

2018-05-29 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-8374:


 Summary: Upgrade objenesis dependency
 Key: YARN-8374
 URL: https://issues.apache.org/jira/browse/YARN-8374
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineservice
Reporter: Jason Lowe


After HADOOP-14918 is committed we should be able to remove the explicit 
objenesis dependency and objenesis exclusion from the fst dependency to pick up 
the version fst wants naturally.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8333) Load balance YARN services using RegistryDNS multiple A records

2018-05-29 Thread Billie Rinaldi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493774#comment-16493774
 ] 

Billie Rinaldi commented on YARN-8333:
--

Hey [~eyang], I think this is a great idea. I tested out the patch and it 
worked well. My only suggestion is that we should add information about the new 
records to ServiceDiscovery.md.
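
As a side note for anyone verifying the behavior, a Java client can take 
advantage of the multi-A records by resolving every address for the component 
name and rotating through them. A minimal sketch, assuming the hostname from 
the example in the description rather than a real cluster:
{code}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinResolver {
  private final AtomicInteger next = new AtomicInteger();

  /** Resolve every A record for the component name and pick the next one in round-robin order. */
  public InetAddress pick(String componentName) throws UnknownHostException {
    InetAddress[] all = InetAddress.getAllByName(componentName);
    return all[Math.floorMod(next.getAndIncrement(), all.length)];
  }

  public static void main(String[] args) throws Exception {
    RoundRobinResolver resolver = new RoundRobinResolver();
    // Hostname taken from the example in the description; adjust to your RegistryDNS domain.
    System.out.println(resolver.pick("appcatalog.appname.hbase.ycluster"));
  }
}
{code}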

> Load balance YARN services using RegistryDNS multiple A records
> ---
>
> Key: YARN-8333
> URL: https://issues.apache.org/jira/browse/YARN-8333
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8333.001.patch, YARN-8333.002.patch
>
>
> For scaling stateless containers, it would be great to support DNS round 
> robin for fault tolerance and load balancing.  The current DNS record format 
> for RegistryDNS is 
> [container-instance].[application-name].[username].[domain].  For example:
> {code}
> appcatalog-0.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog-1.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog-2.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog-3.appname.hbase.ycluster. IN A 123.123.123.123
> {code}
> It would be nice to add a multi-A record that contains all IP addresses of 
> the same component, in addition to the instance-based records.  For example:
> {code}
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.123
> {code}






[jira] [Commented] (YARN-4781) Support intra-queue preemption for fairness ordering policy.

2018-05-29 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493761#comment-16493761
 ] 

Eric Payne commented on YARN-4781:
--

Thanks a lot [~sunilg]! I attached patch {{YARN-4781.005.branch-2.patch}}, 
which should apply cleanly to branch-2, branch-2.9 and branch-2.8.

> Support intra-queue preemption for fairness ordering policy.
> 
>
> Key: YARN-4781
> URL: https://issues.apache.org/jira/browse/YARN-4781
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Wangda Tan
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-4781.001.patch, YARN-4781.002.patch, 
> YARN-4781.003.patch, YARN-4781.004.patch, YARN-4781.005.branch-2.patch, 
> YARN-4781.005.patch
>
>
> We introduced the fairness queue policy in YARN-3319, which lets large 
> applications make progress without starving small applications. However, if a 
> large application takes the queue’s resources and the containers of the large 
> app have long lifespans, small applications could still wait a long time for 
> resources and SLAs cannot be guaranteed.
> Instead of waiting for applications to release resources on their own, we 
> need to preempt resources in queues with the fairness policy enabled.






[jira] [Commented] (YARN-8370) Some nodemanager tests fail on Windows due to improper path/file separator

2018-05-29 Thread Íñigo Goiri (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493763#comment-16493763
 ] 

Íñigo Goiri commented on YARN-8370:
---

Note that [~jlowe] has proposed disabling the Linux tests at the pom level in 
YARN-8359.
We may want to solve all of them at the same time with that approach.
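
For context on the separator issue itself, the usual culprit is a hard-coded 
'/' or a Unix-style absolute path in test fixtures. A minimal, hypothetical 
before/after sketch (not taken from the failing tests):
{code}
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SeparatorExample {
  public static void main(String[] args) {
    // Fragile: hard-codes '/' and a Unix-style root, so path comparisons break on Windows.
    String fragile = "/tmp/nm-local-dir" + "/" + "usercache";

    // Portable: let java.nio assemble the path with the platform separator.
    Path portable = Paths.get(System.getProperty("java.io.tmpdir"), "nm-local-dir", "usercache");

    System.out.println("fragile   = " + fragile);
    System.out.println("portable  = " + portable);
    System.out.println("separator = " + File.separator);
  }
}
{code}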

> Some nodemanager tests fail on Windows due to improper path/file separator
> --
>
> Key: YARN-8370
> URL: https://issues.apache.org/jira/browse/YARN-8370
> Project: Hadoop YARN
>  Issue Type: Test
>Reporter: Anbang Hu
>Assignee: Anbang Hu
>Priority: Minor
>  Labels: Windows
> Attachments: YARN-8370.000.patch
>
>
> Following nodemanager tests fail on Windows due to path/file separator issue:
> * 
> [TestPrivilegedOperationExecutor#testExecutorPath|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged/TestPrivilegedOperationExecutor/testExecutorPath/]
> * 
> [TestLocalDirsHandlerService#testGetFullDirs|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager/TestLocalDirsHandlerService/testGetFullDirs/]
> * 
> [TestAppLogAggregatorImpl|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation/TestAppLogAggregatorImpl/]
> * 
> [TestCGroupsHandlerImpl#testCGroupOperations|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources/TestCGroupsHandlerImpl/testCGroupOperations/]
> * 
> [TestCGroupsHandlerImpl#testMtabParsing|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources/TestCGroupsHandlerImpl/testMtabParsing/]
> * 
> [TestCGroupsHandlerImpl#testCGroupPaths|https://builds.apache.org/job/hadoop-trunk-win/479/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources/TestCGroupsHandlerImpl/testCGroupPaths/]






[jira] [Updated] (YARN-4781) Support intra-queue preemption for fairness ordering policy.

2018-05-29 Thread Eric Payne (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-4781:
-
Attachment: YARN-4781.005.branch-2.patch

> Support intra-queue preemption for fairness ordering policy.
> 
>
> Key: YARN-4781
> URL: https://issues.apache.org/jira/browse/YARN-4781
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Wangda Tan
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-4781.001.patch, YARN-4781.002.patch, 
> YARN-4781.003.patch, YARN-4781.004.patch, YARN-4781.005.branch-2.patch, 
> YARN-4781.005.patch
>
>
> We introduced the fairness queue policy in YARN-3319, which lets large 
> applications make progress without starving small applications. However, if a 
> large application takes the queue’s resources and the containers of the large 
> app have long lifespans, small applications could still wait a long time for 
> resources and SLAs cannot be guaranteed.
> Instead of waiting for applications to release resources on their own, we 
> need to preempt resources in queues with the fairness policy enabled.






[jira] [Assigned] (YARN-8359) Disable containermanager.linux.runtime.TEST to run on Windows

2018-05-29 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reassigned YARN-8359:


Assignee: Jason Lowe

> Disable containermanager.linux.runtime.TEST to run on Windows
> -
>
> Key: YARN-8359
> URL: https://issues.apache.org/jira/browse/YARN-8359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Jason Lowe
>Priority: Major
> Attachments: YARN-8359.001.patch
>
>
> Some of the tests in containermanager.linux.runtime fail on Windows with the 
> *Error Message*
>  *'posix:permissions' not supported as initial attribute*
> We use PosixFilePermission, which can only be used with operating systems 
> that are compatible with POSIX:
> A file attribute view that provides a view of the file attributes commonly 
> associated with files on file systems used by operating systems that 
> implement the Portable Operating System Interface (POSIX) family of 
> standards. Operating systems that implement the POSIX family of standards 
> commonly use file systems that have a file owner, group-owner, and related 
> access permissions. Windows unfortunately doesn't support POSIX file 
> systems, which is why this code doesn't work.
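
As an illustration of the failure mode described above: passing POSIX 
permissions as an initial attribute throws on Windows, so a portable pattern is 
to first check whether the default file system supports the "posix" attribute 
view. This is a hypothetical sketch, not the actual test code:
{code}
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.FileAttribute;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class PosixAttributeExample {
  public static void main(String[] args) throws Exception {
    Path dir = Paths.get(System.getProperty("java.io.tmpdir"), "posix-attr-demo");
    if (FileSystems.getDefault().supportedFileAttributeViews().contains("posix")) {
      // POSIX platforms: create the directory with rwxr-x--- as an initial attribute.
      Set<PosixFilePermission> perms = PosixFilePermissions.fromString("rwxr-x---");
      FileAttribute<Set<PosixFilePermission>> attr = PosixFilePermissions.asFileAttribute(perms);
      Files.createDirectories(dir, attr);
    } else {
      // Windows: 'posix:permissions' is not supported as an initial attribute, so skip it.
      Files.createDirectories(dir);
    }
    System.out.println("Created " + dir);
  }
}
{code}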






[jira] [Commented] (YARN-8359) Disable containermanager.linux.runtime.TEST to run on Windows

2018-05-29 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493744#comment-16493744
 ] 

Jason Lowe commented on YARN-8359:
--

bq. Is this used somewhere else?

Using the pom to exclude a chunk of tests is currently being done by 
hadoop-kms, hadoop-hdfs-https, hadoop-azure (for failsafe vs. surefire but 
similar concept), hadoop-yarn-server-nodemanager, and hadoop-yarn-registry.


> Disable containermanager.linux.runtime.TEST to run on Windows
> -
>
> Key: YARN-8359
> URL: https://issues.apache.org/jira/browse/YARN-8359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8359.001.patch
>
>
> Some of the tests in containermanager.linux.runtime fail on Windows with the 
> *Error Message*
>  *'posix:permissions' not supported as initial attribute*
> We use PosixFilePermission, which can only be used with operating systems 
> that are compatible with POSIX:
> A file attribute view that provides a view of the file attributes commonly 
> associated with files on file systems used by operating systems that 
> implement the Portable Operating System Interface (POSIX) family of 
> standards. Operating systems that implement the POSIX family of standards 
> commonly use file systems that have a file owner, group-owner, and related 
> access permissions. Windows unfortunately doesn't support POSIX file 
> systems, which is why this code doesn't work.






[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-29 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493743#comment-16493743
 ] 

Eric Badger commented on YARN-8342:
---

bq. Is there any reason to block sudo users from running images in 
non-privileged containers with mounts?
Depending on the registry it comes from, yes. If the registry is a black box 
and operated by some 3rd party, then you might not want that image to be run 
with mounts at all.

bq. 3 is a superset of 2. The control valve for privileged container or 
non-privileged container is through sudo check. Privileged and non-privileged 
users can use 3 as 2 without making 2 and 3 as separate support type.
Yes, 3 is a superset of 2. However, I would never use 3 in my cluster. I don't 
want users to run privileged containers. It increases the surface area for bugs 
related to privileged code and opens up the possibility of users elevating 
their container's privileges just to get something to work, even when that's 
not the correct solution. Since I don't ever want to run a privileged 
container, it seems prudent to not allow users to run them instead of trusting 
that they won't. And of course, if you don't care about the distinction between 
the two, you would simply populate 3 and leave 2 empty.


> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Eric Yang
>Priority: Critical
>  Labels: Docker
> Attachments: YARN-8342.001.patch
>
>
> During testing of the Docker feature, I found that if a container comes from 
> a non-privileged docker registry, the specified launch command will be 
> ignored. The container will succeed without any log output, which is very 
> confusing to end users. This behavior is also inconsistent with containers 
> from privileged docker registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]






[jira] [Commented] (YARN-8320) [Umbrella] Support CPU isolation for latency-sensitive (LS) service

2018-05-29 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493538#comment-16493538
 ] 

Weiwei Yang commented on YARN-8320:
---

Thanks [~miklos.szeg...@cloudera.com] for sharing your idea. You were right 
that the original idea was to make this easy to use. That is, users don't need 
to know which set of cpus their containers will be running on or how they are 
configured. They just give us a cpu_share_mode, and we do all the tricks 
underneath without exposing too many details.

My concerns about the approach you suggested are:
 # It might be complex for users to use.
 # It can support 2 modes, but it is not straightforward to extend it to 
4 modes.

Allow me to take an example like the following:

{noformat}
I have a NM with capacity:
  memory: 10gb
  vcore: 10
  cpus: 10 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

Request with just cpu number:
  memory: 1gb
  vcore: 5
  cpuset: 5

After allocation, my NM capacity updates to 
  memory: 9gb
  vcore: 5
  cpus: 5 (0, 1, 2, 3, 4)
{noformat}

There are a few problems with such an approach:
 # Users might get confused about how many cpus to ask for in the resource 
request. Vcore is already a difficult thing to set today, and adding a new type 
of resource might make this harder.
 # When the number of vcores is not the same as the number of processors on the 
NM, users will need to do some calculation to set a reasonable cpuset value so 
that they neither over- nor under-use cpu resources, and this is hard for the 
RM to check because it doesn't have all the information the NM does.
 # It is difficult to support all 4 modes under the current resource APIs.

Please let me know if there is anything wrong with this example or the comments.
I agree we can start by supporting EXCLUSIVE+ANY mode in phase 1, but I still 
want to make sure the design can be extended to support those modes as well 
(because the RESERVED/SHARE modes are very useful for improving utilization). I 
will consolidate all the comments from you and [~leftnoteasy] and come up with 
a new version of the design doc next week. I look forward to your comments, as 
always.

Thanks

> [Umbrella] Support CPU isolation for latency-sensitive (LS) service
> ---
>
> Key: YARN-8320
> URL: https://issues.apache.org/jira/browse/YARN-8320
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Jiandan Yang 
>Priority: Major
> Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf, 
> CPU-isolation-for-latency-sensitive-services-v2.pdf, YARN-8320.001.patch
>
>
> Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and 
> “cpu.shares” to isolate cpu resources. However,
>  * Linux Completely Fair Scheduling (CFS) is a throughput-oriented scheduler 
> with no support for differentiated latency.
>  * Request latency of services running in containers may fluctuate frequently 
> when all containers share cpus, which latency-sensitive services cannot 
> afford in our production environment.
> So we need more fine-grained cpu isolation.
> Here we propose a solution that uses cgroup cpuset to bind containers to 
> different processors; this is inspired by the isolation technique in the 
> [Borg system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf].
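
To make the proposed mechanism concrete, the cpuset controller pins a process 
to specific processors by creating a per-container cgroup directory and writing 
the allowed cpus and memory nodes into it. A minimal, hypothetical sketch of 
that mechanism (paths and names are illustrative, it assumes a Linux host with 
the cpuset controller mounted, and it is not the patch's implementation):
{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CpusetBinder {
  // Illustrative mount point; real clusters configure this through the NM cgroups settings.
  private static final Path CPUSET_ROOT = Paths.get("/sys/fs/cgroup/cpuset/hadoop-yarn");

  /** Pin a container's process to the given processors, e.g. cpus="0-3", mems="0". */
  public static void bind(String containerId, String cpus, String mems, long pid)
      throws IOException {
    Path group = CPUSET_ROOT.resolve(containerId);
    Files.createDirectories(group);
    // cpuset requires both cpuset.cpus and cpuset.mems to be set before tasks can be added.
    Files.write(group.resolve("cpuset.cpus"), cpus.getBytes(StandardCharsets.UTF_8));
    Files.write(group.resolve("cpuset.mems"), mems.getBytes(StandardCharsets.UTF_8));
    Files.write(group.resolve("tasks"), Long.toString(pid).getBytes(StandardCharsets.UTF_8));
  }
}
{code}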






[jira] [Updated] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH

2018-05-29 Thread Girish Bhat (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Girish Bhat updated YARN-8373:
--
Affects Version/s: 2.9.0

> RM  Received RMFatalEvent of type CRITICAL_THREAD_CRASH
> ---
>
> Key: YARN-8373
> URL: https://issues.apache.org/jira/browse/YARN-8373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.9.0
>Reporter: Girish Bhat
>Priority: Major
>
>  
>  
> {noformat}
> sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 
> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 
> 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on 
> 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum 
> 0a76a9a32a5257331741f8d5932f183 This command was run using 
> /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
> This is for version 2.9.0 
>  
> {noformat}
> 2018-05-25 05:53:12,742 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received 
> RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai
> rSchedulerContinuousScheduling, that exited unexpectedly: 
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,743 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down 
> the resource manager.
> 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: a critical thread, FairSchedulerContinuousScheduling, that exited 
> unexpectedly: java.lang.IllegalArgumentException: Comparison method violates 
> its general contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,772 ERROR 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
>  ExpiredTokenRemover received java.lang.InterruptedException: sleep 
> interrupted{noformat}
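
For context, this exception is characteristic of a comparator whose ordering 
changes while TimSort is running, for example because node resources are 
updated concurrently with continuous scheduling. A minimal, hypothetical 
reproduction (not YARN code) that can intermittently trigger the same 
IllegalArgumentException:
{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicBoolean;

public class UnstableComparatorDemo {
  // Mutable value that another thread changes while the sort is in progress.
  static class Node { volatile int available; Node(int a) { available = a; } }

  public static void main(String[] args) throws Exception {
    Random random = new Random();
    List<Node> nodes = new ArrayList<>();
    for (int i = 0; i < 50_000; i++) {
      nodes.add(new Node(random.nextInt(1000)));
    }

    AtomicBoolean done = new AtomicBoolean(false);
    Thread mutator = new Thread(() -> {
      // Simulates heartbeats updating node resources during continuous scheduling.
      while (!done.get()) {
        nodes.get(random.nextInt(nodes.size())).available = random.nextInt(1000);
      }
    });
    mutator.start();

    try {
      // The comparator reads mutable state, so TimSort may observe an inconsistent ordering
      // and throw "Comparison method violates its general contract!" on some runs.
      nodes.sort(Comparator.comparingInt((Node n) -> n.available));
    } finally {
      done.set(true);
      mutator.join();
    }
  }
}
{code}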






[jira] [Updated] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH

2018-05-29 Thread Girish Bhat (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Girish Bhat updated YARN-8373:
--
Description: 
 

 
{noformat}
sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 Subversion 
https://git-wip-us.apache.org/repos/asf/hadoop.git -r 
756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on 
2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum 
0a76a9a32a5257331741f8d5932f183 This command was run using 
/usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
This is for version 2.9.0 

 
{noformat}
2018-05-25 05:53:12,742 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received 
RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai
rSchedulerContinuousScheduling, that exited unexpectedly: 
java.lang.IllegalArgumentException: Comparison method violates its general 
contract!
at java.util.TimSort.mergeHi(TimSort.java:899)
at java.util.TimSort.mergeAt(TimSort.java:516)
at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
at java.util.TimSort.sort(TimSort.java:254)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1454)
at java.util.Collections.sort(Collections.java:175)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)

2018-05-25 05:53:12,743 FATAL 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down 
the resource manager.
2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
status 1: a critical thread, FairSchedulerContinuousScheduling, that exited 
unexpectedly: java.lang.IllegalArgumentException: Comparison method violates 
its general contract!
at java.util.TimSort.mergeHi(TimSort.java:899)
at java.util.TimSort.mergeAt(TimSort.java:516)
at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
at java.util.TimSort.sort(TimSort.java:254)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1454)
at java.util.Collections.sort(Collections.java:175)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)

2018-05-25 05:53:12,772 ERROR 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
 ExpiredTokenRemover received java.lang.InterruptedException: sleep 
interrupted{noformat}

  was:
 

This is for version 2.9.0 
{noformat}
2018-05-25 05:53:12,742 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received 
RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai
rSchedulerContinuousScheduling, that exited unexpectedly: 
java.lang.IllegalArgumentException: Comparison method violates its general 
contract!
at java.util.TimSort.mergeHi(TimSort.java:899)
at java.util.TimSort.mergeAt(TimSort.java:516)
at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
at java.util.TimSort.sort(TimSort.java:254)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1454)
at java.util.Collections.sort(Collections.java:175)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)

2018-05-25 05:53:12,743 FATAL 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down 
the resource manager.
2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
status 1: a critical thread, FairSchedulerContinuousScheduling, that exited 
unexpectedly: java.lang.IllegalArgumentException: Comparison method violates 
its general contract!
at java.util.TimSort.mergeHi(TimSort.java:899)
at java.util.TimSort.mergeAt(TimSort.java:516)
at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
at java.util.TimSort.sort(TimSort.java:254)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1454)
at java.util.Collections.sort(Collections.java:175)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.jav

[jira] [Updated] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH

2018-05-29 Thread Girish Bhat (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Girish Bhat updated YARN-8373:
--
Environment: (was: {noformat}
sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 Subversion 
https://git-wip-us.apache.org/repos/asf/hadoop.git -r 
756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on 
2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum 
0a76a9a32a5257331741f8d5932f183 This command was run using 
/usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
 )

> RM  Received RMFatalEvent of type CRITICAL_THREAD_CRASH
> ---
>
> Key: YARN-8373
> URL: https://issues.apache.org/jira/browse/YARN-8373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Girish Bhat
>Priority: Major
>
>  
> This is for version 2.9.0 
> {noformat}
> 2018-05-25 05:53:12,742 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received 
> RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai
> rSchedulerContinuousScheduling, that exited unexpectedly: 
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,743 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down 
> the resource manager.
> 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: a critical thread, FairSchedulerContinuousScheduling, that exited 
> unexpectedly: java.lang.IllegalArgumentException: Comparison method violates 
> its general contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,772 ERROR 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
>  ExpiredTokenRemover received java.lang.InterruptedException: sleep 
> interrupted{noformat}






[jira] [Created] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH

2018-05-29 Thread Girish Bhat (JIRA)
Girish Bhat created YARN-8373:
-

 Summary: RM  Received RMFatalEvent of type CRITICAL_THREAD_CRASH
 Key: YARN-8373
 URL: https://issues.apache.org/jira/browse/YARN-8373
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
 Environment: {noformat}
sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 Subversion 
https://git-wip-us.apache.org/repos/asf/hadoop.git -r 
756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on 
2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum 
0a76a9a32a5257331741f8d5932f183 This command was run using 
/usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
 
Reporter: Girish Bhat


 

This is for version 2.9.0 
{noformat}
2018-05-25 05:53:12,742 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received 
RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai
rSchedulerContinuousScheduling, that exited unexpectedly: 
java.lang.IllegalArgumentException: Comparison method violates its general 
contract!
at java.util.TimSort.mergeHi(TimSort.java:899)
at java.util.TimSort.mergeAt(TimSort.java:516)
at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
at java.util.TimSort.sort(TimSort.java:254)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1454)
at java.util.Collections.sort(Collections.java:175)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)

2018-05-25 05:53:12,743 FATAL 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down 
the resource manager.
2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
status 1: a critical thread, FairSchedulerContinuousScheduling, that exited 
unexpectedly: java.lang.IllegalArgumentException: Comparison method violates 
its general contract!
at java.util.TimSort.mergeHi(TimSort.java:899)
at java.util.TimSort.mergeAt(TimSort.java:516)
at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
at java.util.TimSort.sort(TimSort.java:254)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1454)
at java.util.Collections.sort(Collections.java:175)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)

2018-05-25 05:53:12,772 ERROR 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
 ExpiredTokenRemover received java.lang.InterruptedException: sleep 
interrupted{noformat}


