[jira] [Commented] (YARN-9897) Add an Aarch64 CI for YARN

2019-10-25 Thread zhao bo (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960235#comment-16960235
 ] 

zhao bo commented on YARN-9897:
---

Hi [~eyang],

Thanks very much for raising this. We have already posted a Jira issue to the 
infra team, see https://issues.apache.org/jira/browse/INFRA-18761 .

The comments there show we have a longer-term plan to bring more Aarch64 VMs 
into the Apache infra pool so that more ARM Jenkins jobs can run, but right now 
the whole process is blocked on our VM provider's side; they have delayed 
several times :(. We keep pushing them to get all the resources ready as soon 
as possible, and I think the ARM resources will be coming soon. ;)

 

Hi [~christ],

Sorry for disturbing you again. :) The Hadoop-side conversation mentioned the 
work on integrating ARM VMs into Apache Jenkins, so I think it is worth 
bringing it to your attention as well; it should also help the follow-up work 
on specific Apache projects once the ARM VMs have successfully joined Jenkins. 
Thank you very much. 

> Add an Aarch64 CI for YARN
> --
>
> Key: YARN-9897
> URL: https://issues.apache.org/jira/browse/YARN-9897
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: build, test
>Reporter: Zhenyu Zheng
>Priority: Major
> Attachments: hadoop_build.log
>
>
> YARN is the resource manager of Hadoop, and a large number of other software 
> projects also use YARN for resource management. The capability of running YARN 
> on platforms with different architectures, and of managing hardware resources 
> of different architectures, could be very important and useful.
> Aarch64 (ARM) is currently the dominant architecture in small devices such as 
> phones, IoT devices, security cameras, drones, etc. With increasing computing 
> capability and increasing connection speeds such as 5G networks, there could 
> be great possibilities and opportunities for world-changing innovations and 
> new markets if we can manage and make use of those devices as well.
> Currently, all YARN CIs are based on the x86 architecture. We have been 
> performing tests on Aarch64 and proposing possible solutions for the problems 
> we have met, for example:
> https://issues.apache.org/jira/browse/HADOOP-16614
> We have run all YARN tests, and it turns out there are only a few problems, 
> for which we can provide possible solutions for discussion.
> We propose adding an Aarch64 CI for YARN to promote support for YARN on 
> Aarch64 platforms. We are willing to provide machines to the current CI system 
> and manpower to manage the CI and fix problems that occur.






[jira] [Commented] (YARN-9897) Add an Aarch64 CI for YARN

2019-10-25 Thread Zhenyu Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960233#comment-16960233
 ] 

Zhenyu Zheng commented on YARN-9897:


[~eyang] Thanks a lot for the help and suggestions. We have already contacted 
the infra team, and we are now waiting for some new Aarch64 servers to be put 
in place for donation. We hope this can be done next week, so let's wait and 
see. Thanks again for the help.

> Add an Aarch64 CI for YARN
> --
>
> Key: YARN-9897
> URL: https://issues.apache.org/jira/browse/YARN-9897
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: build, test
>Reporter: Zhenyu Zheng
>Priority: Major
> Attachments: hadoop_build.log
>
>
> YARN is the resource manager of Hadoop, and a large number of other software 
> projects also use YARN for resource management. The capability of running YARN 
> on platforms with different architectures, and of managing hardware resources 
> of different architectures, could be very important and useful.
> Aarch64 (ARM) is currently the dominant architecture in small devices such as 
> phones, IoT devices, security cameras, drones, etc. With increasing computing 
> capability and increasing connection speeds such as 5G networks, there could 
> be great possibilities and opportunities for world-changing innovations and 
> new markets if we can manage and make use of those devices as well.
> Currently, all YARN CIs are based on the x86 architecture. We have been 
> performing tests on Aarch64 and proposing possible solutions for the problems 
> we have met, for example:
> https://issues.apache.org/jira/browse/HADOOP-16614
> We have run all YARN tests, and it turns out there are only a few problems, 
> for which we can provide possible solutions for discussion.
> We propose adding an Aarch64 CI for YARN to promote support for YARN on 
> Aarch64 platforms. We are willing to provide machines to the current CI system 
> and manpower to manage the CI and fix problems that occur.






[jira] [Commented] (YARN-8982) [Router] Add locality policy

2019-10-25 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960230#comment-16960230
 ] 

Hadoop QA commented on YARN-8982:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
54s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 50s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 53s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
21s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 43s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-8982 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12947921/YARN-8982.v2.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux d2a403fdff57 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7be5508 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25046/testReport/ |
| Max. process+thread count | 308 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25046/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> [Router] Add locality policy 
> -
>
> 

[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime

2019-10-25 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960229#comment-16960229
 ] 

Hadoop QA commented on YARN-9561:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 17m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
73m 32s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
10s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 17m 10s{color} | 
{color:red} root generated 2 new + 24 unchanged - 2 fixed = 26 total (was 26) 
{color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 17m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  4s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}156m 33s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
46s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}300m  9s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks |
|   | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized |
|   | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.yarn.server.webproxy.TestWebAppProxyServlet |
|   | hadoop.yarn.server.webproxy.amfilter.TestAmFilter |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9561 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12984048/YARN-9561.007.patch |
| Optional Tests |  dupname  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux fe71e223b228 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / eef34f2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| cc | 
https://builds.apache.org/job/PreCommit-YARN-Build/25044/artifact/out/diff-compile-cc-root.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/25044/artifact/out/patch-unit-root.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25044/testReport/ |
| Max. process+thread count | 3058 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25044/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Add C changes for the new 

[jira] [Assigned] (YARN-8982) [Router] Add locality policy

2019-10-25 Thread Young Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Young Chen reassigned YARN-8982:


Assignee: Young Chen  (was: Giovanni Matteo Fumarola)

> [Router] Add locality policy 
> -
>
> Key: YARN-8982
> URL: https://issues.apache.org/jira/browse/YARN-8982
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Assignee: Young Chen
>Priority: Major
> Attachments: YARN-8982.v1.patch, YARN-8982.v2.patch
>
>
> This jira tracks the effort to add a new policy in the Router.
> This policy will allow the Router to pick the SubCluster based on the node 
> that the client requested.






[jira] [Commented] (YARN-9914) Use separate configs for free disk space checking for full and not-full disks

2019-10-25 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960173#comment-16960173
 ] 

Hadoop QA commented on YARN-9914:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 20m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
11s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
39s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
15s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
47s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
23s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
31s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
34s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
4s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
4s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 33s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 3 new + 232 unchanged - 0 fixed = 235 total (was 232) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
25s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
25s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
52s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
23s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 72m  4s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || 

[jira] [Commented] (YARN-9914) Use separate configs for free disk space checking for full and not-full disks

2019-10-25 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960057#comment-16960057
 ] 

Jim Brennan commented on YARN-9914:
---

Thanks [~ebadger]! I've attached a patch for branch-2.8.

> Use separate configs for free disk space checking for full and not-full disks
> -
>
> Key: YARN-9914
> URL: https://issues.apache.org/jira/browse/YARN-9914
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 2.10.0, 3.0.4, 3.3.0, 2.9.3, 3.2.2, 3.1.4, 2.11.0
>
> Attachments: YARN-9914-branch-2.8.001.patch, YARN-9914.001.patch, 
> YARN-9914.002.patch
>
>
> [YARN-3943] added separate configurations for the nodemanager health check's 
> disk-utilization full-disk check:
> {{max-disk-utilization-per-disk-percentage}} - threshold for marking a good 
> disk full
> {{disk-utilization-watermark-low-per-disk-percentage}} - threshold for 
> marking a full disk as not full.
> On our clusters, we do not use these configs. We instead use 
> {{min-free-space-per-disk-mb}} so we can specify the limit in MB instead of as 
> a percentage of utilization. We have observed the same oscillation behavior as 
> described in [YARN-3943] with this parameter. I would like to add an optional 
> config to specify a separate threshold for marking a full disk as not full:
> {{min-free-space-per-disk-mb}} - threshold at which a good disk is marked full
> {{disk-free-space-per-disk-high-watermark-mb}} - threshold at which a full 
> disk is marked good.
> So for example, we could set {{min-free-space-per-disk-mb = 5GB}}, which 
> would cause a disk to be marked full when free space goes below 5GB, and 
> {{disk-free-space-per-disk-high-watermark-mb = 10GB}} to keep the disk in the 
> full state until free space goes above 10GB.
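For readers following the thread, here is a minimal Java sketch of the
hysteresis the two thresholds are meant to provide, using the 5GB/10GB example
from the description. The class and field names are illustrative only and do
not match the actual DirectoryCollection code.

{code:java}
// Illustrative sketch of the proposed two-threshold (hysteresis) behavior.
// Names are made up for this example; they do not match DirectoryCollection.
final class DiskFullnessTracker {
  private final long minFreeSpaceMb;       // e.g. 5120: mark a good disk full below this
  private final long highWatermarkFreeMb;  // e.g. 10240: mark a full disk good above this
  private boolean markedFull;

  DiskFullnessTracker(long minFreeSpaceMb, long highWatermarkFreeMb) {
    this.minFreeSpaceMb = minFreeSpaceMb;
    this.highWatermarkFreeMb = highWatermarkFreeMb;
  }

  /** Returns whether the disk should currently be treated as full. */
  boolean isFull(long freeSpaceMb) {
    if (!markedFull && freeSpaceMb < minFreeSpaceMb) {
      markedFull = true;                   // dropped below the lower threshold
    } else if (markedFull && freeSpaceMb > highWatermarkFreeMb) {
      markedFull = false;                  // must climb above the high watermark to recover
    }
    return markedFull;
  }
}
{code}

With a single threshold, free space oscillating around 5GB flips the disk state
back and forth; the gap between the two thresholds is what suppresses that
oscillation.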






[jira] [Updated] (YARN-9914) Use separate configs for free disk space checking for full and not-full disks

2019-10-25 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated YARN-9914:
--
Attachment: YARN-9914-branch-2.8.001.patch

> Use separate configs for free disk space checking for full and not-full disks
> -
>
> Key: YARN-9914
> URL: https://issues.apache.org/jira/browse/YARN-9914
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 2.10.0, 3.0.4, 3.3.0, 2.9.3, 3.2.2, 3.1.4, 2.11.0
>
> Attachments: YARN-9914-branch-2.8.001.patch, YARN-9914.001.patch, 
> YARN-9914.002.patch
>
>
> [YARN-3943] added separate configurations for the nodemanager health check's 
> disk-utilization full-disk check:
> {{max-disk-utilization-per-disk-percentage}} - threshold for marking a good 
> disk full
> {{disk-utilization-watermark-low-per-disk-percentage}} - threshold for 
> marking a full disk as not full.
> On our clusters, we do not use these configs. We instead use 
> {{min-free-space-per-disk-mb}} so we can specify the limit in MB instead of as 
> a percentage of utilization. We have observed the same oscillation behavior as 
> described in [YARN-3943] with this parameter. I would like to add an optional 
> config to specify a separate threshold for marking a full disk as not full:
> {{min-free-space-per-disk-mb}} - threshold at which a good disk is marked full
> {{disk-free-space-per-disk-high-watermark-mb}} - threshold at which a full 
> disk is marked good.
> So for example, we could set {{min-free-space-per-disk-mb = 5GB}}, which 
> would cause a disk to be marked full when free space goes below 5GB, and 
> {{disk-free-space-per-disk-high-watermark-mb = 10GB}} to keep the disk in the 
> full state until free space goes above 10GB.






[jira] [Commented] (YARN-9897) Add an Aarch64 CI for YARN

2019-10-25 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960051#comment-16960051
 ] 

Eric Yang commented on YARN-9897:
-

The Apache infrastructure team has accepted donations from Apple, Yahoo, and HP 
for the build machines.  Ideally, the machines need to be connected to 
[build.apache.org|https://build.apache.org].  You may need to contact the 
Apache infrastructure team to see how the ARM nodes can be donated to Apache.  
I do not know all the logistics.

The list of enhancements looks like good improvements to the Hadoop code base, 
and I look forward to discussing each issue separately.  The community can only 
work on these enhancements if we can reproduce the results.  I hope you get the 
nodes connected to build.apache.org soon.

> Add an Aarch64 CI for YARN
> --
>
> Key: YARN-9897
> URL: https://issues.apache.org/jira/browse/YARN-9897
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: build, test
>Reporter: Zhenyu Zheng
>Priority: Major
> Attachments: hadoop_build.log
>
>
> YARN is the resource manager of Hadoop, and a large number of other software 
> projects also use YARN for resource management. The capability of running YARN 
> on platforms with different architectures, and of managing hardware resources 
> of different architectures, could be very important and useful.
> Aarch64 (ARM) is currently the dominant architecture in small devices such as 
> phones, IoT devices, security cameras, drones, etc. With increasing computing 
> capability and increasing connection speeds such as 5G networks, there could 
> be great possibilities and opportunities for world-changing innovations and 
> new markets if we can manage and make use of those devices as well.
> Currently, all YARN CIs are based on the x86 architecture. We have been 
> performing tests on Aarch64 and proposing possible solutions for the problems 
> we have met, for example:
> https://issues.apache.org/jira/browse/HADOOP-16614
> We have run all YARN tests, and it turns out there are only a few problems, 
> for which we can provide possible solutions for discussion.
> We propose adding an Aarch64 CI for YARN to promote support for YARN on 
> Aarch64 platforms. We are willing to provide machines to the current CI system 
> and manpower to manage the CI and fix problems that occur.






[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime

2019-10-25 Thread Eric Badger (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960011#comment-16960011
 ] 

Eric Badger commented on YARN-9561:
---

Attaching patch 007 to fix the lack of test failures when container-executor 
cfg setup fails.

> Add C changes for the new RuncContainerRuntime
> --
>
> Key: YARN-9561
> URL: https://issues.apache.org/jira/browse/YARN-9561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9561.001.patch, YARN-9561.002.patch, 
> YARN-9561.003.patch, YARN-9561.004.patch, YARN-9561.005.patch, 
> YARN-9561.006.patch, YARN-9561.007.patch
>
>
> This JIRA will be used to add the C changes to the container-executor native 
> binary that are necessary for the new RuncContainerRuntime. There should be 
> no changes to existing code paths. 






[jira] [Updated] (YARN-9561) Add C changes for the new RuncContainerRuntime

2019-10-25 Thread Eric Badger (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-9561:
--
Attachment: YARN-9561.007.patch

> Add C changes for the new RuncContainerRuntime
> --
>
> Key: YARN-9561
> URL: https://issues.apache.org/jira/browse/YARN-9561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9561.001.patch, YARN-9561.002.patch, 
> YARN-9561.003.patch, YARN-9561.004.patch, YARN-9561.005.patch, 
> YARN-9561.006.patch, YARN-9561.007.patch
>
>
> This JIRA will be used to add the C changes to the container-executor native 
> binary that are necessary for the new RuncContainerRuntime. There should be 
> no changes to existing code paths. 






[jira] [Updated] (YARN-9914) Use separate configs for free disk space checking for full and not-full disks

2019-10-25 Thread Eric Badger (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-9914:
--
Fix Version/s: 2.11.0
   3.1.4
   3.2.2
   2.9.3
   3.3.0
   3.0.4
   2.10.0

Thanks for the patch, [~Jim_Brennan]! I cleaned up the small checkstyle issues 
and committed this to trunk, branch-3.2, branch-3.1, branch-3.0, branch-2, 
branch-2.10, and branch-2.9. There was a small conflict with branch-2.8. If 
you'd like it to go back that far, please put up a new patch for that branch. 
Otherwise, feel free to close as resolved.

> Use separate configs for free disk space checking for full and not-full disks
> -
>
> Key: YARN-9914
> URL: https://issues.apache.org/jira/browse/YARN-9914
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 2.10.0, 3.0.4, 3.3.0, 2.9.3, 3.2.2, 3.1.4, 2.11.0
>
> Attachments: YARN-9914.001.patch, YARN-9914.002.patch
>
>
> [YARN-3943] added separate configurations for the nodemanager health check's 
> disk-utilization full-disk check:
> {{max-disk-utilization-per-disk-percentage}} - threshold for marking a good 
> disk full
> {{disk-utilization-watermark-low-per-disk-percentage}} - threshold for 
> marking a full disk as not full.
> On our clusters, we do not use these configs. We instead use 
> {{min-free-space-per-disk-mb}} so we can specify the limit in MB instead of as 
> a percentage of utilization. We have observed the same oscillation behavior as 
> described in [YARN-3943] with this parameter. I would like to add an optional 
> config to specify a separate threshold for marking a full disk as not full:
> {{min-free-space-per-disk-mb}} - threshold at which a good disk is marked full
> {{disk-free-space-per-disk-high-watermark-mb}} - threshold at which a full 
> disk is marked good.
> So for example, we could set {{min-free-space-per-disk-mb = 5GB}}, which 
> would cause a disk to be marked full when free space goes below 5GB, and 
> {{disk-free-space-per-disk-high-watermark-mb = 10GB}} to keep the disk in the 
> full state until free space goes above 10GB.






[jira] [Updated] (YARN-9851) Make execution type check compatible

2019-10-25 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9851:
-
Summary: Make execution type check compatible  (was: Make execution type 
check compatiable)

> Make execution type check compatible
> 
>
> Key: YARN-9851
> URL: https://issues.apache.org/jira/browse/YARN-9851
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.1.2
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Attachments: YARN-9851-001.patch
>
>
> During upgrade from 2.6 to 3.1, we encountered a problem:
> {code:java}
> 2019-09-23,19:29:05,303 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Lost 
> container container_e35_1568719110875_6460_08_01, status: RUNNING, 
> execution type: null
> 2019-09-23,19:29:05,303 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Lost 
> container container_e35_1568886618758_11172_01_62, status: RUNNING, 
> execution type: null
> 2019-09-23,19:29:05,303 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Lost 
> container container_e35_1568886618758_11172_01_63, status: RUNNING, 
> execution type: null
> 2019-09-23,19:29:05,303 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Lost 
> container container_e35_1568886618758_11172_01_64, status: RUNNING, 
> execution type: null
> 2019-09-23,19:29:05,303 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Lost 
> container container_e35_1568886618758_30617_01_06, status: RUNNING, 
> execution type: null
> for (ContainerStatus remoteContainer : containerStatuses) {
>   if (remoteContainer.getState() == ContainerState.RUNNING
>   && remoteContainer.getExecutionType() == ExecutionType.GUARANTEED) {
> nodeContainers.add(remoteContainer.getContainerId());
>   } else {
> LOG.warn("Lost container " + remoteContainer.getContainerId()
> + ", status: " + remoteContainer.getState()
> + ", execution type: " + remoteContainer.getExecutionType());
>   }
> }​
> {code}
> The cause is that we have NMs on version 2.6, which do not report an 
> executionType in the container status.
> We should add a check here to make the upgrade process more transparent.
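A hedged sketch of the kind of compatibility check this suggests, reusing the
identifiers from the quoted loop: treat a missing execution type (reported by
old 2.6 NMs) as GUARANTEED instead of logging the container as lost. This is
illustrative only and is not necessarily what the attached patch does.

{code:java}
ExecutionType remoteType = remoteContainer.getExecutionType();
// A 2.6 NM never sets an execution type, so null here means "legacy NM",
// not a lost container; fall back to GUARANTEED during the rolling upgrade.
boolean effectivelyGuaranteed =
    remoteType == null || remoteType == ExecutionType.GUARANTEED;
if (remoteContainer.getState() == ContainerState.RUNNING
    && effectivelyGuaranteed) {
  nodeContainers.add(remoteContainer.getContainerId());
} else {
  LOG.warn("Lost container " + remoteContainer.getContainerId()
      + ", status: " + remoteContainer.getState()
      + ", execution type: " + remoteType);
}
{code}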






[jira] [Commented] (YARN-9914) Use separate configs for free disk space checking for full and not-full disks

2019-10-25 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959941#comment-16959941
 ] 

Hudson commented on YARN-9914:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17574 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17574/])
YARN-9914. Use separate configs for free disk space checking for full (ebadger: 
rev eef34f2d87a75e16b2cca870d99a5e1e28c31d9b)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DirectoryCollection.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDirectoryCollection.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml


> Use separate configs for free disk space checking for full and not-full disks
> -
>
> Key: YARN-9914
> URL: https://issues.apache.org/jira/browse/YARN-9914
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: YARN-9914.001.patch, YARN-9914.002.patch
>
>
> [YARN-3943] added separate configurations for the nodemanager health check's 
> disk-utilization full-disk check:
> {{max-disk-utilization-per-disk-percentage}} - threshold for marking a good 
> disk full
> {{disk-utilization-watermark-low-per-disk-percentage}} - threshold for 
> marking a full disk as not full.
> On our clusters, we do not use these configs. We instead use 
> {{min-free-space-per-disk-mb}} so we can specify the limit in MB instead of as 
> a percentage of utilization. We have observed the same oscillation behavior as 
> described in [YARN-3943] with this parameter. I would like to add an optional 
> config to specify a separate threshold for marking a full disk as not full:
> {{min-free-space-per-disk-mb}} - threshold at which a good disk is marked full
> {{disk-free-space-per-disk-high-watermark-mb}} - threshold at which a full 
> disk is marked good.
> So for example, we could set {{min-free-space-per-disk-mb = 5GB}}, which 
> would cause a disk to be marked full when free space goes below 5GB, and 
> {{disk-free-space-per-disk-high-watermark-mb = 10GB}} to keep the disk in the 
> full state until free space goes above 10GB.






[jira] [Commented] (YARN-8823) Monitor the healthy state of GPU

2019-10-25 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959769#comment-16959769
 ] 

Adam Antal commented on YARN-8823:
--

Hi [~tangzhankun],
Is there any update on this?

> Monitor the healthy state of GPU
> 
>
> Key: YARN-8823
> URL: https://issues.apache.org/jira/browse/YARN-8823
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
>
> GPU resources are discovered when the NM bootstraps, but they are not updated 
> through later heartbeats with the RM. There should be a monitoring mechanism 
> that checks GPU health status from time to time, along with the corresponding 
> handling.
> YARN-8851 will also handle device monitoring; there could be some common 
> parts between the two.






[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler

2019-10-25 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959719#comment-16959719
 ] 

Peter Bacsko commented on YARN-9930:


To me this looks like a duplicate. [~cane] please check and close this if it's 
indeed a dup.

> Support max running app logic for CapacityScheduler
> ---
>
> Key: YARN-9930
> URL: https://issues.apache.org/jira/browse/YARN-9930
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 3.1.0, 3.1.1
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
>
> In FairScheduler, there is a limit on the maximum number of running 
> applications, which leaves excess applications pending.
> But CapacityScheduler has no such max-running-apps feature; it only has a 
> max-applications limit, and jobs beyond it are rejected directly on the client.
> In this jira I want to implement the same semantics for CapacityScheduler.






[jira] [Commented] (YARN-9865) Capacity scheduler: add support for combined %user + %secondary_group mapping

2019-10-25 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959662#comment-16959662
 ] 

Peter Bacsko commented on YARN-9865:


[~maniraj...@gmail.com] I agree, this is negligible. Let's skip this checkstyle 
warning.

After a deeper inspection, it looks like you expanded the testcase 
{{testNestedUserQueueWithGroupAsDynamicParentQueue()}}. Problem is, this test 
is already too long and now it has become harder to read. We have two scenarios 
in a single test.

Could you please create a new testcase and name it appropriately? If necessary, 
you can rename the existing one. Ideas: 
{{testNestedUserQueueWithPrimaryGroupAsDynamicParentQueue}} and 
{{testNestedUserQueueWithSecondaryGroupAsDynamicParentQueue}}. Just checked, 
these won't exceed the 80-char limit. 

And again, use shorter assertion messages:
{noformat}
assertEquals("Expected Queue is ", "a", ctx.getQueue());
assertEquals("Expected Secondary Group is ", "asubgroup1",
...
assertEquals("Expected Queue is ", "a", ctx1.getQueue());
assertEquals("Expected Primary Group is ", "agroup", 
ctx1.getParentQueue());
{noformat}

Just "Queue", "Primary group" and "Secondary group".

> Capacity scheduler: add support for combined %user + %secondary_group mapping
> -
>
> Key: YARN-9865
> URL: https://issues.apache.org/jira/browse/YARN-9865
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9865.001.patch, YARN-9865.002.patch
>
>
> Similiar to YARN-9841, but for secondary group.






[jira] [Updated] (YARN-9935) SSLHandshakeException thrown when HTTPS is enabled in AM web server in one certain condition

2019-10-25 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-9935:
---
Description: 
【Precondition】:
1. Install the cluster
2. *{color:#4C9AFF}WebAppProxyServer service installed in 1 VM and RMs 
installed in 2 VMs{color}*
3. Enable all the required HTTPS configuration:
yarn.resourcemanager.application-https.policy
STRICT

yarn.app.mapreduce.am.webapp.https.enabled
true

yarn.app.mapreduce.am.webapp.https.client.auth
true

4. RM HA enabled
5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}*
6. Cluster should be up and running

【Test step】:
1. Submit an application
2. Open the Application Master link for the application ID from the RM UI

【Expected Output】:
No error should be thrown and the job should be successful.

【Actual Output】:
SSLHandshakeException is thrown, although the job is successful.
"javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: PKIX path building failed: 
sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
valid certification path to requested target"


  was:
【Precondition】:
1. Install the cluster
2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs
3. Enable all the required HTTPS configuration:
yarn.resourcemanager.application-https.policy
STRICT

yarn.app.mapreduce.am.webapp.https.enabled
true

yarn.app.mapreduce.am.webapp.https.client.auth
true

4. RM HA enabled
5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}*
6. Cluster should be up and running

【Test step】:
1. Submit an application
2. Open the Application Master link for the application ID from the RM UI

【Expected Output】:
No error should be thrown and the job should be successful.

【Actual Output】:
SSLHandshakeException is thrown, although the job is successful.
"javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: PKIX path building failed: 
sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
valid certification path to requested target"



> SSLHandshakeException thrown when HTTPS is enabled in AM web server in one 
> certain condition
> 
>
> Key: YARN-9935
> URL: https://issues.apache.org/jira/browse/YARN-9935
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: amrmproxy
>Reporter: Sushanta Sen
>Priority: Major
>
> 【Precondition】:
> 1. Install the cluster
> 2. *{color:#4C9AFF}WebAppProxyServer service installed in 1 VM and RMs 
> installed in 2 VMs{color}*
> 3. Enable all the required HTTPS configuration:
> yarn.resourcemanager.application-https.policy
> STRICT
> yarn.app.mapreduce.am.webapp.https.enabled
> true
> yarn.app.mapreduce.am.webapp.https.client.auth
> true
> 4. RM HA enabled
> 5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}*
> 6. Cluster should be up and running
> 【Test step】:
> 1. Submit an application
> 2. Open the Application Master link for the application ID from the RM UI
> 【Expected Output】:
> No error should be thrown and the job should be successful.
> 【Actual Output】:
> SSLHandshakeException is thrown, although the job is successful.
> "javax.net.ssl.SSLHandshakeException: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
> valid certification path to requested target"






[jira] [Updated] (YARN-9935) SSLHandshakeException thrown when HTTPS is enabled in AM web server in one certain condition

2019-10-25 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-9935:
---
Description: 
【Precondition】:
1. Install the cluster
2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs
3. Enable all the required HTTPS configuration:
yarn.resourcemanager.application-https.policy
STRICT

yarn.app.mapreduce.am.webapp.https.enabled
true

yarn.app.mapreduce.am.webapp.https.client.auth
true

4. RM HA enabled
5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}*
6. Cluster should be up and running

【Test step】:
1. Submit an application
2. Open the Application Master link for the application ID from the RM UI

【Expected Output】:
No error should be thrown and the job should be successful.

【Actual Output】:
SSLHandshakeException is thrown, although the job is successful.
"javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: PKIX path building failed: 
sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
valid certification path to requested target"


  was:
【Precondition】:
1. Install the cluster
2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs
3. Enable all the required HTTPS configuration:
yarn.resourcemanager.application-https.policy
STRICT

yarn.app.mapreduce.am.webapp.https.enabled
true

yarn.app.mapreduce.am.webapp.https.client.auth
true

4. RM HA enabled
5. Active RM is running in VM2, standby in VM1
6. Cluster should be up and running

【Test step】:
1. Submit an application
2. Open the Application Master link for the application ID from the RM UI

【Expected Output】:
No error should be thrown and the job should be successful.

【Actual Output】:
SSLHandshakeException is thrown, although the job is successful.
"javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: PKIX path building failed: 
sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
valid certification path to requested target"



> SSLHandshakeException thrown when HTTPS is enabled in AM web server in one 
> certain condition
> 
>
> Key: YARN-9935
> URL: https://issues.apache.org/jira/browse/YARN-9935
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: amrmproxy
>Reporter: Sushanta Sen
>Priority: Major
>
> 【Precondition】:
> 1. Install the cluster
> 2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs
> 3. Enable all the required HTTPS configuration:
> yarn.resourcemanager.application-https.policy
> STRICT
> yarn.app.mapreduce.am.webapp.https.enabled
> true
> yarn.app.mapreduce.am.webapp.https.client.auth
> true
> 4. RM HA enabled
> 5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}*
> 6. Cluster should be up and running
> 【Test step】:
> 1. Submit an application
> 2. Open the Application Master link for the application ID from the RM UI
> 【Expected Output】:
> No error should be thrown and the job should be successful.
> 【Actual Output】:
> SSLHandshakeException is thrown, although the job is successful.
> "javax.net.ssl.SSLHandshakeException: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
> valid certification path to requested target"






[jira] [Created] (YARN-9935) SSLHandshakeException thrown when HTTPS is enabled in AM web server in one certain condition

2019-10-25 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-9935:
--

 Summary: SSLHandshakeException thrown when HTTPS is enabled in AM 
web server in one certain condition
 Key: YARN-9935
 URL: https://issues.apache.org/jira/browse/YARN-9935
 Project: Hadoop YARN
  Issue Type: Bug
  Components: amrmproxy
Reporter: Sushanta Sen


【Precondition】:
1. Install the cluster
2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs
3. Enable all the required HTTPS configuration:
yarn.resourcemanager.application-https.policy
STRICT

yarn.app.mapreduce.am.webapp.https.enabled
true

yarn.app.mapreduce.am.webapp.https.client.auth
true

4. RM HA enabled
5. Active RM is running in VM2, standby in VM1
6. Cluster should be up and running

【Test step】:
1. Submit an application
2. Open the Application Master link for the application ID from the RM UI

【Expected Output】:
No error should be thrown and the job should be successful.

【Actual Output】:
SSLHandshakeException is thrown, although the job is successful.
"javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: PKIX path building failed: 
sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
valid certification path to requested target"







[jira] [Commented] (YARN-9743) [JDK11] TestTimelineWebServices.testContextFactory fails

2019-10-25 Thread Kinga Marton (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959643#comment-16959643
 ] 

Kinga Marton commented on YARN-9743:


I think that the test failure is related to the fix, so I will have to fix it.

The install error is caused by conflicting versions of 
{{javax.xml.bind:jaxb-api}}. 

> [JDK11] TestTimelineWebServices.testContextFactory fails
> 
>
> Key: YARN-9743
> URL: https://issues.apache.org/jira/browse/YARN-9743
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineservice
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Assignee: Kinga Marton
>Priority: Major
> Attachments: YARN-9743.001.patch
>
>
> Tested on OpenJDK 11.0.2 on a Mac.
> Stack trace:
> {noformat}
> [ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 
> 36.016 s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices
> [ERROR] 
> testContextFactory(org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices)
>   Time elapsed: 1.031 s  <<< ERROR!
> java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
>   at 
> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
>   at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
>   at java.base/java.lang.Class.forName0(Native Method)
>   at java.base/java.lang.Class.forName(Class.java:315)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.newContext(ContextFactory.java:85)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.createContext(ContextFactory.java:112)
>   at 
> org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices.testContextFactory(TestTimelineWebServices.java:1039)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {noformat}






[jira] [Commented] (YARN-9897) Add an Aarch64 CI for YARN

2019-10-25 Thread Zhenyu Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959555#comment-16959555
 ] 

Zhenyu Zheng commented on YARN-9897:


[~eyang] Thanks for the support. Do you have any suggestions about how we 
should try to promote this idea?

We have also added a job on openlabtesting.org so that people can see what we 
are trying to do.

The job is defined using Ansible playbooks:
https://github.com/theopenlab/openlab-zuul-jobs/blob/master/playbooks/hadoop-yarn-unit-test-arm64/run.yaml

As you can see from the script, we have made a few workarounds:
1. The protobuf version in use does not support aarch64, but a later version 
does. We have proposed upgrading it and that work is ongoing:
https://issues.apache.org/jira/browse/HADOOP-13363
In our script we simply cherry-picked the protobuf patch to make it work and 
packaged a local build; this will no longer be needed once the upgrade is done.
2. PhantomJS does not support aarch64. We contacted the author and it seems the 
project is no longer maintained, so we also downloaded the source code and 
built a local package; this workaround could also be removed if we do something 
similar to what was done for leveldbjni.
3. Netty does not yet support aarch64 and someone is working on it: 
https://github.com/netty/netty/issues/8279. So we also downloaded the source 
code and compiled it locally. To remove this workaround, we could first upload 
the arm64 package to the openlabtesting Maven repo and then switch back to the 
official one once aarch64 is supported.
4. protoc-gen-grpc-java lacks aarch64 support. We have not contacted the gRPC 
team yet and are using aajisaka's package. We could also remove this workaround 
by uploading the package to the openlabtesting Maven repo, or by contacting the 
gRPC team to see what we can do.

Here is the panel for the job:
http://status.openlabtesting.org/job/hadoop-yarn-unit-test-arm64
It is a periodic job that runs at 10:00 UTC every day, so you can check the 
logs later by clicking ``build-history`` and the ``result`` section for each 
build.

One more thing I want to mention: openlabtesting.org is just the platform we 
are using for testing now. We are willing to connect it to the current CI 
system, but that is not mandatory; we can also provide servers directly to the 
current CI system if people think that is better.

> Add an Aarch64 CI for YARN
> --
>
> Key: YARN-9897
> URL: https://issues.apache.org/jira/browse/YARN-9897
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: build, test
>Reporter: Zhenyu Zheng
>Priority: Major
> Attachments: hadoop_build.log
>
>
> YARN is the resource manager of Hadoop, and a large number of other software 
> projects also use YARN for resource management. The capability of running YARN 
> on platforms with different architectures, and of managing hardware resources 
> of different architectures, could be very important and useful.
> Aarch64 (ARM) is currently the dominant architecture in small devices such as 
> phones, IoT devices, security cameras, drones, etc. With increasing computing 
> capability and increasing connection speeds such as 5G networks, there could 
> be great possibilities and opportunities for world-changing innovations and 
> new markets if we can manage and make use of those devices as well.
> Currently, all YARN CIs are based on the x86 architecture. We have been 
> performing tests on Aarch64 and proposing possible solutions for the problems 
> we have met, for example:
> https://issues.apache.org/jira/browse/HADOOP-16614
> We have run all YARN tests, and it turns out there are only a few problems, 
> for which we can provide possible solutions for discussion.
> We propose adding an Aarch64 CI for YARN to promote support for YARN on 
> Aarch64 platforms. We are willing to provide machines to the current CI system 
> and manpower to manage the CI and fix problems that occur.






[jira] [Updated] (YARN-9934) LogAggregationService should not submit aggregator when app dir creation fail

2019-10-25 Thread Zizon (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zizon updated YARN-9934:

Attachment: YARN-9934.patch.1

> LogAggregationService should not submit aggregator when app dir creation fail
> -
>
> Key: YARN-9934
> URL: https://issues.apache.org/jira/browse/YARN-9934
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation
>Reporter: Zizon
>Priority: Minor
> Attachments: YARN-9934.patch, YARN-9934.patch.1
>
>
> Before submitting a log aggregation runnable, LogAggregationService will try 
> to create the aggregated log dir.
> In some cases this may fail (e.g. the directory count exceeds the max limit).
>  
> When creation fails but the runnable is still submitted to 
> LogAggregationService, it may run forever if the application state transitions 
> misbehave (e.g. the application-complete event is not handled properly, 
> leaving appFinishing of AppLogAggregatorImpl permanently true).
>  
> In our production environment (version 2.7.3), this caused a huge number of 
> dangling aggregators (~400+ LogAggregationService threads alive on some nodes, 
> where the nodemanager is configured with only 50+ vCPUs).
>  
> The patch tries to throw the creation exception early, avoiding starting 
> unnecessary log polling. 
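A minimal sketch of the fail-fast behavior the description aims for, assuming
simplified names that do not match the real LogAggregationService or
AppLogAggregatorImpl API:

{code:java}
import java.io.IOException;
import java.util.concurrent.ExecutorService;

// Illustrative only: submit the aggregator runnable solely after the remote
// aggregated-log directory has been created successfully.
final class AppLogAggregationStarter {
  private final ExecutorService threadPool;

  AppLogAggregationStarter(ExecutorService threadPool) {
    this.threadPool = threadPool;
  }

  void start(String appId, Runnable aggregator) throws IOException {
    // If this throws (e.g. a directory-count limit is exceeded), the exception
    // propagates to the caller and no aggregator is submitted, so no thread is
    // left polling forever for an app-finished event.
    createAppLogDir(appId);
    threadPool.execute(aggregator);
  }

  private void createAppLogDir(String appId) throws IOException {
    // Placeholder for the real remote-directory creation logic.
  }
}
{code}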


