[jira] [Commented] (YARN-9897) Add an Aarch64 CI for YARN
[ https://issues.apache.org/jira/browse/YARN-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960235#comment-16960235 ] zhao bo commented on YARN-9897:

Hi [~eyang], thanks very much for raising this. We have already posted a Jira issue to the infra team, see https://issues.apache.org/jira/browse/INFRA-18761 . The comments there show we have a long-term plan to introduce more Aarch64 VMs into the Apache infra pool to support more ARM Jenkins work, but right now the whole process is held up on our VM provider's side; they have delayed several times :(. We are still pushing them to get all the resources ready as soon as possible, so I think the ARM resources will be coming soon. ;)

Hi [~christ], sorry for disturbing you again. :) Since the Hadoop-side conversation mentioned the work on integrating ARM VMs into Apache Jenkins, I thought it was worth bringing to your attention as well; it should help the follow-up work on specific Apache projects once the ARM VMs are successfully added to Jenkins. Thank you very much.

> Add an Aarch64 CI for YARN
> --
>
> Key: YARN-9897
> URL: https://issues.apache.org/jira/browse/YARN-9897
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: build, test
> Reporter: Zhenyu Zheng
> Priority: Major
> Attachments: hadoop_build.log
>
> As YARN is the resource manager of Hadoop, a large number of other software projects also use YARN for resource management. The capability of running YARN on platforms with different architectures, and of managing hardware resources of different architectures, could be very important and useful. Aarch64 (ARM) is currently the dominant architecture in small devices such as phones, IoT devices, security cameras, and drones. With increasing computing capability and faster connections such as 5G networks, there could be great possibilities and opportunities for world-changing innovations and new markets if we can manage and make use of those devices as well.
> Currently, all YARN CIs are based on the x86 architecture. We have been performing tests on Aarch64 and proposing possible solutions for problems we have met, for example: https://issues.apache.org/jira/browse/HADOOP-16614 . We have run all YARN tests, and it turns out there are only a few problems; we can provide possible solutions for discussion.
> We propose adding an Aarch64 CI for YARN to promote support for YARN on Aarch64 platforms. We are willing to provide machines for the current CI system and manpower to manage the CI and fix problems that occur.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9897) Add an Aarch64 CI for YARN
[ https://issues.apache.org/jira/browse/YARN-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960233#comment-16960233 ] Zhenyu Zheng commented on YARN-9897:

[~eyang] Thanks a lot for the help and suggestions. We have already contacted the infra team, and we are now waiting for some new Aarch64 servers to be in place for donation. We hope it can be done next week, so let's wait and see. Thanks again for the help.

> Add an Aarch64 CI for YARN
> --
[jira] [Commented] (YARN-8982) [Router] Add locality policy
[ https://issues.apache.org/jira/browse/YARN-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960230#comment-16960230 ] Hadoop QA commented on YARN-8982:

(/) *+1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 54s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
|| trunk Compile Tests ||
| +1 | mvninstall | 22m 39s | trunk passed |
| +1 | compile | 0m 37s | trunk passed |
| +1 | checkstyle | 0m 23s | trunk passed |
| +1 | mvnsite | 0m 33s | trunk passed |
| +1 | shadedclient | 13m 50s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 4s | trunk passed |
| +1 | javadoc | 0m 27s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 0m 30s | the patch passed |
| +1 | compile | 0m 27s | the patch passed |
| +1 | javac | 0m 27s | the patch passed |
| +1 | checkstyle | 0m 15s | the patch passed |
| +1 | mvnsite | 0m 29s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 53s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 4s | the patch passed |
| +1 | javadoc | 0m 23s | the patch passed |
|| Other Tests ||
| +1 | unit | 2m 21s | hadoop-yarn-server-common in the patch passed. |
| +1 | asflicense | 0m 27s | The patch does not generate ASF License warnings. |
| | | 60m 43s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-8982 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12947921/YARN-8982.v2.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux d2a403fdff57 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7be5508 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25046/testReport/ |
| Max. process+thread count | 308 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25046/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> [Router] Add locality policy
> -
[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960229#comment-16960229 ] Hadoop QA commented on YARN-9561:

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 40s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| trunk Compile Tests ||
| 0 | mvndep | 1m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 19m 52s | trunk passed |
| +1 | compile | 22m 15s | trunk passed |
| +1 | mvnsite | 17m 17s | trunk passed |
| +1 | shadedclient | 73m 32s | branch has no errors when building and testing our client artifacts. |
|| Patch Compile Tests ||
| 0 | mvndep | 0m 21s | Maven dependency ordering for patch |
| +1 | mvninstall | 19m 57s | the patch passed |
| +1 | compile | 17m 10s | the patch passed |
| -1 | cc | 17m 10s | root generated 2 new + 24 unchanged - 2 fixed = 26 total (was 26) |
| +1 | javac | 17m 10s | the patch passed |
| +1 | mvnsite | 17m 5s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 4s | patch has no errors when building and testing our client artifacts. |
|| Other Tests ||
| -1 | unit | 156m 33s | root in the patch failed. |
| +1 | asflicense | 0m 46s | The patch does not generate ASF License warnings. |
| | | 300m 9s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks |
| | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized |
| | hadoop.hdfs.TestMultipleNNPortQOP |
| | hadoop.yarn.server.webproxy.TestWebAppProxyServlet |
| | hadoop.yarn.server.webproxy.amfilter.TestAmFilter |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9561 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984048/YARN-9561.007.patch |
| Optional Tests | dupname asflicense compile cc mvnsite javac unit |
| uname | Linux fe71e223b228 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / eef34f2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| cc | https://builds.apache.org/job/PreCommit-YARN-Build/25044/artifact/out/diff-compile-cc-root.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/25044/artifact/out/patch-unit-root.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25044/testReport/ |
| Max. process+thread count | 3058 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager . U: . |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25044/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Add C changes for the new
[jira] [Assigned] (YARN-8982) [Router] Add locality policy
[ https://issues.apache.org/jira/browse/YARN-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Young Chen reassigned YARN-8982:

Assignee: Young Chen (was: Giovanni Matteo Fumarola)

> [Router] Add locality policy
> -
>
> Key: YARN-8982
> URL: https://issues.apache.org/jira/browse/YARN-8982
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Giovanni Matteo Fumarola
> Assignee: Young Chen
> Priority: Major
> Attachments: YARN-8982.v1.patch, YARN-8982.v2.patch
>
> This jira tracks the effort to add a new policy in the Router.
> This policy will allow the Router to pick the SubCluster based on the node that the client requested.
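At its core, the locality policy described above is a lookup from the node the client requested to the sub-cluster that owns it. A minimal illustrative sketch, with hypothetical names (this is not the actual Federation policy API, and the real policy would resolve node membership dynamically rather than from a static map):

```java
import java.util.Map;

// Hypothetical sketch: route a request to the sub-cluster owning the
// requested node, falling back to a default when the node is unknown.
class LocalityRouterPolicy {
    private final Map<String, String> nodeToSubCluster; // node host -> sub-cluster id
    private final String defaultSubCluster;

    LocalityRouterPolicy(Map<String, String> nodeToSubCluster, String defaultSubCluster) {
        this.nodeToSubCluster = nodeToSubCluster;
        this.defaultSubCluster = defaultSubCluster;
    }

    /** Pick the sub-cluster for the node named in the client's resource request. */
    String route(String requestedNode) {
        return nodeToSubCluster.getOrDefault(requestedNode, defaultSubCluster);
    }
}
```

The fallback matters: a request naming a node the Router does not know about must still land somewhere, so the policy degrades to whatever the default routing would have chosen.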
[jira] [Commented] (YARN-9914) Use separate configs for free disk space checking for full and not-full disks
[ https://issues.apache.org/jira/browse/YARN-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960173#comment-16960173 ] Hadoop QA commented on YARN-9914:

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 20m 14s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| branch-2.8 Compile Tests ||
| 0 | mvndep | 0m 22s | Maven dependency ordering for branch |
| +1 | mvninstall | 9m 11s | branch-2.8 passed |
| +1 | compile | 2m 39s | branch-2.8 passed with JDK v1.7.0_95 |
| +1 | compile | 2m 15s | branch-2.8 passed with JDK v1.8.0_222 |
| +1 | checkstyle | 0m 40s | branch-2.8 passed |
| +1 | mvnsite | 1m 47s | branch-2.8 passed |
| +1 | findbugs | 3m 23s | branch-2.8 passed |
| +1 | javadoc | 1m 31s | branch-2.8 passed with JDK v1.7.0_95 |
| +1 | javadoc | 1m 16s | branch-2.8 passed with JDK v1.8.0_222 |
|| Patch Compile Tests ||
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 24s | the patch passed |
| +1 | compile | 2m 34s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 2m 34s | the patch passed |
| +1 | compile | 2m 4s | the patch passed with JDK v1.8.0_222 |
| +1 | javac | 2m 4s | the patch passed |
| -0 | checkstyle | 0m 33s | hadoop-yarn-project/hadoop-yarn: The patch generated 3 new + 232 unchanged - 0 fixed = 235 total (was 232) |
| +1 | mvnsite | 1m 24s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 3m 11s | the patch passed |
| +1 | javadoc | 1m 14s | the patch passed with JDK v1.7.0_95 |
| +1 | javadoc | 1m 3s | the patch passed with JDK v1.8.0_222 |
|| Other Tests ||
| +1 | unit | 0m 25s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 25s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 9m 52s | hadoop-yarn-server-nodemanager in the patch passed. |
| -1 | asflicense | 0m 23s | The patch generated 1 ASF License warnings. |
| | | 72m 4s | |

|| Subsystem ||
[jira] [Commented] (YARN-9914) Use separate configs for free disk space checking for full and not-full disks
[ https://issues.apache.org/jira/browse/YARN-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960057#comment-16960057 ] Jim Brennan commented on YARN-9914:

Thanks [~ebadger]! I've attached a patch for branch-2.8.

> Use separate configs for free disk space checking for full and not-full disks
> -
>
> Key: YARN-9914
> URL: https://issues.apache.org/jira/browse/YARN-9914
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: yarn
> Reporter: Jim Brennan
> Assignee: Jim Brennan
> Priority: Minor
> Fix For: 2.10.0, 3.0.4, 3.3.0, 2.9.3, 3.2.2, 3.1.4, 2.11.0
> Attachments: YARN-9914-branch-2.8.001.patch, YARN-9914.001.patch, YARN-9914.002.patch
>
> [YARN-3943] added separate configurations for the nodemanager health check disk utilization full disk check:
> {{max-disk-utilization-per-disk-percentage}} - threshold for marking a good disk full
> {{disk-utilization-watermark-low-per-disk-percentage}} - threshold for marking a full disk as not full.
> On our clusters, we do not use these configs. We instead use {{min-free-space-per-disk-mb}} so we can specify the limit in MB instead of percent of utilization. We have observed the same oscillation behavior as described in [YARN-3943] with this parameter. I would like to add an optional config to specify a separate threshold for marking a full disk as not full:
> {{min-free-space-per-disk-mb}} - threshold at which a good disk is marked full
> {{disk-free-space-per-disk-high-watermark-mb}} - threshold at which a full disk is marked good.
> So for example, we could set {{min-free-space-per-disk-mb = 5GB}}, which would cause a disk to be marked full when free space goes below 5GB, and {{disk-free-space-per-disk-high-watermark-mb = 10GB}} to keep the disk in the full state until free space goes above 10GB.
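The two-threshold scheme in the issue description is classic hysteresis: once a disk trips the low threshold it stays "full" until free space climbs past the higher watermark, so a disk hovering near a single limit no longer oscillates between states. A minimal sketch of the idea with hypothetical names (this is not the actual NodeManager disk-checker code):

```java
// Hypothetical sketch of the proposed hysteresis. A disk is marked full when
// free space drops below minFreeSpaceMb (cf. min-free-space-per-disk-mb) and
// only marked good again once free space exceeds highWatermarkMb
// (cf. disk-free-space-per-disk-high-watermark-mb).
class DiskFullnessTracker {
    private final long minFreeSpaceMb;   // low threshold: good -> full
    private final long highWatermarkMb;  // high threshold: full -> good
    private boolean full = false;

    DiskFullnessTracker(long minFreeSpaceMb, long highWatermarkMb) {
        this.minFreeSpaceMb = minFreeSpaceMb;
        // If the watermark is not configured above the low threshold,
        // fall back to the old single-threshold behavior.
        this.highWatermarkMb = Math.max(minFreeSpaceMb, highWatermarkMb);
    }

    /** Update state from the current free space; returns true if the disk is full. */
    boolean update(long freeSpaceMb) {
        if (!full && freeSpaceMb < minFreeSpaceMb) {
            full = true;   // dropped below the low threshold
        } else if (full && freeSpaceMb > highWatermarkMb) {
            full = false;  // recovered past the high watermark
        }
        return full;
    }
}
```

With the 5 GB / 10 GB values from the example, a disk sitting at 7 GB free simply keeps whatever state it was already in, which is exactly what eliminates the oscillation.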
[jira] [Updated] (YARN-9914) Use separate configs for free disk space checking for full and not-full disks
[ https://issues.apache.org/jira/browse/YARN-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Brennan updated YARN-9914:

Attachment: YARN-9914-branch-2.8.001.patch

> Use separate configs for free disk space checking for full and not-full disks
> -
[jira] [Commented] (YARN-9897) Add an Aarch64 CI for YARN
[ https://issues.apache.org/jira/browse/YARN-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960051#comment-16960051 ] Eric Yang commented on YARN-9897:

The Apache infrastructure team has accepted build-machine donations from Apple, Yahoo, and HP. Ideally, the machines need to be connected to [build.apache.org|https://build.apache.org]. You may need to contact the Apache infrastructure team to see how the ARM nodes can be donated to Apache; I do not know all the logistics. The list of enhancements looks like a good set of improvements to the Hadoop code base, and I look forward to discussing each issue separately. The community can only work on these enhancements if we can reproduce the results. Hope you get the nodes connected to build.apache.org soon.

> Add an Aarch64 CI for YARN
> --
[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960011#comment-16960011 ] Eric Badger commented on YARN-9561:

Attaching patch 007 to fix a lack of test failures on container executor cfg setup failure.

> Add C changes for the new RuncContainerRuntime
> --
>
> Key: YARN-9561
> URL: https://issues.apache.org/jira/browse/YARN-9561
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Badger
> Assignee: Eric Badger
> Priority: Major
> Attachments: YARN-9561.001.patch, YARN-9561.002.patch, YARN-9561.003.patch, YARN-9561.004.patch, YARN-9561.005.patch, YARN-9561.006.patch, YARN-9561.007.patch
>
> This JIRA will be used to add the C changes to the container-executor native binary that are necessary for the new RuncContainerRuntime. There should be no changes to existing code paths.
[jira] [Updated] (YARN-9561) Add C changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-9561:

Attachment: YARN-9561.007.patch

> Add C changes for the new RuncContainerRuntime
> --
[jira] [Updated] (YARN-9914) Use separate configs for free disk space checking for full and not-full disks
[ https://issues.apache.org/jira/browse/YARN-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-9914:

Fix Version/s: 2.11.0, 3.1.4, 3.2.2, 2.9.3, 3.3.0, 3.0.4, 2.10.0

Thanks for the patch, [~Jim_Brennan]! I cleaned up the small checkstyle issues and committed this to trunk, branch-3.2, branch-3.1, branch-3.0, branch-2, branch-2.10, and branch-2.9. There was a small conflict with branch-2.8. If you'd like it to go back that far, please put up a new patch for that branch. Otherwise, feel free to close as resolved.

> Use separate configs for free disk space checking for full and not-full disks
> -
[jira] [Updated] (YARN-9851) Make execution type check compatible
[ https://issues.apache.org/jira/browse/YARN-9851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-9851: - Summary: Make execution type check compatible (was: Make execution type check compatiable) > Make execution type check compatible > > > Key: YARN-9851 > URL: https://issues.apache.org/jira/browse/YARN-9851 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.1.2 >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > Attachments: YARN-9851-001.patch > > > During upgrade from 2.6 to 3.1, we encountered a problem: > {code:java} > 2019-09-23,19:29:05,303 WARN > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Lost > container container_e35_1568719110875_6460_08_01, status: RUNNING, > execution type: null > 2019-09-23,19:29:05,303 WARN > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Lost > container container_e35_1568886618758_11172_01_62, status: RUNNING, > execution type: null > 2019-09-23,19:29:05,303 WARN > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Lost > container container_e35_1568886618758_11172_01_63, status: RUNNING, > execution type: null > 2019-09-23,19:29:05,303 WARN > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Lost > container container_e35_1568886618758_11172_01_64, status: RUNNING, > execution type: null > 2019-09-23,19:29:05,303 WARN > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Lost > container container_e35_1568886618758_30617_01_06, status: RUNNING, > execution type: null > for (ContainerStatus remoteContainer : containerStatuses) { > if (remoteContainer.getState() == ContainerState.RUNNING > && remoteContainer.getExecutionType() == ExecutionType.GUARANTEED) { > nodeContainers.add(remoteContainer.getContainerId()); > } else { > LOG.warn("Lost container " + remoteContainer.getContainerId() > + ", status: " + remoteContainer.getState() > + ", execution type: " + 
remoteContainer.getExecutionType()); > } > } > {code} > The cause is that we have NMs running version 2.6, which do not report an > executionType in the container status. > We should make this check compatible so the upgrade process is more transparent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
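The loop quoted above drops any RUNNING container whose execution type is null, which is exactly what a 2.6 NodeManager reports. A compatible check could treat a missing execution type as GUARANTEED (the only type that existed before the field was added). The types below are simplified stand-ins for the YARN API, used only to illustrate the null-tolerant predicate:

```java
// Sketch of a rolling-upgrade-compatible check for YARN-9851: a 2.6 NM omits
// the ExecutionType field, so a null value is treated as GUARANTEED instead of
// the container being logged as "Lost". Simplified stand-in types, not the
// actual org.apache.hadoop.yarn.api classes.
public class ExecutionTypeCompat {
    enum ExecutionType { GUARANTEED, OPPORTUNISTIC }

    /** True if the container should be tracked as a live GUARANTEED container. */
    static boolean isGuaranteed(ExecutionType type) {
        // Old NodeManagers (2.6) do not report the field; default to GUARANTEED.
        return type == null || type == ExecutionType.GUARANTEED;
    }
}
```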
[jira] [Commented] (YARN-9914) Use separate configs for free disk space checking for full and not-full disks
[ https://issues.apache.org/jira/browse/YARN-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959941#comment-16959941 ] Hudson commented on YARN-9914: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17574 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17574/]) YARN-9914. Use separate configs for free disk space checking for full (ebadger: rev eef34f2d87a75e16b2cca870d99a5e1e28c31d9b) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DirectoryCollection.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDirectoryCollection.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml > Use separate configs for free disk space checking for full and not-full disks > - > > Key: YARN-9914 > URL: https://issues.apache.org/jira/browse/YARN-9914 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Minor > Attachments: YARN-9914.001.patch, YARN-9914.002.patch > > > [YARN-3943] added separate configurations for the nodemanager health check > disk utilization full disk check: > {{max-disk-utilization-per-disk-percentage}} - threshold for marking a good > disk full > {{disk-utilization-watermark-low-per-disk-percentage}} - threshold for > marking a full disk as not full. > On our clusters, we do not use these configs. 
We instead use > {{min-free-space-per-disk-mb}} so we can specify the limit in mb instead of > percent of utilization. We have observed the same oscillation behavior as > described in [YARN-3943] with this parameter. I would like to add an optional > config to specify a separate threshold for marking a full disk as not full: > {{min-free-space-per-disk-mb}} - threshold at which a good disk is marked full > {{disk-free-space-per-disk-high-watermark-mb}} - threshold at which a full > disk is marked good. > So for example, we could set {{min-free-space-per-disk-mb = 5GB}}, which > would cause a disk to be marked full when free space goes below 5GB, and > {{disk-free-space-per-disk-high-watermark-mb = 10GB}} to keep the disk in the > full state until free space goes above 10GB. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8823) Monitor the healthy state of GPU
[ https://issues.apache.org/jira/browse/YARN-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959769#comment-16959769 ] Adam Antal commented on YARN-8823: -- Hi [~tangzhankun], Is there any update on this? > Monitor the healthy state of GPU > > > Key: YARN-8823 > URL: https://issues.apache.org/jira/browse/YARN-8823 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhankun Tang >Assignee: Zhankun Tang >Priority: Major > > GPU resources are discovered when the NM bootstraps, but they are not updated through > later heartbeats with the RM. There should be a monitoring mechanism that checks GPU > health status from time to time, along with the corresponding handling. > YARN-8851 will also handle device monitoring, so there could be some > common parts between the two. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959719#comment-16959719 ] Peter Bacsko commented on YARN-9930: To me this looks like a duplicate. [~cane] please check and close this if it's indeed a dup. > Support max running app logic for CapacityScheduler > --- > > Key: YARN-9930 > URL: https://issues.apache.org/jira/browse/YARN-9930 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler, capacityscheduler >Affects Versions: 3.1.0, 3.1.1 >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > > FairScheduler has a max-running-apps limit that leaves excess > applications pending. > CapacityScheduler has no such feature; it only has a max-apps limit, > and jobs beyond that limit are rejected directly on the client. > In this jira I want to implement the same semantics for CapacityScheduler. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9865) Capacity scheduler: add support for combined %user + %secondary_group mapping
[ https://issues.apache.org/jira/browse/YARN-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959662#comment-16959662 ] Peter Bacsko commented on YARN-9865: [~maniraj...@gmail.com] I agree, this is negligible. Let's skip this checkstyle warning. After a deeper inspection, it looks like you expanded the testcase {{testNestedUserQueueWithGroupAsDynamicParentQueue()}}. Problem is, this test is already too long and now it has become harder to read. We have two scenarios in a single test. Could you please create a new testcase and name it appropriately? If necessary, you can rename the existing one. Ideas: {{testNestedUserQueueWithPrimaryGroupAsDynamicParentQueue}} and {{testNestedUserQueueWithSecondaryGroupAsDynamicParentQueue}}. Just checked, these won't exceed the 80-char limit. And again, use shorter assertion messages: {noformat} assertEquals("Expected Queue is ", "a", ctx.getQueue()); assertEquals("Expected Secondary Group is ", "asubgroup1", ... assertEquals("Expected Queue is ", "a", ctx1.getQueue()); assertEquals("Expected Primary Group is ", "agroup", ctx1.getParentQueue()); {noformat} Just "Queue", "Primary group" and "Secondary group". > Capacity scheduler: add support for combined %user + %secondary_group mapping > - > > Key: YARN-9865 > URL: https://issues.apache.org/jira/browse/YARN-9865 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-9865.001.patch, YARN-9865.002.patch > > > Similiar to YARN-9841, but for secondary group. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9935) SSLHandshakeException thrown when HTTPS is enabled in AM web server in one certain condition
[ https://issues.apache.org/jira/browse/YARN-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanta Sen updated YARN-9935: --- Description: 【Precondition】: 1. Install the cluster 2. *{color:#4C9AFF}WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs{color}* 3. Enable all the required HTTPS configuration yarn.resourcemanager.application-https.policy STRICT yarn.app.mapreduce.am.webapp.https.enabled true yarn.app.mapreduce.am.webapp.https.client.auth true 4. RM HA enabled 5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}* 6. Cluster should be up and running 【Test step】: 1. Submit an application 2. Open the Application Master link for the application ID on the RM UI 【Expected Output】: No error should be thrown and the job should be successful 【Actual Output】: SSLHandshakeException is thrown, although the job is successful. "javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target" was: 【Precondition】: 1. Install the cluster 2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs 3. Enable all the required HTTPS configuration yarn.resourcemanager.application-https.policy STRICT yarn.app.mapreduce.am.webapp.https.enabled true yarn.app.mapreduce.am.webapp.https.client.auth true 4. RM HA enabled 5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}* 6. Cluster should be up and running 【Test step】: 1. Submit an application 2. Open the Application Master link for the application ID on the RM UI 【Expected Output】: No error should be thrown and the job should be successful 【Actual Output】: SSLHandshakeException is thrown, although the job is successful. "javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target" > SSLHandshakeException thrown when HTTPS is enabled in AM web server in one > certain condition > > > Key: YARN-9935 > URL: https://issues.apache.org/jira/browse/YARN-9935 > Project: Hadoop YARN > Issue Type: Bug > Components: amrmproxy >Reporter: Sushanta Sen >Priority: Major > > 【Precondition】: > 1. Install the cluster > 2. *{color:#4C9AFF}WebAppProxyServer service installed in 1 VM and RMs > installed in 2 VMs{color}* > 3. Enable all the required HTTPS configuration > yarn.resourcemanager.application-https.policy > STRICT > yarn.app.mapreduce.am.webapp.https.enabled > true > yarn.app.mapreduce.am.webapp.https.client.auth > true > 4. RM HA enabled > 5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}* > 6. Cluster should be up and running > 【Test step】: > 1. Submit an application > 2. Open the Application Master link for the application ID on the RM UI > 【Expected Output】: > No error should be thrown and the job should be successful > 【Actual Output】: > SSLHandshakeException is thrown, although the job is successful. > "javax.net.ssl.SSLHandshakeException: > sun.security.validator.ValidatorException: PKIX path building failed: > sun.security.provider.certpath.SunCertPathBuilderException: unable to find > valid certification path to requested target" -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9935) SSLHandshakeException thrown when HTTPS is enabled in AM web server in one certain condition
[ https://issues.apache.org/jira/browse/YARN-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanta Sen updated YARN-9935: --- Description: 【Precondition】: 1. Install the cluster 2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs 3. Enable all the required HTTPS configuration yarn.resourcemanager.application-https.policy STRICT yarn.app.mapreduce.am.webapp.https.enabled true yarn.app.mapreduce.am.webapp.https.client.auth true 4. RM HA enabled 5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}* 6. Cluster should be up and running 【Test step】: 1. Submit an application 2. Open the Application Master link for the application ID on the RM UI 【Expected Output】: No error should be thrown and the job should be successful 【Actual Output】: SSLHandshakeException is thrown, although the job is successful. "javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target" was: 【Precondition】: 1. Install the cluster 2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs 3. Enable all the required HTTPS configuration yarn.resourcemanager.application-https.policy STRICT yarn.app.mapreduce.am.webapp.https.enabled true yarn.app.mapreduce.am.webapp.https.client.auth true 4. RM HA enabled 5. Active RM is running in VM2, standby in VM1 6. Cluster should be up and running 【Test step】: 1. Submit an application 2. Open the Application Master link for the application ID on the RM UI 【Expected Output】: No error should be thrown and the job should be successful 【Actual Output】: SSLHandshakeException is thrown, although the job is successful. "javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target" > SSLHandshakeException thrown when HTTPS is enabled in AM web server in one > certain condition > > > Key: YARN-9935 > URL: https://issues.apache.org/jira/browse/YARN-9935 > Project: Hadoop YARN > Issue Type: Bug > Components: amrmproxy >Reporter: Sushanta Sen >Priority: Major > > 【Precondition】: > 1. Install the cluster > 2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs > 3. Enable all the required HTTPS configuration > yarn.resourcemanager.application-https.policy > STRICT > yarn.app.mapreduce.am.webapp.https.enabled > true > yarn.app.mapreduce.am.webapp.https.client.auth > true > 4. RM HA enabled > 5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}* > 6. Cluster should be up and running > 【Test step】: > 1. Submit an application > 2. Open the Application Master link for the application ID on the RM UI > 【Expected Output】: > No error should be thrown and the job should be successful > 【Actual Output】: > SSLHandshakeException is thrown, although the job is successful. > "javax.net.ssl.SSLHandshakeException: > sun.security.validator.ValidatorException: PKIX path building failed: > sun.security.provider.certpath.SunCertPathBuilderException: unable to find > valid certification path to requested target" -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-9935) SSLHandshakeException thrown when HTTPS is enabled in AM web server in one certain condition
Sushanta Sen created YARN-9935: -- Summary: SSLHandshakeException thrown when HTTPS is enabled in AM web server in one certain condition Key: YARN-9935 URL: https://issues.apache.org/jira/browse/YARN-9935 Project: Hadoop YARN Issue Type: Bug Components: amrmproxy Reporter: Sushanta Sen 【Precondition】: 1. Install the cluster 2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs 3. Enable all the required HTTPS configuration yarn.resourcemanager.application-https.policy STRICT yarn.app.mapreduce.am.webapp.https.enabled true yarn.app.mapreduce.am.webapp.https.client.auth true 4. RM HA enabled 5. Active RM is running in VM2, standby in VM1 6. Cluster should be up and running 【Test step】: 1. Submit an application 2. Open the Application Master link for the application ID on the RM UI 【Expected Output】: No error should be thrown and the job should be successful 【Actual Output】: SSLHandshakeException is thrown, although the job is successful. "javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target" -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
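The "PKIX path building failed" message in the report means the HTTPS client does not trust the CA that signed the server certificate. As a generic diagnostic (not a fix in YARN itself), one can point an SSLContext at an explicit truststore; the path and password below are placeholders:

```java
import java.io.FileInputStream;
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Generic JSSE diagnostic for a "PKIX path building failed" error: build an
// SSLContext from an explicit truststore so the trusted CA set is known.
// The truststore path and password are placeholders, not YARN configuration.
public class TrustStoreClient {
    static SSLContext contextFrom(String trustStorePath, char[] password)
            throws Exception {
        KeyStore ts = KeyStore.getInstance(KeyStore.getDefaultType());
        try (FileInputStream in = new FileInputStream(trustStorePath)) {
            ts.load(in, password);       // fails fast if the file is missing/corrupt
        }
        TrustManagerFactory tmf = TrustManagerFactory.getInstance(
                TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(ts);                    // trust managers backed by this truststore
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null, tmf.getTrustManagers(), null);
        return ctx;
    }
}
```

If a connection built from such a context succeeds while the default one fails, the problem is the client's trusted CA set rather than the server.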
[jira] [Commented] (YARN-9743) [JDK11] TestTimelineWebServices.testContextFactory fails
[ https://issues.apache.org/jira/browse/YARN-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959643#comment-16959643 ] Kinga Marton commented on YARN-9743: I think that the test failure is related to the fix, so I will have to fix it. The install error is because of different versions of {{javax.xml.bind:jaxb-api}}. > [JDK11] TestTimelineWebServices.testContextFactory fails > > > Key: YARN-9743 > URL: https://issues.apache.org/jira/browse/YARN-9743 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineservice >Affects Versions: 3.2.0 >Reporter: Adam Antal >Assignee: Kinga Marton >Priority: Major > Attachments: YARN-9743.001.patch > > > Tested on OpenJDK 11.0.2 on a Mac. > Stack trace: > {noformat} > [ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: > 36.016 s <<< FAILURE! - in > org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices > [ERROR] > testContextFactory(org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices) > Time elapsed: 1.031 s <<< ERROR! 
> java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory > at > java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583) > at > java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) > at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) > at java.base/java.lang.Class.forName0(Native Method) > at java.base/java.lang.Class.forName(Class.java:315) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.newContext(ContextFactory.java:85) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.createContext(ContextFactory.java:112) > at > org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices.testContextFactory(TestTimelineWebServices.java:1039) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
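The ClassNotFoundException above occurs because the JDK-internal JAXB class {{com.sun.xml.internal.bind.v2.ContextFactory}} was removed along with the bundled JAXB implementation in JDK 11, so a reflective lookup must fall back to the external jaxb-impl class name. A sketch of that lookup (the two class names are real; the helper itself is hypothetical, not the actual ContextFactory fix):

```java
// Sketch of a JDK 8/11-compatible reflective lookup for the JAXB
// ContextFactory: try the external jaxb-impl name first, then the
// JDK-internal name that only exists on JDK 8. Helper is illustrative.
public class JaxbContextFactoryLookup {
    static Class<?> findContextFactory() throws ClassNotFoundException {
        String[] candidates = {
            "com.sun.xml.bind.v2.ContextFactory",          // external jaxb-impl (works on JDK 11)
            "com.sun.xml.internal.bind.v2.ContextFactory"  // bundled in JDK 8, removed in JDK 11
        };
        ClassNotFoundException last = null;
        for (String name : candidates) {
            try {
                return Class.forName(name);
            } catch (ClassNotFoundException e) {
                last = e;  // remember the failure, try the next candidate
            }
        }
        throw last;  // neither implementation is on the classpath
    }
}
```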
[jira] [Commented] (YARN-9897) Add an Aarch64 CI for YARN
[ https://issues.apache.org/jira/browse/YARN-9897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959555#comment-16959555 ] Zhenyu Zheng commented on YARN-9897: [~eyang] Thanks for the support. Do you have any suggestions on how we should promote this idea? We have added a job on openlabtesting.org so that people can see what we are trying to do. The jobs are defined using Ansible playbooks: https://github.com/theopenlab/openlab-zuul-jobs/blob/master/playbooks/hadoop-yarn-unit-test-arm64/run.yaml As you can see from the script, we have applied a few workarounds: 1. protobuf does not support aarch64, but a later version does; we have proposed to upgrade it, and that work is ongoing: https://issues.apache.org/jira/browse/HADOOP-13363 In our script, we cherry-picked the patch made in protobuf to make it work and packed a local package; this won't be needed anymore once the upgrade is done. 2. phantomjs does not support aarch64; we contacted the author and it seems the project is no longer maintained, so we also downloaded the source code and built a local package. This workaround could also be removed if we do similar things with leveldbjni. 3. netty does not yet support aarch64 and someone is working on it: https://github.com/netty/netty/issues/8279, so we also downloaded the source code and compiled it locally. To remove this workaround, we could first upload the arm64 package to the openlabtesting maven repo and then switch back to the official one once aarch64 is supported. 4. protoc-gen-grpc-java lacks aarch64 support; we have not contacted the grpc team yet, and we are using aajisaka's package. We can also remove this workaround by uploading it to the openlabtesting maven repo, or by contacting the grpc team to see what we can do.
Here is the job panel: http://status.openlabtesting.org/job/hadoop-yarn-unit-test-arm64 It is a periodic job that runs at 10.00 UTC every day, so you can check the logs later by clicking ``build-history`` and the ``result`` section for the build. One more thing I want to mention: openlabtesting.org is just the platform we are using for testing now. We are willing to connect it to the current CI system, but that is not mandatory; we can also provide servers directly to the current CI system if people think that is better. > Add an Aarch64 CI for YARN > -- > > Key: YARN-9897 > URL: https://issues.apache.org/jira/browse/YARN-9897 > Project: Hadoop YARN > Issue Type: Improvement > Components: build, test >Reporter: Zhenyu Zheng >Priority: Major > Attachments: hadoop_build.log > > > As YARN is the resource manager of Hadoop, and a large number of other > software projects also use YARN for resource management, the capability of > running YARN on platforms with different architectures and managing hardware > resources with different architectures could be very important and useful. > Aarch64 (ARM) is currently the dominant architecture in small > devices like phones, IoT devices, security cameras, drones, etc. With > increasing computing capability and increasing connection speeds like 5G > networks, there could be great possibilities and opportunities for world-changing > innovations and new markets if we can manage and make use of those devices as > well. > Currently, all YARN CIs are based on the x86 architecture. We have been > performing tests on Aarch64 and proposing possible solutions for problems we > have met, like: > https://issues.apache.org/jira/browse/HADOOP-16614 > We have run all YARN tests, and it turns out there are only a few problems; > we can provide possible solutions for discussion. > We propose to add an Aarch64 CI for YARN to promote support for > YARN on Aarch64 platforms.
We are willing to provide machines to the current > CI system and manpower to manage the CI and fix problems that occur. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9934) LogAggregationService should not submit aggregator when app dir creation fail
[ https://issues.apache.org/jira/browse/YARN-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zizon updated YARN-9934: Attachment: YARN-9934.patch.1 > LogAggregationService should not submit aggregator when app dir creation fail > - > > Key: YARN-9934 > URL: https://issues.apache.org/jira/browse/YARN-9934 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Zizon >Priority: Minor > Attachments: YARN-9934.patch, YARN-9934.patch.1 > > > Before submitting a log aggregation runnable, LogAggregationService will try > to create the aggregated log dir. > In some cases this may fail (e.g. the directory count exceeds the max limit). > > When creation fails but the runnable is still submitted to LogAggregationService, it may > run forever if the app state machine misbehaves (e.g. the application > complete event is not handled properly, leaving appFinishing of AppLogAggregatorImpl > always true). > > In our production (version 2.7.3), this caused a huge number of dangling > aggregators (~400+ LogAggregationService threads alive on some nodes, where the > nodemanager was configured with only 50+ vCPUs). > > The patch throws the creation exception early, avoiding starting > unnecessary log polling. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
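The fail-fast ordering the patch describes can be sketched as follows (method and field names are illustrative, not the actual LogAggregationService code): create the aggregated-log directory first, and only hand the polling runnable to the executor if creation succeeded, so a failed directory never leaves a dangling aggregator thread behind.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executor;

// Illustrative sketch of YARN-9934's fix: the app log directory is created
// BEFORE the aggregator runnable is submitted, so a creation failure
// surfaces as an exception to the caller instead of a stuck thread.
public class EagerDirLogAggregation {
    static void startAggregator(Executor pool, Path appLogDir,
                                Runnable aggregator) throws IOException {
        // Throws IOException here, before any aggregator thread is started.
        Files.createDirectories(appLogDir);
        pool.execute(aggregator);
    }
}
```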