[jira] [Commented] (YARN-5048) DelegationTokenRenewer#skipTokenRenewal may throw NPE
[ https://issues.apache.org/jira/browse/YARN-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275120#comment-15275120 ]

Hudson commented on YARN-5048:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #9734 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9734/])
YARN-5048. DelegationTokenRenewer#skipTokenRenewal may throw NPE (Jian (yzhang: rev 47c41e7ac7e6b905a58550f8899f629c1cf8b138)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java

> DelegationTokenRenewer#skipTokenRenewal may throw NPE
> -----------------------------------------------------
>
>                 Key: YARN-5048
>                 URL: https://issues.apache.org/jira/browse/YARN-5048
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jian He
>            Assignee: Jian He
>             Fix For: 2.8.0
>
>         Attachments: YARN-5048.1.patch, YARN-5048.2.patch
>
> {{((Token)token).decodeIdentifier()}} may throw NPE if RM does not have the corresponding token kind class.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
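The failure mode described in the issue — dereferencing the result of {{decodeIdentifier()}} when the RM cannot decode the token kind — can be illustrated with a minimal stand-in sketch. The classes below and the "rm" owner check are invented for illustration only; the actual fix is in the committed DelegationTokenRenewer.java, not reproduced here.

```java
public class SkipTokenRenewalDemo {
    // Hypothetical stand-ins for Hadoop's Token/TokenIdentifier, only to
    // show the null-check pattern; these are NOT the real Hadoop classes.
    static class TokenIdentifier {
        String getOwner() { return "rm"; }
    }

    static class Token {
        private final TokenIdentifier ident; // null when the token kind class is unknown
        Token(TokenIdentifier ident) { this.ident = ident; }
        TokenIdentifier decodeIdentifier() { return ident; }
    }

    // Guard the decoded identifier before dereferencing it: when the
    // identifier cannot be decoded, fall back to normal renewal instead
    // of throwing a NullPointerException.
    static boolean skipTokenRenewal(Token token) {
        TokenIdentifier id = token.decodeIdentifier();
        if (id == null) {
            return false; // unknown token kind: do not skip renewal
        }
        return "rm".equals(id.getOwner());
    }

    public static void main(String[] args) {
        System.out.println(skipTokenRenewal(new Token(null)));                  // false, no NPE
        System.out.println(skipTokenRenewal(new Token(new TokenIdentifier()))); // true
    }
}
```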
[jira] [Commented] (YARN-4996) Make TestNMReconnect.testCompareRMNodeAfterReconnect() scheduler agnostic, or better yet parameterized
[ https://issues.apache.org/jira/browse/YARN-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275119#comment-15275119 ]

Hadoop QA commented on YARN-4996:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 43s | trunk passed |
| +1 | compile | 0m 26s | trunk passed with JDK v1.8.0_91 |
| +1 | compile | 0m 29s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 0m 34s | trunk passed |
| +1 | mvneclipse | 0m 15s | trunk passed |
| +1 | findbugs | 1m 4s | trunk passed |
| +1 | javadoc | 0m 20s | trunk passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 26s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 29s | the patch passed |
| +1 | compile | 0m 22s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 0m 22s | the patch passed |
| +1 | compile | 0m 27s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 27s | the patch passed |
| -1 | checkstyle | 0m 19s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: patch generated 2 new + 7 unchanged - 0 fixed = 9 total (was 7) |
| +1 | mvnsite | 0m 32s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 15s | the patch passed |
| +1 | javadoc | 0m 18s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 24s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 29m 11s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_91. |
| -1 | unit | 30m 26s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
| | | 76m 3s | |

|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| | hadoop.yarn.server.resourcemanager.TestClientRMTokens |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12802785/YARN-4996.01.patch
[jira] [Commented] (YARN-5045) hbase unit tests fail due to dependency issues
[ https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275115#comment-15275115 ]

Hadoop QA commented on YARN-5045:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 15 new or modified test files. |
| 0 | mvndep | 0m 50s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 0s | YARN-2928 passed |
| +1 | compile | 5m 53s | YARN-2928 passed with JDK v1.8.0_91 |
| +1 | compile | 6m 54s | YARN-2928 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 7s | YARN-2928 passed |
| +1 | mvnsite | 2m 29s | YARN-2928 passed |
| +1 | mvneclipse | 0m 51s | YARN-2928 passed |
| 0 | findbugs | 0m 0s | Skipped branch modules with no Java source: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server |
| +1 | findbugs | 0m 36s | YARN-2928 passed |
| +1 | javadoc | 1m 22s | YARN-2928 passed with JDK v1.8.0_91 |
| +1 | javadoc | 1m 46s | YARN-2928 passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 20s | the patch passed |
| +1 | compile | 6m 1s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 6m 1s | the patch passed |
| +1 | compile | 6m 55s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 6m 55s | the patch passed |
| +1 | checkstyle | 1m 7s | the patch passed |
| +1 | mvnsite | 2m 54s | the patch passed |
| +1 | mvneclipse | 1m 11s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| 0 | findbugs | 0m 0s | Skipped patch modules with no Java source: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server |
| -1 | findbugs | 0m 13s | patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) |
| +1 | javadoc | 1m 31s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 2m 0s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 0m 8s | hadoop-project in the patch passed
[jira] [Created] (YARN-5057) resourcemanager.security.TestDelegationTokenRenewer fails in trunk
Yongjun Zhang created YARN-5057:
-----------------------------------

             Summary: resourcemanager.security.TestDelegationTokenRenewer fails in trunk
                 Key: YARN-5057
                 URL: https://issues.apache.org/jira/browse/YARN-5057
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Yongjun Zhang

The following test seems to fail easily if not always:

{code}
---
 T E S T S
---
Running org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer
Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 42.996 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer
testCancelWithMultipleAppSubmissions(org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer)  Time elapsed: 0.382 sec  <<< FAILURE!
java.lang.AssertionError: null
	at org.junit.Assert.fail(Assert.java:86)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at org.junit.Assert.assertFalse(Assert.java:64)
	at org.junit.Assert.assertFalse(Assert.java:74)
	at org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer.testCancelWithMultipleAppSubmissions(TestDelegationTokenRenewer.java:1134)
{code}
[jira] [Commented] (YARN-4842) "yarn logs" command should not require the appOwner argument
[ https://issues.apache.org/jira/browse/YARN-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275083#comment-15275083 ]

Hadoop QA commented on YARN-4842:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 10s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 34s | trunk passed |
| +1 | compile | 1m 52s | trunk passed with JDK v1.8.0_91 |
| +1 | compile | 2m 9s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 38s | trunk passed |
| +1 | mvnsite | 0m 53s | trunk passed |
| +1 | mvneclipse | 0m 26s | trunk passed |
| -1 | findbugs | 1m 13s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 0m 43s | trunk passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 50s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 45s | the patch passed |
| +1 | compile | 1m 47s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 1m 47s | the patch passed |
| +1 | compile | 2m 6s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 2m 6s | the patch passed |
| -1 | checkstyle | 0m 38s | hadoop-yarn-project/hadoop-yarn: patch generated 1 new + 77 unchanged - 17 fixed = 78 total (was 94) |
| +1 | mvnsite | 0m 51s | the patch passed |
| +1 | mvneclipse | 0m 22s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 2m 5s | the patch passed |
| +1 | javadoc | 0m 42s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 46s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 2m 3s | hadoop-yarn-common in the patch passed with JDK v1.8.0_91. |
| -1 | unit | 66m 6s | hadoop-yarn-client in the patch failed with JDK v1.8.0_91. |
| +1 | unit | 2m 26s | hadoop-yarn-common in the patch passed with JDK v1.7.0_95. |
| -1 | unit | 66m 19s | hadoop-yarn-client in the patch failed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 19s | Patch does not generate ASF License warnings. |
| {
[jira] [Commented] (YARN-2888) Corrective mechanisms for rebalancing NM container queues
[ https://issues.apache.org/jira/browse/YARN-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275059#comment-15275059 ]

Hadoop QA commented on YARN-2888:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 0m 15s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 35s | trunk passed |
| +1 | compile | 1m 54s | trunk passed with JDK v1.8.0_91 |
| +1 | compile | 2m 8s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 45s | trunk passed |
| +1 | mvnsite | 1m 53s | trunk passed |
| +1 | mvneclipse | 0m 53s | trunk passed |
| -1 | findbugs | 0m 46s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common in trunk has 3 extant Findbugs warnings. |
| +1 | javadoc | 1m 42s | trunk passed with JDK v1.8.0_91 |
| +1 | javadoc | 4m 13s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 42s | the patch passed |
| +1 | compile | 2m 5s | the patch passed with JDK v1.8.0_91 |
| +1 | cc | 2m 5s | the patch passed |
| -1 | javac | 3m 58s | hadoop-yarn-project_hadoop-yarn-jdk1.8.0_91 with JDK v1.8.0_91 generated 1 new + 22 unchanged - 0 fixed = 23 total (was 22) |
| +1 | javac | 2m 5s | the patch passed |
| +1 | compile | 2m 10s | the patch passed with JDK v1.7.0_95 |
| +1 | cc | 2m 10s | the patch passed |
| -1 | javac | 6m 8s | hadoop-yarn-project_hadoop-yarn-jdk1.7.0_95 with JDK v1.7.0_95 generated 1 new + 25 unchanged - 0 fixed = 26 total (was 25) |
| +1 | javac | 2m 10s | the patch passed |
| -1 | checkstyle | 0m 44s | hadoop-yarn-project/hadoop-yarn: patch generated 12 new + 468 unchanged - 65 fixed = 480 total (was 533) |
| +1 | mvnsite | 1m 52s | the patch passed |
| +1 | mvneclipse | 0m 48s | the patch passed |
| +1 | whitespace | 0m 1s | Patch has no whitespace issues. |
| -1 | findbugs | 1m 27s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 1m 37s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 4m 6s | the patch passed with
[jira] [Updated] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking
[ https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Zhi updated YARN-4676:
-----------------------------
    Attachment: YARN-4676.014.patch

YARN-4676.014.patch contains updates for the review comments:
1. HostsFileReader supports XML format;
2. DecommissioningNodesWatcher has its own poll timer;
3. RMNodeDecommissioningEvent;
4. Other minor review comments.

Unfortunately, the new UntrackedNode logic (YARN-4311, Thu May 5 14:07:54 2016) introduces key conflicts that require more time to integrate and verify.

> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> ----------------------------------------------------------------
>
>                 Key: YARN-4676
>                 URL: https://issues.apache.org/jira/browse/YARN-4676
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: Daniel Zhi
>            Assignee: Daniel Zhi
>              Labels: features
>         Attachments: GracefulDecommissionYarnNode.pdf, GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, YARN-4676.005.patch, YARN-4676.006.patch, YARN-4676.007.patch, YARN-4676.008.patch, YARN-4676.009.patch, YARN-4676.010.patch, YARN-4676.011.patch, YARN-4676.012.patch, YARN-4676.013.patch, YARN-4676.014.patch
>
> YARN-4676 implements an automatic, asynchronous and flexible mechanism to gracefully decommission YARN nodes. After the user issues a refreshNodes request, the ResourceManager automatically evaluates the status of all affected nodes and kicks off decommission or recommission actions. The RM asynchronously tracks container and application status related to DECOMMISSIONING nodes, so that nodes are decommissioned immediately after they are ready. A decommissioning timeout at individual-node granularity is supported and can be dynamically updated. The mechanism naturally supports multiple independent graceful decommissioning "sessions", where each one involves different sets of nodes with different timeout settings. Such support is ideal and necessary for graceful decommission requests issued by external cluster management software instead of a human.
> DecommissioningNodeWatcher inside ResourceTrackingService tracks the status of DECOMMISSIONING nodes automatically and asynchronously after the client/admin makes the graceful decommission request. It tracks DECOMMISSIONING node status to decide when, after all running containers on the node have completed, the node will be transitioned into the DECOMMISSIONED state. NodesListManager detects and handles include and exclude list changes to kick off decommission or recommission as necessary.
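The transition rule the description walks through (a DECOMMISSIONING node moves to DECOMMISSIONED once all its running containers have completed, or once its per-node timeout expires) can be sketched as a single evaluation step. All names below are invented for illustration and are not the classes in the attached patch.

```java
public class DecommissionEvalDemo {
    // Hypothetical per-node snapshot: how many containers are still running,
    // and when this node's individual decommissioning deadline expires.
    static class NodeSnapshot {
        final int runningContainers;
        final long deadlineMs;
        NodeSnapshot(int runningContainers, long deadlineMs) {
            this.runningContainers = runningContainers;
            this.deadlineMs = deadlineMs;
        }
    }

    // One poll-timer tick: DECOMMISSIONING becomes DECOMMISSIONED once the
    // containers have all completed, or once the per-node timeout elapses.
    static String evaluate(NodeSnapshot n, long nowMs) {
        if (n.runningContainers == 0 || nowMs >= n.deadlineMs) {
            return "DECOMMISSIONED";
        }
        return "DECOMMISSIONING";
    }

    public static void main(String[] args) {
        long now = 1_000L;
        System.out.println(evaluate(new NodeSnapshot(0, 5_000L), now)); // containers done
        System.out.println(evaluate(new NodeSnapshot(3, 5_000L), now)); // still draining
        System.out.println(evaluate(new NodeSnapshot(3, 500L), now));   // timeout expired
    }
}
```

Because the timeout is carried per node, independent "sessions" with different deadlines fall out naturally from evaluating each node against its own deadline.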
[jira] [Commented] (YARN-4844) Add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource
[ https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275022#comment-15275022 ]

Hadoop QA commented on YARN-4844:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 4s | YARN-4844 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12802770/YARN-4844.6.patch |
| JIRA Issue | YARN-4844 |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/11370/console |
| Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |

This message was automatically generated.

> Add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource
> --------------------------------------------------------------------
>
>                 Key: YARN-4844
>                 URL: https://issues.apache.org/jira/browse/YARN-4844
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Blocker
>         Attachments: YARN-4844.1.patch, YARN-4844.2.patch, YARN-4844.3.patch, YARN-4844.4.patch, YARN-4844.5.patch, YARN-4844.6.patch
>
> We use int32 for memory now; if a cluster has 10k nodes and each node has 210G memory, we will get a negative total cluster memory.
> Another case that overflows int32 even more easily: we add all pending resources of running apps to the cluster's total pending resources. If a problematic app requires too many resources (let's say 1M+ containers, each of them with 3G memory), int32 will not be enough.
> Even if we can cap each app's pending request, we cannot handle the case that there're many running apps, each of them with capped but still significant numbers of pending resources.
> So we may possibly need to add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource.
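The int32 overflow in the description is easy to reproduce: 10k nodes at 210G each is about 2.15 billion MB, just past Integer.MAX_VALUE. A minimal sketch using the numbers from the report (the class and method names here are invented; the actual proposal adds getMemoryLong/getVirtualCoreLong to the Resource record):

```java
public class ResourceOverflowDemo {
    // Total cluster memory in MB with 32-bit arithmetic: wraps negative.
    static int totalMemoryInt(int nodes, int memPerNodeMb) {
        return nodes * memPerNodeMb;
    }

    // The same computation widened to 64 bits, as a long-based getter would allow.
    static long totalMemoryLong(int nodes, int memPerNodeMb) {
        return (long) nodes * memPerNodeMb;
    }

    public static void main(String[] args) {
        int nodes = 10_000;            // cluster size from the report
        int memPerNodeMb = 210 * 1024; // 210G per node, in MB
        System.out.println(totalMemoryInt(nodes, memPerNodeMb));  // -2144567296 (wrapped)
        System.out.println(totalMemoryLong(nodes, memPerNodeMb)); // 2150400000
    }
}
```

The cast to long before multiplying is what matters: casting the already-overflowed int product would preserve the wrong value.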
[jira] [Updated] (YARN-4996) Make TestNMReconnect.testCompareRMNodeAfterReconnect() scheduler agnostic, or better yet parameterized
[ https://issues.apache.org/jira/browse/YARN-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kai Sasaki updated YARN-4996:
-----------------------------
    Attachment: YARN-4996.01.patch

> Make TestNMReconnect.testCompareRMNodeAfterReconnect() scheduler agnostic, or better yet parameterized
> ------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4996
>                 URL: https://issues.apache.org/jira/browse/YARN-4996
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager, test
>    Affects Versions: 2.8.0
>            Reporter: Daniel Templeton
>            Priority: Minor
>              Labels: newbie
>         Attachments: YARN-4996.01.patch
>
> The test tests only the capacity scheduler. It should also test the fair scheduler. At a bare minimum, it should use the default scheduler.
[jira] [Commented] (YARN-5045) hbase unit tests fail due to dependency issues
[ https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275003#comment-15275003 ]

Sangjin Lee commented on YARN-5045:
-----------------------------------

Rekicked jenkins. Timelineservice and timelineservice-hbase-tests pass fine locally.

> hbase unit tests fail due to dependency issues
> ----------------------------------------------
>
>                 Key: YARN-5045
>                 URL: https://issues.apache.org/jira/browse/YARN-5045
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>            Priority: Blocker
>         Attachments: YARN-5045-YARN-2928.01.patch, YARN-5045-YARN-2928.02.patch, YARN-5045-YARN-2928.03.patch, YARN-5045-YARN-2928.poc.patch
>
> After the 5/4 rebase, the hbase unit tests in the timeline service project are failing:
> {noformat}
> org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage  Time elapsed: 5.103 sec  <<< ERROR!
> java.io.IOException: Shutting down
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
> 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
> 	at org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677)
> 	at org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546)
> 	at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:500)
> 	at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:104)
> 	at org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345)
> 	at org.apache.hadoop.hbase.http.InfoServer.<init>(InfoServer.java:77)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:550)
> 	at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:333)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
> 	at org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139)
> 	at org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:217)
> 	at org.apache.hadoop.hbase.LocalHBaseCluster.<init>(LocalHBaseCluster.java:153)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:213)
> 	at org.apache.hadoop.hbase.MiniHBaseCluster.<init>(MiniHBaseCluster.java:93)
> 	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:978)
> 	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:938)
> 	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:812)
> 	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806)
> 	at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:750)
> 	at org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage.setup(TestTimelineReaderWebServicesHBaseStorage.java:87)
> {noformat}
> The root cause is that the hbase mini server depends on hadoop common's {{MetricsServlet}} which has been removed in the trunk (HADOOP-12504):
> {noformat}
> Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/metrics/MetricsServlet
> 	at org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677)
> 	at org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546)
> 	at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:500)
> 	at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:104)
> 	at org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345)
> 	at org.apache.hadoop.hbase.http.InfoServer.<init>(InfoServer.java:77)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:550)
> 	at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:33
[jira] [Commented] (YARN-5045) hbase unit tests fail due to dependency issues
[ https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274992#comment-15274992 ] Hadoop QA commented on YARN-5045: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 1s {color} | {color:green} The patch appears to include 15 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 15s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 35s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 1s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 7s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 34s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 53s {color} | {color:green} YARN-2928 passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped branch modules with no Java source: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server {color} | | {color:green}+1{color} | {color:green} 
findbugs {color} | {color:green} 0m 37s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 49s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 10s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 7s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 7s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patch modules with no Java source: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 13s {color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 36s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 3s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 9s {color} | {color:green} hadoop-project in the patch passed
[jira] [Commented] (YARN-4842) "yarn logs" command should not require the appOwner argument
[ https://issues.apache.org/jira/browse/YARN-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274986#comment-15274986 ] Xuan Gong commented on YARN-4842: - Thanks for the review. Attached a new patch to address the comments > "yarn logs" command should not require the appOwner argument > > > Key: YARN-4842 > URL: https://issues.apache.org/jira/browse/YARN-4842 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Ram Venkatesh >Assignee: Xuan Gong > Attachments: YARN-4842.1.patch, YARN-4842.2.patch, YARN-4842.3.patch, > YARN-4842.4.patch, YARN-4842.5.patch > > > The yarn logs command is among the most common ways to troubleshoot yarn app > failures, especially by an admin. > Currently if you run the command as a user different from the job owner, the > command will fail with a subtle message that it could not find the app under > the running user's name. This can be confusing especially to new admins. > We can figure out the job owner from the app report returned by the RM or the > AHS, or, by looking for the app directory using a glob pattern, so in most > cases this error can be avoided. > Question - are there scenarios where users will still need to specify the > -appOwner option? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
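The owner lookup the description proposes (find the app directory under the aggregated-log root with a glob-style scan instead of requiring -appOwner) can be sketched as below. This is a stand-alone illustration, not the patch's code: the `<user>/logs/<appId>` layout and all names here are assumptions for the example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class AppOwnerLookup {
    // Given the set of aggregated-log paths shaped like "<user>/logs/<appId>",
    // recover the owner of an app so the caller need not pass -appOwner.
    static String findOwner(Collection<String> logDirs, String appId) {
        String suffix = "/logs/" + appId;
        for (String dir : logDirs) {
            if (dir.endsWith(suffix)) {
                // the first path component is the owning user
                return dir.substring(0, dir.indexOf('/'));
            }
        }
        return null; // app not found under any user
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList(
            "hive/logs/application_1462_0001",
            "admin/logs/application_1462_0002");
        System.out.println(findOwner(dirs, "application_1462_0001")); // hive
    }
}
```

In the real CLI the listing would come from the remote log filesystem (RM/AHS report first, directory scan as fallback), but the extraction of the owner from the matched path is the same idea.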
[jira] [Updated] (YARN-4842) "yarn logs" command should not require the appOwner argument
[ https://issues.apache.org/jira/browse/YARN-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-4842: Attachment: YARN-4842.5.patch > "yarn logs" command should not require the appOwner argument > > > Key: YARN-4842 > URL: https://issues.apache.org/jira/browse/YARN-4842 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Ram Venkatesh >Assignee: Xuan Gong > Attachments: YARN-4842.1.patch, YARN-4842.2.patch, YARN-4842.3.patch, > YARN-4842.4.patch, YARN-4842.5.patch > > > The yarn logs command is among the most common ways to troubleshoot yarn app > failures, especially by an admin. > Currently if you run the command as a user different from the job owner, the > command will fail with a subtle message that it could not find the app under > the running user's name. This can be confusing especially to new admins. > We can figure out the job owner from the app report returned by the RM or the > AHS, or, by looking for the app directory using a glob pattern, so in most > cases this error can be avoided. > Question - are there scenarios where users will still need to specify the > -appOwner option? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4747) AHS error 500 due to NPE when container start event is missing
[ https://issues.apache.org/jira/browse/YARN-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274964#comment-15274964 ] Hudson commented on YARN-4747: -- FAILURE: Integrated in Hadoop-trunk-Commit #9731 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9731/]) YARN-4747. AHS error 500 due to NPE when container start event is (jlowe: rev b2ed6ae73197990a950ce71ece80c0f23221c384) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TestSystemMetricsPublisher.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/ContainerFinishedEvent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryManagerOnTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/SystemMetricsPublisher.java > AHS error 500 due to NPE when container start event is missing > -- > > Key: YARN-4747 > URL: https://issues.apache.org/jira/browse/YARN-4747 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.7.2 >Reporter: Jason Lowe >Assignee: Varun Saxena > Fix For: 2.8.0, 2.7.3 > > Attachments: YARN-4747.01.patch > > > Saw an error 500 due to a NullPointerException caused by a missing host for > an AM container. Stacktrace to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
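The fix pattern for this class of bug, guarding report construction when the container-start event (and therefore the host) was never recorded, can be sketched as follows. This is a simplified stand-in, not the committed YARN-4747 code; the method and URL shape are illustrative.

```java
public class ContainerLogUrlSketch {
    // If the container-start timeline event is missing, the host is unknown;
    // return null so the caller can render "N/A" instead of throwing a
    // NullPointerException (and surfacing an HTTP 500) while building the URL.
    static String logUrl(String host, int port, String containerId, String user) {
        if (host == null) {
            return null; // start event missing for this container
        }
        return "http://" + host + ":" + port + "/node/containerlogs/"
            + containerId + "/" + user;
    }

    public static void main(String[] args) {
        // AM container whose start event never reached the timeline store:
        System.out.println(logUrl(null, 8042, "container_1_0001_01_000001", "hive"));
        // Normal case:
        System.out.println(logUrl("nm1", 8042, "container_1_0001_01_000001", "hive"));
    }
}
```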
[jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
[ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274950#comment-15274950 ] Junping Du commented on YARN-2140: -- Hi [~sidharta-s], I noticed all sub-JIRAs are resolved. Do we have any work left to do? If not, we should mark this umbrella as resolved. > Add support for network IO isolation/scheduling for containers > -- > > Key: YARN-2140 > URL: https://issues.apache.org/jira/browse/YARN-2140 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Wei Yan >Assignee: Sidharta Seethana > Attachments: NetworkAsAResourceDesign.pdf > >

[jira] [Commented] (YARN-4842) "yarn logs" command should not require the appOwner argument
[ https://issues.apache.org/jira/browse/YARN-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274930#comment-15274930 ] Vinod Kumar Vavilapalli commented on YARN-4842: --- This is much closer. Couple of comments - When running into non-readable directories, we should explicitly catch AccessControlException, print an error and return -1 instead of throwing exception - the exception will be quite confusing to the users. - Need to fix the style issues, findbugs etc. > "yarn logs" command should not require the appOwner argument > > > Key: YARN-4842 > URL: https://issues.apache.org/jira/browse/YARN-4842 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Ram Venkatesh >Assignee: Xuan Gong > Attachments: YARN-4842.1.patch, YARN-4842.2.patch, YARN-4842.3.patch, > YARN-4842.4.patch > > > The yarn logs command is among the most common ways to troubleshoot yarn app > failures, especially by an admin. > Currently if you run the command as a user different from the job owner, the > command will fail with a subtle message that it could not find the app under > the running user's name. This can be confusing especially to new admins. > We can figure out the job owner from the app report returned by the RM or the > AHS, or, by looking for the app directory using a glob pattern, so in most > cases this error can be avoided. > Question - are there scenarios where users will still need to specify the > -appOwner option? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
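The reviewer's first point, catching AccessControlException on non-readable directories and turning it into a readable error plus a -1 exit code, can be sketched like this. The class below is a self-contained stand-in (it defines its own AccessControlException rather than using org.apache.hadoop.security's), so it only illustrates the control flow the review asks for.

```java
public class LogsCliSketch {
    // Stand-in for org.apache.hadoop.security.AccessControlException.
    static class AccessControlException extends Exception {
        AccessControlException(String msg) { super(msg); }
    }

    // Simulates listing a remote log directory the current user cannot read.
    static void listLogs(String dir) throws AccessControlException {
        if (dir.startsWith("restricted")) {
            throw new AccessControlException("Permission denied: " + dir);
        }
    }

    // The reviewer's suggestion: print a clear error and return -1 instead of
    // letting the raw exception propagate and confuse the user.
    static int run(String dir) {
        try {
            listLogs(dir);
            return 0;
        } catch (AccessControlException e) {
            System.err.println("Cannot read log directory: " + e.getMessage());
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(run("restricted/logs")); // -1
        System.out.println(run("open/logs"));       // 0
    }
}
```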
[jira] [Updated] (YARN-2888) Corrective mechanisms for rebalancing NM container queues
[ https://issues.apache.org/jira/browse/YARN-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-2888: -- Attachment: YARN-2888.007.patch Fixing failed TestCase and some of the checkstyle warnings. > Corrective mechanisms for rebalancing NM container queues > - > > Key: YARN-2888 > URL: https://issues.apache.org/jira/browse/YARN-2888 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Arun Suresh > Attachments: YARN-2888-yarn-2877.001.patch, > YARN-2888-yarn-2877.002.patch, YARN-2888.003.patch, YARN-2888.004.patch, > YARN-2888.005.patch, YARN-2888.006.patch, YARN-2888.007.patch > > > Bad queuing decisions by the LocalRMs (e.g., due to the distributed nature of > the scheduling decisions or due to having a stale image of the system) may > lead to an imbalance in the waiting times of the NM container queues. This > can in turn have an impact in job execution times and cluster utilization. > To this end, we introduce corrective mechanisms that may remove (whenever > needed) container requests from overloaded queues, adding them to less-loaded > ones. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
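The corrective mechanism the description outlines, removing container requests from overloaded NM queues and adding them to less-loaded ones, can be sketched as a simple load-leveling step. The threshold policy below is an assumption for illustration, not YARN-2888's actual rebalancing algorithm.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class QueueRebalanceSketch {
    // Move queued requests one at a time from the most-loaded NM queue to the
    // least-loaded one until the gap is within maxGap (a gap of 1 is
    // unavoidable when the total does not divide evenly).
    static void rebalance(Map<String, Deque<String>> queues, int maxGap) {
        while (true) {
            Map.Entry<String, Deque<String>> max = null, min = null;
            for (Map.Entry<String, Deque<String>> e : queues.entrySet()) {
                if (max == null || e.getValue().size() > max.getValue().size()) max = e;
                if (min == null || e.getValue().size() < min.getValue().size()) min = e;
            }
            int gap = max.getValue().size() - min.getValue().size();
            if (gap <= Math.max(1, maxGap)) break;
            // shift the most recently queued request, which has waited least
            min.getValue().addLast(max.getValue().removeLast());
        }
    }

    public static void main(String[] args) {
        Map<String, Deque<String>> queues = new HashMap<>();
        queues.put("nm1", new ArrayDeque<>(Arrays.asList("r1", "r2", "r3", "r4", "r5")));
        queues.put("nm2", new ArrayDeque<>());
        rebalance(queues, 1);
        System.out.println(queues.get("nm1").size() + " " + queues.get("nm2").size()); // 3 2
    }
}
```

In the real patch the decision is driven by estimated queue waiting times rather than raw queue lengths, but the corrective move itself is of this shape.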
[jira] [Updated] (YARN-4844) Add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource
[ https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4844: - Attachment: YARN-4844.6.patch Thanks for reviews from [~bibinchundatt]/[~vvasudev], addressed all comments and updated patch (ver.6) > Add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource > > > Key: YARN-4844 > URL: https://issues.apache.org/jira/browse/YARN-4844 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-4844.1.patch, YARN-4844.2.patch, YARN-4844.3.patch, > YARN-4844.4.patch, YARN-4844.5.patch, YARN-4844.6.patch > > > We use int32 for memory now, if a cluster has 10k nodes, each node has 210G > memory, we will get a negative total cluster memory. > And another case that easier overflows int32 is: we added all pending > resources of running apps to cluster's total pending resources. If a > problematic app requires too much resources (let's say 1M+ containers, each > of them has 3G containers), int32 will be not enough. > Even if we can cap each app's pending request, we cannot handle the case that > there're many running apps, each of them has capped but still significant > numbers of pending resources. > So we may possibly need to add getMemoryLong/getVirtualCoreLong to > o.a.h.y.api.records.Resource. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
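The overflow in the description is easy to check: 10,000 nodes x 210 GB = 2,150,400,000 MB, which exceeds Integer.MAX_VALUE (2,147,483,647), so a 32-bit total wraps negative; hence the proposed getMemoryLong. A minimal demonstration:

```java
public class ResourceOverflowDemo {
    // Total cluster memory in MB, computed in 64 bits as getMemoryLong would.
    static long totalMemoryMb(long nodes, long gbPerNode) {
        return nodes * gbPerNode * 1024L;
    }

    public static void main(String[] args) {
        long total = totalMemoryMb(10_000, 210); // 2,150,400,000 MB
        int wrapped = (int) total;               // what a 32-bit getMemory would hold
        System.out.println(total);               // 2150400000
        System.out.println(wrapped);             // negative: the int overflowed
    }
}
```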
[jira] [Commented] (YARN-4747) AHS error 500 due to NPE when container start event is missing
[ https://issues.apache.org/jira/browse/YARN-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274873#comment-15274873 ] Jason Lowe commented on YARN-4747: -- +1 lgtm. Committing this. > AHS error 500 due to NPE when container start event is missing > -- > > Key: YARN-4747 > URL: https://issues.apache.org/jira/browse/YARN-4747 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.7.2 >Reporter: Jason Lowe >Assignee: Varun Saxena > Attachments: YARN-4747.01.patch > > > Saw an error 500 due to a NullPointerException caused by a missing host for > an AM container. Stacktrace to follow.
[jira] [Commented] (YARN-5056) Expose demand at the queue level for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274870#comment-15274870 ] Hadoop QA commented on YARN-5056: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 2s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | 
{color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 29m 42s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_91. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 30m 31s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 77m 20s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestContainerResourceUsage | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.security.TestRMDelegati
[jira] [Commented] (YARN-5048) DelegationTokenRenewer#skipTokenRenewal may throw NPE
[ https://issues.apache.org/jira/browse/YARN-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274862#comment-15274862 ] Yongjun Zhang commented on YARN-5048: - Sure Jian, will soon. > DelegationTokenRenewer#skipTokenRenewal may throw NPE > -- > > Key: YARN-5048 > URL: https://issues.apache.org/jira/browse/YARN-5048 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-5048.1.patch, YARN-5048.2.patch > > > {{((Token)token).decodeIdentifier()}} may > throw NPE if RM does not have the corresponding token kind class.
[jira] [Commented] (YARN-5002) getApplicationReport call may raise NPE for removed queues
[ https://issues.apache.org/jira/browse/YARN-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274860#comment-15274860 ] Wangda Tan commented on YARN-5002: -- Committed to trunk/branch-2 to handle the NPE issue. Thanks [~jianhe] working on the patch, thanks reviews from [~templedf]/[~kasha]/[~sunilg]! [~kasha], if you think we need to handle ACLs for removed queues better, could you file a separate JIRA to improve it (like store ACL info for queues)? > getApplicationReport call may raise NPE for removed queues > -- > > Key: YARN-5002 > URL: https://issues.apache.org/jira/browse/YARN-5002 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Sumana Sathish >Assignee: Jian He >Priority: Critical > Attachments: YARN-5002.1.patch, YARN-5002.2.patch, YARN-5002.3.patch > > > getApplicationReport call may raise NPE > {code} > Exception in thread "main" java.lang.NullPointerException: > java.lang.NullPointerException > > org.apache.hadoop.yarn.server.resourcemanager.security.QueueACLsManager.checkAccess(QueueACLsManager.java:57) > > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.checkAccess(ClientRMService.java:279) > > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplications(ClientRMService.java:760) > > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplications(ClientRMService.java:682) > > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplications(ApplicationClientProtocolPBServiceImpl.java:234) > > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:425) > > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2268) > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2264) > 
java.security.AccessController.doPrivileged(Native Method) > javax.security.auth.Subject.doAs(Subject.java:422) > > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1708) > org.apache.hadoop.ipc.Server$Handler.run(Server.java:2262) > sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > java.lang.reflect.Constructor.newInstance(Constructor.java:423) > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:107) > > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplications(ApplicationClientProtocolPBClientImpl.java:254) > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > java.lang.reflect.Method.invoke(Method.java:498) > > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) > > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > com.sun.proxy.$Proxy18.getApplications(Unknown Source) > > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplications(YarnClientImpl.java:479) > > org.apache.hadoop.mapred.ResourceMgrDelegate.getAllJobs(ResourceMgrDelegate.java:135) > org.apache.hadoop.mapred.YARNRunner.getAllJobs(YARNRunner.java:167) > org.apache.hadoop.mapreduce.Cluster.getAllJobStatuses(Cluster.java:294) > org.apache.hadoop.mapreduce.tools.CLI.listJobs(CLI.java:553) > org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:338) > org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > 
org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > org.apache.hadoop.mapred.JobClient.main(JobClient.java:1274) > {code}
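The NPE in QueueACLsManager.checkAccess arises when a queue referenced by a (possibly finished) application no longer exists after a queue-configuration refresh. The guard pattern, denying access instead of dereferencing a null queue lookup, can be sketched stand-alone; the map-based ACL store below is a simplification of the scheduler's real queue objects.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class QueueAclCheckSketch {
    // queue name -> users allowed to view applications in that queue
    static final Map<String, Set<String>> QUEUE_ACLS = new HashMap<>();

    static boolean checkAccess(String user, String queue) {
        Set<String> allowed = QUEUE_ACLS.get(queue);
        if (allowed == null) {
            // Queue was removed by refreshQueues: deny rather than NPE.
            return false;
        }
        return allowed.contains(user);
    }

    public static void main(String[] args) {
        QUEUE_ACLS.put("default", new HashSet<>(Collections.singletonList("alice")));
        System.out.println(checkAccess("alice", "default"));  // true
        System.out.println(checkAccess("alice", "removedQ")); // false, no NPE
    }
}
```

As the comment above notes, a fuller fix might retain ACL information for removed queues so legitimate viewers are not denied; that is left to the follow-up JIRA discussed.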
[jira] [Updated] (YARN-5002) getApplicationReport call may raise NPE for removed queues
[ https://issues.apache.org/jira/browse/YARN-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-5002: - Summary: getApplicationReport call may raise NPE for removed queues (was: getApplicationReport call may raise NPE) > getApplicationReport call may raise NPE for removed queues > -- > > Key: YARN-5002 > URL: https://issues.apache.org/jira/browse/YARN-5002 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Sumana Sathish >Assignee: Jian He >Priority: Critical > Attachments: YARN-5002.1.patch, YARN-5002.2.patch, YARN-5002.3.patch > > > getApplicationReport call may raise NPE > {code} > Exception in thread "main" java.lang.NullPointerException: > java.lang.NullPointerException > > org.apache.hadoop.yarn.server.resourcemanager.security.QueueACLsManager.checkAccess(QueueACLsManager.java:57) > > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.checkAccess(ClientRMService.java:279) > > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplications(ClientRMService.java:760) > > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplications(ClientRMService.java:682) > > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplications(ApplicationClientProtocolPBServiceImpl.java:234) > > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:425) > > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2268) > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2264) > java.security.AccessController.doPrivileged(Native Method) > javax.security.auth.Subject.doAs(Subject.java:422) > > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1708) > 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2262) > sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > java.lang.reflect.Constructor.newInstance(Constructor.java:423) > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:107) > > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplications(ApplicationClientProtocolPBClientImpl.java:254) > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > java.lang.reflect.Method.invoke(Method.java:498) > > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) > > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > com.sun.proxy.$Proxy18.getApplications(Unknown Source) > > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplications(YarnClientImpl.java:479) > > org.apache.hadoop.mapred.ResourceMgrDelegate.getAllJobs(ResourceMgrDelegate.java:135) > org.apache.hadoop.mapred.YARNRunner.getAllJobs(YARNRunner.java:167) > org.apache.hadoop.mapreduce.Cluster.getAllJobStatuses(Cluster.java:294) > org.apache.hadoop.mapreduce.tools.CLI.listJobs(CLI.java:553) > org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:338) > org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > org.apache.hadoop.mapred.JobClient.main(JobClient.java:1274) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: 
yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking
[ https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274847#comment-15274847 ] Daniel Zhi commented on YARN-4676: -- 8. The method "private int refreshNodes(int timeout)" inside RMAdminCLI is the client-side logic that periodically checks and finally forcefully decommissions nodes on timeout. With this JIRA, the same timeout is also handled by the RM server side, where the nodes will be decommissioned automatically upon timeout. The extra gracePeriod (now reduced to 5 seconds) here is to wait a little longer so as to avoid an unnecessary double decommission upon timeout from the client side. In your example where the user asks for 20 seconds, the RM server side will decommission the node no later than 20 seconds; the client will then notice the node has become DECOMMISSIONED and simply finish without a forceful decommission. 25. I will leave the evaluation of "disallow and exit" under "work preserving restart" for the proper decision makers to make separately. The code before this patch already allows graceful decommission (from the client side) regardless of "work preserving restart". It's better that I don't prematurely change it now. It could be addressed as part of a separate JIRA, as earlier comments concluded. > Automatic and Asynchronous Decommissioning Nodes Status Tracking > > > Key: YARN-4676 > URL: https://issues.apache.org/jira/browse/YARN-4676 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.8.0 >Reporter: Daniel Zhi >Assignee: Daniel Zhi > Labels: features > Attachments: GracefulDecommissionYarnNode.pdf, > GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, YARN-4676.005.patch, > YARN-4676.006.patch, YARN-4676.007.patch, YARN-4676.008.patch, > YARN-4676.009.patch, YARN-4676.010.patch, YARN-4676.011.patch, > YARN-4676.012.patch, YARN-4676.013.patch > > > YARN-4676 implements an automatic, asynchronous and flexible mechanism to > gracefully decommission > YARN nodes.
After the user issues the refreshNodes request, the ResourceManager > automatically evaluates the > status of all affected nodes to kick off decommission or recommission > actions. The RM asynchronously > tracks container and application status related to DECOMMISSIONING nodes to > decommission the > nodes as soon as they are ready to be decommissioned. Decommissioning > timeout at individual > node granularity is supported and can be dynamically updated. The > mechanism naturally supports multiple > independent graceful decommissioning “sessions” where each one involves > different sets of nodes with > different timeout settings. Such support is ideal and necessary for graceful > decommission requests issued > by external cluster management software instead of humans. > DecommissioningNodeWatcher inside ResourceTrackingService tracks > DECOMMISSIONING node status automatically and asynchronously after the > client/admin makes the graceful decommission request. It tracks > DECOMMISSIONING node status to decide when, after all running containers on > the node have completed, it will be transitioned into the DECOMMISSIONED state. > NodesListManager detects and handles include and exclude list changes to kick > off decommission or recommission as necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
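The client-side wait described in point 8 can be sketched as follows. This is an illustrative simulation only, not the actual RMAdminCLI code: the method name, states, and timing constants are all assumed for demonstration. It shows the interplay the comment describes — the client polls until the node leaves DECOMMISSIONING or until the timeout plus a small grace period expires, and with server-side timeout handling the forceful fallback should normally never fire.

```java
import java.util.function.Supplier;

public class GracefulDecommissionClient {
    enum NodeState { DECOMMISSIONING, DECOMMISSIONED }

    /**
     * Polls node state until it becomes DECOMMISSIONED or the deadline
     * (timeout + grace period) passes. Returns true if the server side
     * finished the decommission within the window; false means the caller
     * would fall back to a forceful decommission.
     */
    static boolean waitForDecommission(Supplier<NodeState> nodeState,
                                       long timeoutMs, long gracePeriodMs,
                                       long pollIntervalMs) {
        long deadline = System.currentTimeMillis() + timeoutMs + gracePeriodMs;
        while (System.currentTimeMillis() < deadline) {
            if (nodeState.get() == NodeState.DECOMMISSIONED) {
                return true;   // RM server side finished within the timeout
            }
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;          // grace period exhausted: forceful fallback
    }

    public static void main(String[] args) {
        // Simulated RM: the node reports DECOMMISSIONED after ~50 ms.
        long start = System.currentTimeMillis();
        Supplier<NodeState> rm = () ->
            System.currentTimeMillis() - start >= 50
                ? NodeState.DECOMMISSIONED : NodeState.DECOMMISSIONING;
        System.out.println("graceful: " + waitForDecommission(rm, 200, 5, 10));
    }
}
```

In the 20-second example from the comment, the supplier would flip to DECOMMISSIONED before the client's deadline, so the method returns true and no double decommission occurs.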
[jira] [Commented] (YARN-5048) DelegationTokenRenewer#skipTokenRenewal may throw NPE
[ https://issues.apache.org/jira/browse/YARN-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274839#comment-15274839 ] Jian He commented on YARN-5048: --- Thanks [~yzhangal], could you help commit this? > DelegationTokenRenewer#skipTokenRenewal may throw NPE > -- > > Key: YARN-5048 > URL: https://issues.apache.org/jira/browse/YARN-5048 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-5048.1.patch, YARN-5048.2.patch > > > {{((Token)token).decodeIdentifier()}} may > throw NPE if RM does not have the corresponding token kind class.
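The failure mode in this issue is that {{decodeIdentifier()}} returns null when the RM's classpath has no TokenIdentifier class registered for the token's kind, so any call chained off the result throws NullPointerException. The guard can be sketched with mock types — these are not Hadoop's actual classes, and the "skip renewal for tokens owned by yarn" criterion below is invented purely for illustration:

```java
public class SkipTokenRenewalSketch {
    interface TokenIdentifier { String getUser(); }

    static class Token {
        private final TokenIdentifier id;   // null when the kind class is unknown
        Token(TokenIdentifier id) { this.id = id; }
        TokenIdentifier decodeIdentifier() { return id; }
    }

    /**
     * Safe variant: decodeIdentifier() may yield null for an unknown token
     * kind, so treat an undecodable token as "do not skip renewal" instead
     * of dereferencing the null result.
     */
    static boolean skipTokenRenewal(Token token) {
        TokenIdentifier identifier = token.decodeIdentifier();
        if (identifier == null) {
            return false;   // cannot inspect the token: fall back to renewing
        }
        return "yarn".equals(identifier.getUser());  // assumed skip criterion
    }

    public static void main(String[] args) {
        System.out.println(skipTokenRenewal(new Token(null)));          // no NPE
        System.out.println(skipTokenRenewal(new Token(() -> "yarn")));
    }
}
```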
[jira] [Commented] (YARN-5056) Expose demand at the queue level for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274788#comment-15274788 ] Hadoop QA commented on YARN-5056: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | 
{color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 29m 16s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_91. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 30m 33s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 76m 10s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.TestContainerResourceUsage | | | hadoop.yarn.server.resourcemanager.TestClientRMTokens | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/h
[jira] [Commented] (YARN-5045) hbase unit tests fail due to dependency issues
[ https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274711#comment-15274711 ] Hadoop QA commented on YARN-5045: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 15 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 38s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 1s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 29s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 11s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 24s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 11s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 3s {color} | {color:green} YARN-2928 passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped branch modules with no Java source: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server {color} | | {color:green}+1{color} | {color:green} 
findbugs {color} | {color:green} 0m 46s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 4s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 17s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 27s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 31s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 10s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patch modules with no Java source: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 16s {color} | {color:red} patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests no findbugs output file (hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 19s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 31s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 11s {color} | {color:green} hadoop-project in the patch pa
[jira] [Commented] (YARN-5045) hbase unit tests fail due to dependency issues
[ https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274693#comment-15274693 ] Li Lu commented on YARN-5045: - Awesome, thanks! +1 for the latest patch. > hbase unit tests fail due to dependency issues > -- > > Key: YARN-5045 > URL: https://issues.apache.org/jira/browse/YARN-5045 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee >Priority: Blocker > Attachments: YARN-5045-YARN-2928.01.patch, > YARN-5045-YARN-2928.02.patch, YARN-5045-YARN-2928.03.patch, > YARN-5045-YARN-2928.poc.patch > > > After the 5/4 rebase, the hbase unit tests in the timeline service project > are failing: > {noformat} > org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage > Time elapsed: 5.103 sec <<< ERROR! > java.io.IOException: Shutting down > at java.net.URLClassLoader$1.run(URLClassLoader.java:366) > at java.net.URLClassLoader$1.run(URLClassLoader.java:355) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:354) > at java.lang.ClassLoader.loadClass(ClassLoader.java:423) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) > at java.lang.ClassLoader.loadClass(ClassLoader.java:356) > at > org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677) > at > org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546) > at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:500) > at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:104) > at > org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345) > at org.apache.hadoop.hbase.http.InfoServer.(InfoServer.java:77) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697) > at > 
org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:550) > at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:333) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:525) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:217) > at > org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:153) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:213) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:93) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:978) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:938) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:812) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:750) > at > org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage.setup(TestTimelineReaderWebServicesHBaseStorage.java:87) > {noformat} > The root cause is that the hbase mini server depends on hadoop common's > {{MetricsServlet}} which has been removed in the trunk (HADOOP-12504): > {noformat} > Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/metrics/MetricsServlet > at > org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677) > at > org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546) > 
at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:500) > at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:104) > at > org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345) > at org.apache.hadoop.hbase.http.InfoServer.(InfoServer.java:77) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:550) > at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:333) > at sun.reflect.NativeConstructorAccessorIm
[jira] [Updated] (YARN-5045) hbase unit tests fail due to dependency issues
[ https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-5045: -- Attachment: YARN-5045-YARN-2928.03.patch Posted patch v.3. Thanks for the suggestion [~gtCarrera9]. I did one better in that the property is declared right next to the hbase version in hadoop-project/pom.xml so that they can be updated in one place. > hbase unit tests fail due to dependency issues > -- > > Key: YARN-5045 > URL: https://issues.apache.org/jira/browse/YARN-5045 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee >Priority: Blocker > Attachments: YARN-5045-YARN-2928.01.patch, > YARN-5045-YARN-2928.02.patch, YARN-5045-YARN-2928.03.patch, > YARN-5045-YARN-2928.poc.patch > > > After the 5/4 rebase, the hbase unit tests in the timeline service project > are failing: > {noformat} > org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage > Time elapsed: 5.103 sec <<< ERROR! 
> java.io.IOException: Shutting down > at java.net.URLClassLoader$1.run(URLClassLoader.java:366) > at java.net.URLClassLoader$1.run(URLClassLoader.java:355) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:354) > at java.lang.ClassLoader.loadClass(ClassLoader.java:423) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) > at java.lang.ClassLoader.loadClass(ClassLoader.java:356) > at > org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677) > at > org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546) > at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:500) > at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:104) > at > org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345) > at org.apache.hadoop.hbase.http.InfoServer.(InfoServer.java:77) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:550) > at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:333) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:525) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:217) > at > org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:153) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:213) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:93) > at > 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:978) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:938) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:812) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:750) > at > org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage.setup(TestTimelineReaderWebServicesHBaseStorage.java:87) > {noformat} > The root cause is that the hbase mini server depends on hadoop common's > {{MetricsServlet}} which has been removed in the trunk (HADOOP-12504): > {noformat} > Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/metrics/MetricsServlet > at > org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677) > at > org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546) > at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:500) > at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:104) > at > org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345) > at org.apache.hadoop.hbase.http.InfoServer.(InfoServer.java:77) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697) > at > org.apache.hadoop.hbase.region
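The centralization Sangjin describes — declaring the hadoop version as a Maven property right next to the hbase version in hadoop-project/pom.xml, and referencing it from the hbase-tests module — might look roughly like the fragment below. This is a hypothetical sketch: the property name and the hbase version shown are assumptions for illustration, not taken from the patch (only the 2.5.1 value appears in the discussion).

```xml
<!-- hadoop-project/pom.xml (sketch): both versions declared in one place -->
<properties>
  <hbase.version>1.0.1</hbase.version> <!-- placeholder value -->
  <!-- hadoop version the hbase mini-cluster tests compile against -->
  <hbase-compatible-hadoop.version>2.5.1</hbase-compatible-hadoop.version>
</properties>

<!-- hbase-tests pom (sketch): both occurrences reference the one property -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>${hbase-compatible-hadoop.version}</version>
  <scope>test</scope>
</dependency>
```

With this shape, bumping either version is a single-line change in hadoop-project/pom.xml, which addresses the "two occurrences" nit.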
[jira] [Updated] (YARN-5056) Expose demand at the queue level for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Weeks updated YARN-5056: --- Attachment: (was: YARN-5056.patch) > Expose demand at the queue level for FairScheduler > -- > > Key: YARN-5056 > URL: https://issues.apache.org/jira/browse/YARN-5056 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.7.2 >Reporter: Daniel Weeks >Priority: Minor > Attachments: YARN-5056.patch > > > There currently isn't a way to determine where demand is coming from within > the scheduler. Issues like YARN-2408 proposed a more detailed way to expose > this information, but it's very easy to expose this at the queue level by > simply including the queue demand, which is a good indicator of where > resource requests are coming from (at least at the queue level). > This will help to better understand metrics like "containersPending" and > where resource requests are originating.
[jira] [Commented] (YARN-5045) hbase unit tests fail due to dependency issues
[ https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274670#comment-15274670 ] Li Lu commented on YARN-5045: - Thanks [~sjlee0] for fixing this problem! I checked locally and it works. One nit: we may want to make the hadoop version number (2.5.1) in hbase-test's pom a property (set in the same file). Right now there are two occurrences, and we may want to centralize them. Thanks! > hbase unit tests fail due to dependency issues > -- > > Key: YARN-5045 > URL: https://issues.apache.org/jira/browse/YARN-5045 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee >Priority: Blocker > Attachments: YARN-5045-YARN-2928.01.patch, > YARN-5045-YARN-2928.02.patch, YARN-5045-YARN-2928.poc.patch > > > After the 5/4 rebase, the hbase unit tests in the timeline service project > are failing: > {noformat} > org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage > Time elapsed: 5.103 sec <<< ERROR! 
> java.io.IOException: Shutting down > at java.net.URLClassLoader$1.run(URLClassLoader.java:366) > at java.net.URLClassLoader$1.run(URLClassLoader.java:355) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:354) > at java.lang.ClassLoader.loadClass(ClassLoader.java:423) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) > at java.lang.ClassLoader.loadClass(ClassLoader.java:356) > at > org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677) > at > org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546) > at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:500) > at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:104) > at > org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345) > at org.apache.hadoop.hbase.http.InfoServer.(InfoServer.java:77) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:550) > at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:333) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:525) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:217) > at > org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:153) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:213) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:93) > at > 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:978) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:938) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:812) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:750) > at > org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage.setup(TestTimelineReaderWebServicesHBaseStorage.java:87) > {noformat} > The root cause is that the hbase mini server depends on hadoop common's > {{MetricsServlet}} which has been removed in the trunk (HADOOP-12504): > {noformat} > Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/metrics/MetricsServlet > at > org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677) > at > org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546) > at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:500) > at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:104) > at > org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345) > at org.apache.hadoop.hbase.http.InfoServer.(InfoServer.java:77) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697)
[jira] [Updated] (YARN-5056) Expose demand at the queue level for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Weeks updated YARN-5056: --- Attachment: YARN-5056.patch > Expose demand at the queue level for FairScheduler > -- > > Key: YARN-5056 > URL: https://issues.apache.org/jira/browse/YARN-5056 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.7.2 >Reporter: Daniel Weeks >Priority: Minor > Attachments: YARN-5056.patch > > > There currently isn't a way to determine where demand is coming from within > the scheduler. Issues like YARN-2408 proposed a more detailed way to expose > this information, but it's very easy to expose this at the queue level by > simply including the queue demand, which is a good indicator of where > resource requests are coming from (at least at the queue level). > This will help to better understand metrics like "containersPending" and > where resource requests are originating.
[jira] [Commented] (YARN-4280) CapacityScheduler reservations may not prevent indefinite postponement on a busy cluster
[ https://issues.apache.org/jira/browse/YARN-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274661#comment-15274661 ] Wangda Tan commented on YARN-4280: -- [~jlowe], thanks for sharing your thoughts; I agree with your plan for this. > CapacityScheduler reservations may not prevent indefinite postponement on a > busy cluster > > > Key: YARN-4280 > URL: https://issues.apache.org/jira/browse/YARN-4280 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 2.6.1, 2.8.0, 2.7.1 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > > Consider the following scenario: > There are 2 queues, A (25% of the total capacity) and B (75%); both can run at > total cluster capacity. There are 2 applications: appX runs on queue A, > always asking for 1 GB containers (non-AM), and appY runs on queue B, asking for 2 > GB containers. > The user limit is high enough for the applications to reach 100% of the > cluster resource. > appX is running at total cluster capacity, full of 1 GB containers, releasing > only one container at a time. appY comes in with a request for a 2 GB container, > but only 1 GB is free. Ideally, since appY is in the underserved queue, it > has higher priority and should reserve for its 2 GB request. Since this > request puts alloc+reserve above the total capacity of the cluster, the > reservation is not made. appX comes in with a 1 GB request, and since 1 GB is > still available, the request is allocated. > This can continue indefinitely, causing priority inversion.
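The starvation cycle in the issue description can be reproduced with a toy simulation. This is illustrative only, not CapacityScheduler code; capacities are in arbitrary units, and the refusal rule ("no reservation when alloc+reserve would exceed total capacity") is the one the report describes:

```java
public class ReservationStarvationDemo {
    /**
     * Simulates the reported cycle: the cluster (capacity 100) is full of
     * appX's 1-unit containers; each cycle appX releases one container, appY's
     * 2-unit ask cannot fit and cannot reserve (alloc + reserve would exceed
     * capacity), and appX's next 1-unit ask is granted instead.
     * Returns true if appY ever gets scheduled within the given cycles.
     */
    static boolean tryScheduleAppY(int cycles) {
        int capacity = 100;
        int allocated = 100;                  // appX fills the cluster
        for (int cycle = 0; cycle < cycles; cycle++) {
            allocated -= 1;                   // appX releases one container
            int free = capacity - allocated;  // always 1 at this point
            if (free >= 2) {
                return true;                  // appY's 2-unit ask fits
            }
            // Reservation refused: allocated (99) + reserved (2) > capacity (100),
            // so appY cannot hold the partial free space across cycles.
            allocated += 1;                   // appX's 1-unit ask is granted
        }
        return false;
    }

    public static void main(String[] args) {
        // appY is postponed on every cycle: priority inversion.
        System.out.println("appY scheduled: " + tryScheduleAppY(1000)); // prints "appY scheduled: false"
    }
}
```

If appY could instead hold a reservation on the freed unit across cycles, two releases by appX would accumulate and the 2-unit ask would fit, which is the behavior the JIRA argues the reservation logic should permit.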
[jira] [Updated] (YARN-5056) Expose demand at the queue level for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Weeks updated YARN-5056: --- Attachment: YARN-5056.patch > Expose demand at the queue level for FairScheduler > -- > > Key: YARN-5056 > URL: https://issues.apache.org/jira/browse/YARN-5056 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.7.2 >Reporter: Daniel Weeks >Priority: Minor > Attachments: YARN-5056.patch > > > There currently isn't a way to determine where demand is coming from within > the scheduler. Issues like YARN-2408 proposed a more detailed way to expose > this information, but it's very easy to expose this at the queue level by > simply including the queue demand, which is a good indicator of where > resource requests are coming from (at least at the queue level). > This will help to better understand metrics like "containersPending" and > where resource requests are originating.
[jira] [Commented] (YARN-4747) AHS error 500 due to NPE when container start event is missing
[ https://issues.apache.org/jira/browse/YARN-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274612#comment-15274612 ] Hadoop QA commented on YARN-4747: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 44s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 46s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 43s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | 
{color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 54s {color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed with JDK v1.8.0_91. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 29m 19s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 10s {color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 30m 32s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 90m
[jira] [Created] (YARN-5056) Expose demand at the queue level for FairScheduler
Daniel Weeks created YARN-5056: -- Summary: Expose demand at the queue level for FairScheduler Key: YARN-5056 URL: https://issues.apache.org/jira/browse/YARN-5056 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.7.2 Reporter: Daniel Weeks Priority: Minor There currently isn't a way to determine where demand is coming from within the scheduler. Issues like YARN-2408 proposed a more detailed way to expose this information, but it's very easy to expose this at a queue level by simply including the queue demand, which is a good indicator of where resource requests are coming from (at least at the queue level). This will help to better understand metrics like "containersPending" and where resource requests are originating. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-5055) max per user can be larger than max per queue
[ https://issues.apache.org/jira/browse/YARN-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger reassigned YARN-5055: - Assignee: Eric Badger > max per user can be larger than max per queue > - > > Key: YARN-5055 > URL: https://issues.apache.org/jira/browse/YARN-5055 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler, resourcemanager >Affects Versions: 2.7.2 >Reporter: Jason Lowe >Assignee: Eric Badger >Priority: Minor > > If user limit and/or user limit factor are >100% then the calculated maximum > values per user can exceed the maximum for the queue. For example, maximum > AM resource per user could exceed maximum AM resource for the entire queue, > or max applications per user could be larger than max applications for the > queue. The per-user values should be capped by the per queue values.
[jira] [Created] (YARN-5055) max per user can be larger than max per queue
Jason Lowe created YARN-5055: Summary: max per user can be larger than max per queue Key: YARN-5055 URL: https://issues.apache.org/jira/browse/YARN-5055 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, resourcemanager Affects Versions: 2.7.2 Reporter: Jason Lowe Priority: Minor If user limit and/or user limit factor are >100% then the calculated maximum values per user can exceed the maximum for the queue. For example, maximum AM resource per user could exceed maximum AM resource for the entire queue, or max applications per user could be larger than max applications for the queue. The per-user values should be capped by the per queue values.
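The capping the issue asks for can be sketched as plain arithmetic. The following is an illustrative model only, not the actual CapacityScheduler code; the method name, the parameters, and the numbers are hypothetical:

```java
// Illustrative sketch of the per-user capping proposed in YARN-5055;
// not the actual CapacityScheduler code. Names and numbers are hypothetical.
public class UserLimitCapSketch {
    // Per-user maximum derived from the queue maximum, the user limit (%),
    // and the user limit factor. Without the final Math.min, a user limit
    // factor > 1.0 (or a user limit > 100%) lets the per-user value
    // exceed the per-queue maximum -- the bug described above.
    static int maxAppsPerUser(int maxAppsPerQueue, float userLimitPercent,
                              float userLimitFactor) {
        int uncapped = (int) (maxAppsPerQueue * (userLimitPercent / 100.0f)
                              * userLimitFactor);
        return Math.min(uncapped, maxAppsPerQueue);
    }

    public static void main(String[] args) {
        // A 100% user limit with factor 2.0 would otherwise allow 200
        // apps per user in a queue capped at 100.
        System.out.println(maxAppsPerUser(100, 100f, 2.0f)); // prints 100
        System.out.println(maxAppsPerUser(100, 50f, 1.0f));  // prints 50
    }
}
```

The same `Math.min` cap applies analogously to the maximum AM resource per user.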
[jira] [Commented] (YARN-4982) Test timeout :yarn-client testcase timeout and failures
[ https://issues.apache.org/jira/browse/YARN-4982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274586#comment-15274586 ] John Zhuge commented on YARN-4982: -- [~bibinchundatt], in TestAMRMClient.java#732 of patch 0003: Why increase the timeout from 30s to 300s? > Test timeout :yarn-client testcase timeout and failures > --- > > Key: YARN-4982 > URL: https://issues.apache.org/jira/browse/YARN-4982 > Project: Hadoop YARN > Issue Type: Test > Components: test >Reporter: Bibin A Chundatt >Priority: Blocker > Attachments: 0001-YARN-4982.patch, 0002-YARN-4982.patch, > 0003-YARN-4982.patch > > > https://builds.apache.org/job/PreCommit-YARN-Build/11088/testReport/junit/org.apache.hadoop.yarn.client.api.impl/TestAMRMProxy/testAMRMProxyE2E/ > In hadoop-yarn-client package test {{TestAMRMProxy}} testcase timeout always > {noformat} > java.lang.Exception: test timed out after 6 milliseconds > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:160) > at com.sun.proxy.$Proxy85.getNewApplication(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:227) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:235) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMProxy.createApp(TestAMRMProxy.java:367) > at > org.apache.hadoop.yarn.client.api.impl.TestAMRMProxy.testAMRMProxyE2E(TestAMRMProxy.java:110) > {noformat} > Other classes having similar failures > org.apache.hadoop.yarn.client.cli.TestYarnCLI > org.apache.hadoop.yarn.client.api.impl.TestYarnClient > org.apache.hadoop.yarn.client.api.impl.TestAMRMClient > org.apache.hadoop.yarn.client.api.impl.TestNMClien
[jira] [Commented] (YARN-4351) Tests in h.y.c.TestGetGroups get failed on trunk
[ https://issues.apache.org/jira/browse/YARN-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274502#comment-15274502 ] John Zhuge commented on YARN-4351: -- Dup to YARN-4982? Its patch 0003 fixes TestGetGroups failure. I am not sure whether it is the same failure because the test report link in Description is no longer valid. > Tests in h.y.c.TestGetGroups get failed on trunk > > > Key: YARN-4351 > URL: https://issues.apache.org/jira/browse/YARN-4351 > Project: Hadoop YARN > Issue Type: Test > Components: test >Reporter: Junping Du > > From test report: > https://builds.apache.org/job/PreCommit-YARN-Build/9661/testReport/, we can > see there are several test failures for TestGetGroups.
[jira] [Commented] (YARN-5034) Failing tests after using try-with-resources
[ https://issues.apache.org/jira/browse/YARN-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274481#comment-15274481 ] John Zhuge commented on YARN-5034: -- A dup of YARN-4982. To confirm, run the tests on a clean trunk. > Failing tests after using try-with-resources > > > Key: YARN-5034 > URL: https://issues.apache.org/jira/browse/YARN-5034 > Project: Hadoop YARN > Issue Type: Improvement > Components: test >Affects Versions: 2.7.0 >Reporter: Andras Bokor >Assignee: Andras Bokor >Priority: Trivial > Fix For: 2.7.0 > > Attachments: YARN-5034.01.patch, YARN-5034.02.patch, > YARN-5034.03.patch, YARN-5034.04.patch, YARN-5034.05.patch, > YARN-5034.06.patch, YARN-5034.07.patch, YARN-5034.08.patch, > YARN-5034.09.patch, YARN-5034.10.patch, YARN-5034.11.patch, > YARN-5034.12.patch, YARN-5034.13.patch > > > This JIRA is for following up failing tests. I am not able to reproduce locally > on either mac or CentOS.
[jira] [Commented] (YARN-4913) Yarn logs should take a -out option to write to a directory
[ https://issues.apache.org/jira/browse/YARN-4913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274464#comment-15274464 ] Xuan Gong commented on YARN-4913: - Thanks for the explanation. Uploaded a new patch. This patch depends on YARN-4842. > Yarn logs should take a -out option to write to a directory > --- > > Key: YARN-4913 > URL: https://issues.apache.org/jira/browse/YARN-4913 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4913.1.patch, YARN-4913.2.patch, YARN-4913.3.patch > >
[jira] [Updated] (YARN-4913) Yarn logs should take a -out option to write to a directory
[ https://issues.apache.org/jira/browse/YARN-4913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-4913: Attachment: YARN-4913.3.patch > Yarn logs should take a -out option to write to a directory > --- > > Key: YARN-4913 > URL: https://issues.apache.org/jira/browse/YARN-4913 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4913.1.patch, YARN-4913.2.patch, YARN-4913.3.patch > >
[jira] [Commented] (YARN-4994) Use MiniYARNCluster with try-with-resources in tests
[ https://issues.apache.org/jira/browse/YARN-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274437#comment-15274437 ] John Zhuge commented on YARN-4994: -- YARN-4982 "Test timeout :yarn-client testcase timeout and failures" seems to have a fix for these failures. > Use MiniYARNCluster with try-with-resources in tests > > > Key: YARN-4994 > URL: https://issues.apache.org/jira/browse/YARN-4994 > Project: Hadoop YARN > Issue Type: Improvement > Components: test >Affects Versions: 2.7.0 >Reporter: Andras Bokor >Assignee: Andras Bokor >Priority: Trivial > Fix For: 2.7.0 > > Attachments: HDFS-10287.01.patch, HDFS-10287.02.patch, > HDFS-10287.03.patch > > > In tests MiniYARNCluster is used with the following pattern: > In try-catch block create a MiniYARNCluster instance and in finally block > close it. > [Try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html] > is preferred since Java7 instead of the pattern above.
[jira] [Commented] (YARN-4994) Use MiniYARNCluster with try-with-resources in tests
[ https://issues.apache.org/jira/browse/YARN-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274420#comment-15274420 ] John Zhuge commented on YARN-4994: -- TestGetGroups is known to fail: YARN-4351. Tests in h.y.c.TestGetGroups get failed on trunk. > Use MiniYARNCluster with try-with-resources in tests > > > Key: YARN-4994 > URL: https://issues.apache.org/jira/browse/YARN-4994 > Project: Hadoop YARN > Issue Type: Improvement > Components: test >Affects Versions: 2.7.0 >Reporter: Andras Bokor >Assignee: Andras Bokor >Priority: Trivial > Fix For: 2.7.0 > > Attachments: HDFS-10287.01.patch, HDFS-10287.02.patch, > HDFS-10287.03.patch > > > In tests MiniYARNCluster is used with the following pattern: > In try-catch block create a MiniYARNCluster instance and in finally block > close it. > [Try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html] > is preferred since Java7 instead of the pattern above.
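The refactor YARN-4994 describes — replacing create-in-try/close-in-finally with try-with-resources — can be sketched with a stand-in resource class, since MiniYARNCluster itself needs a full Hadoop test classpath. `FakeCluster` is a hypothetical stand-in, not a Hadoop class:

```java
// Sketch of the pattern change described above. FakeCluster is a
// hypothetical stand-in for MiniYARNCluster.
class FakeCluster implements AutoCloseable {
    boolean closed = false;

    @Override
    public void close() { closed = true; }
}

public class TryWithResourcesSketch {
    // Old pattern: create inside a try block, close in finally.
    static FakeCluster oldStyle() {
        FakeCluster cluster = null;
        try {
            cluster = new FakeCluster();
            // ... test body ...
        } finally {
            if (cluster != null) {
                cluster.close();
            }
        }
        return cluster;
    }

    // Preferred since Java 7: close() is invoked automatically when
    // the try block exits, even if the test body throws.
    static FakeCluster newStyle() {
        try (FakeCluster cluster = new FakeCluster()) {
            // ... test body ...
            return cluster;
        }
    }
}
```

Both variants close the cluster on every path; try-with-resources just removes the boilerplate and the risk of forgetting the null check.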
[jira] [Commented] (YARN-5045) hbase unit tests fail due to dependency issues
[ https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274396#comment-15274396 ] Varun Saxena commented on YARN-5045: Thanks [~sjlee0] for the patch. I too think the mapreduce jars may never be used at runtime by our tests. My main concern was with hadoop-auth, which I wasn't quite sure of. You have anyway fixed that in the v.2 version of the patch. The latest patch LGTM pending Jenkins. I will commit it later today. > hbase unit tests fail due to dependency issues > -- > > Key: YARN-5045 > URL: https://issues.apache.org/jira/browse/YARN-5045 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee >Priority: Blocker > Attachments: YARN-5045-YARN-2928.01.patch, > YARN-5045-YARN-2928.02.patch, YARN-5045-YARN-2928.poc.patch > > > After the 5/4 rebase, the hbase unit tests in the timeline service project > are failing: > {noformat} > org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage > Time elapsed: 5.103 sec <<< ERROR! 
> java.io.IOException: Shutting down > at java.net.URLClassLoader$1.run(URLClassLoader.java:366) > at java.net.URLClassLoader$1.run(URLClassLoader.java:355) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:354) > at java.lang.ClassLoader.loadClass(ClassLoader.java:423) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) > at java.lang.ClassLoader.loadClass(ClassLoader.java:356) > at > org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677) > at > org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546) > at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:500) > at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:104) > at > org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345) > at org.apache.hadoop.hbase.http.InfoServer.<init>(InfoServer.java:77) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:550) > at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:333) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:525) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:217) > at > org.apache.hadoop.hbase.LocalHBaseCluster.<init>(LocalHBaseCluster.java:153) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:213) > at > org.apache.hadoop.hbase.MiniHBaseCluster.<init>(MiniHBaseCluster.java:93) > at > 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:978) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:938) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:812) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:750) > at > org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage.setup(TestTimelineReaderWebServicesHBaseStorage.java:87) > {noformat} > The root cause is that the hbase mini server depends on hadoop common's > {{MetricsServlet}} which has been removed in the trunk (HADOOP-12504): > {noformat} > Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/metrics/MetricsServlet > at > org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677) > at > org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546) > at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:500) > at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:104) > at > org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345) > at org.apache.hadoop.hbase.http.InfoServer.<init>(InfoServer.java:77) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(
[jira] [Updated] (YARN-5045) hbase unit tests fail due to dependency issues
[ https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-5045: -- Attachment: YARN-5045-YARN-2928.02.patch Posted patch v.2. Added hadoop-auth as another artifact to enforce the (old) version. > hbase unit tests fail due to dependency issues > -- > > Key: YARN-5045 > URL: https://issues.apache.org/jira/browse/YARN-5045 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee >Priority: Blocker > Attachments: YARN-5045-YARN-2928.01.patch, > YARN-5045-YARN-2928.02.patch, YARN-5045-YARN-2928.poc.patch > > > After the 5/4 rebase, the hbase unit tests in the timeline service project > are failing: > {noformat} > org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage > Time elapsed: 5.103 sec <<< ERROR! > java.io.IOException: Shutting down > at java.net.URLClassLoader$1.run(URLClassLoader.java:366) > at java.net.URLClassLoader$1.run(URLClassLoader.java:355) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:354) > at java.lang.ClassLoader.loadClass(ClassLoader.java:423) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) > at java.lang.ClassLoader.loadClass(ClassLoader.java:356) > at > org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677) > at > org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546) > at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:500) > at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:104) > at > org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345) > at org.apache.hadoop.hbase.http.InfoServer.<init>(InfoServer.java:77) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697) > at > 
org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:550) > at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:333) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:525) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:217) > at > org.apache.hadoop.hbase.LocalHBaseCluster.<init>(LocalHBaseCluster.java:153) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:213) > at > org.apache.hadoop.hbase.MiniHBaseCluster.<init>(MiniHBaseCluster.java:93) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:978) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:938) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:812) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:750) > at > org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage.setup(TestTimelineReaderWebServicesHBaseStorage.java:87) > {noformat} > The root cause is that the hbase mini server depends on hadoop common's > {{MetricsServlet}} which has been removed in the trunk (HADOOP-12504): > {noformat} > Caused by: java.lang.NoClassDefFoundError: > org/apache/hadoop/metrics/MetricsServlet > at > org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677) > at > org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546) > 
at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:500) > at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:104) > at > org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345) > at org.apache.hadoop.hbase.http.InfoServer.<init>(InfoServer.java:77) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:550) > at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:333) > at sun.reflect.NativeConstructo
[jira] [Commented] (YARN-2888) Corrective mechanisms for rebalancing NM container queues
[ https://issues.apache.org/jira/browse/YARN-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274352#comment-15274352 ] Hadoop QA commented on YARN-2888: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 48s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 41s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 1s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 53s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common in trunk has 3 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 48s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 45s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 38s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 38s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 45s {color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 45 new + 491 unchanged - 13 fixed = 536 total (was 504) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace 
issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 38s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 3s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 39s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 29s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 31s {color} | {color:green} hadoop-yarn-server-common in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 24s {color}
[jira] [Resolved] (YARN-5054) Remove redundant TestMiniDFSCluster.testDualClusters
[ https://issues.apache.org/jira/browse/YARN-5054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge resolved YARN-5054. -- Resolution: Invalid Move to HDFS project. > Remove redundant TestMiniDFSCluster.testDualClusters > > > Key: YARN-5054 > URL: https://issues.apache.org/jira/browse/YARN-5054 > Project: Hadoop YARN > Issue Type: Test > Components: test >Affects Versions: 2.6.0 >Reporter: John Zhuge >Priority: Trivial > Labels: newbie > > Unit test {{TestMiniDFSCluster.testDualClusters}} is redundant because > {{testClusterWithoutSystemProperties}} already proves > {{cluster.getDataDirectory() == getProp(HDFS_MINIDFS_BASEDIR) + "/data"}}. > This unit test sets HDFS_MINIDFS_BASEDIR to 2 different values and brings up > 2 clusters, so of course they will have different data directories. > Remove it to save the time to bring up 2 mini clusters.
[jira] [Created] (YARN-5054) Remove redundant TestMiniDFSCluster.testDualClusters
John Zhuge created YARN-5054: Summary: Remove redundant TestMiniDFSCluster.testDualClusters Key: YARN-5054 URL: https://issues.apache.org/jira/browse/YARN-5054 Project: Hadoop YARN Issue Type: Test Components: test Affects Versions: 2.6.0 Reporter: John Zhuge Priority: Trivial Unit test {{TestMiniDFSCluster.testDualClusters}} is redundant because {{testClusterWithoutSystemProperties}} already proves {{cluster.getDataDirectory() == getProp(HDFS_MINIDFS_BASEDIR) + "/data"}}. This unit test sets HDFS_MINIDFS_BASEDIR to 2 different values and brings up 2 clusters, so of course they will have different data directories. Remove it to save the time to bring up 2 mini clusters.
[jira] [Commented] (YARN-4994) Use MiniYARNCluster with try-with-resources in tests
[ https://issues.apache.org/jira/browse/YARN-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274332#comment-15274332 ] John Zhuge commented on YARN-4994: -- Could you rename the patches from HDFS-10287.* to YARN-4994.*? Let me try patch 03 again. > Use MiniYARNCluster with try-with-resources in tests > > > Key: YARN-4994 > URL: https://issues.apache.org/jira/browse/YARN-4994 > Project: Hadoop YARN > Issue Type: Improvement > Components: test >Affects Versions: 2.7.0 >Reporter: Andras Bokor >Assignee: Andras Bokor >Priority: Trivial > Fix For: 2.7.0 > > Attachments: HDFS-10287.01.patch, HDFS-10287.02.patch, > HDFS-10287.03.patch > > > In tests MiniYARNCluster is used with the following pattern: > In try-catch block create a MiniYARNCluster instance and in finally block > close it. > [Try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html] > is preferred since Java7 instead of the pattern above.
[jira] [Assigned] (YARN-5053) More informative diagnostics when applications killed by a user
[ https://issues.apache.org/jira/browse/YARN-5053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger reassigned YARN-5053: - Assignee: Eric Badger > More informative diagnostics when applications killed by a user > --- > > Key: YARN-5053 > URL: https://issues.apache.org/jira/browse/YARN-5053 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Jason Lowe >Assignee: Eric Badger > > When an application kill request is processed by the ClientRMService it sets > the diagnostics to "Application killed by user". It would be nice to report > the user and host that issued the kill request in the app diagnostics so it > is clear where the kill originated. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5053) More informative diagnostics when applications killed by a user
Jason Lowe created YARN-5053: Summary: More informative diagnostics when applications killed by a user Key: YARN-5053 URL: https://issues.apache.org/jira/browse/YARN-5053 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Jason Lowe When an application kill request is processed by the ClientRMService it sets the diagnostics to "Application killed by user". It would be nice to report the user and host that issued the kill request in the app diagnostics so it is clear where the kill originated. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
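The improvement requested in YARN-5053 amounts to enriching the diagnostics string with the caller's identity. A minimal sketch, assuming the message is built from the requesting user and host (the method name and message format below are illustrative, not the committed ClientRMService change):

```java
public class KillDiagnosticsSketch {
    // Builds a kill-diagnostics message that records who issued the kill
    // and from where, instead of the bare "Application killed by user".
    static String buildKillDiagnostics(String user, String host) {
        return "Application killed by user " + user + " at " + host;
    }

    public static void main(String[] args) {
        // "alice" and "gw1.example.com" are made-up example values.
        System.out.println(buildKillDiagnostics("alice", "gw1.example.com"));
    }
}
```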
[jira] [Commented] (YARN-5048) DelegationTokenRenewer#skipTokenRenewal may throw NPE
[ https://issues.apache.org/jira/browse/YARN-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274281#comment-15274281 ] Yongjun Zhang commented on YARN-5048: - Thanks [~jianhe] for the rev. +1. > DelegationTokenRenewer#skipTokenRenewal may throw NPE > -- > > Key: YARN-5048 > URL: https://issues.apache.org/jira/browse/YARN-5048 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-5048.1.patch, YARN-5048.2.patch > > > {{((Token)token).decodeIdentifier()}} may > throw NPE if RM does not have the corresponding token kind class. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
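The NPE pattern behind YARN-5048 can be sketched in isolation: a decode step returns null when the token kind's identifier class is unknown, so the result must be null-checked before use. The decode() helper and Identifier class below are hypothetical stand-ins, not the Hadoop Token API:

```java
public class SkipTokenRenewalSketch {
    static class Identifier {
        String getKind() { return "HDFS_DELEGATION_TOKEN"; }
    }

    // Returns null for unknown kinds, mimicking how decodeIdentifier()
    // behaves when the identifier class is not on the classpath.
    static Identifier decode(String kind) {
        return "known".equals(kind) ? new Identifier() : null;
    }

    static boolean skipTokenRenewal(String kind) {
        Identifier id = decode(kind);
        // Without this null check, id.getKind() below would throw NPE
        // for unknown token kinds -- the bug described in the issue.
        if (id == null) {
            return false;
        }
        return id.getKind() != null;
    }

    public static void main(String[] args) {
        System.out.println(skipTokenRenewal("known"));   // true
        System.out.println(skipTokenRenewal("unknown")); // false, no NPE
    }
}
```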
[jira] [Commented] (YARN-5045) hbase unit tests fail due to dependency issues
[ https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274235#comment-15274235 ] Sangjin Lee commented on YARN-5045: --- Thanks for the comments [~varun_saxena]. I did notice those when I was working on the patch. First, the yarn dependencies should be fine. HBase does not depend on yarn, and including yarn 3.0.0 dependencies is orthogonal. What matters more is the hdfs/common dependencies, as they are shared between the timeline service unit tests and hbase. We could try to enforce 2.5.1 on the mapreduce job client, but this has the potential of making the pom much bigger. Since all hadoop dependency versions are managed via dependency management (hadoop-project/pom.xml), we need to exclude it first and declare a new dependency. But we may need to declare its dependencies too (as they are also managed, and without that 3.0.0 will be used, etc.). Mapreduce-client is near the top of the dependency chain, and the number of dependencies that need to be replaced in this manner will be large. I haven't tried it, but that's my suspicion. My guess is that our unit tests are not exercising the mapreduce job client, and that's probably why there are no issues. Let me know what you think. > hbase unit tests fail due to dependency issues > -- > > Key: YARN-5045 > URL: https://issues.apache.org/jira/browse/YARN-5045 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee >Priority: Blocker > Attachments: YARN-5045-YARN-2928.01.patch, > YARN-5045-YARN-2928.poc.patch > > > After the 5/4 rebase, the hbase unit tests in the timeline service project > are failing: > {noformat} > org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage > Time elapsed: 5.103 sec <<< ERROR!
> java.io.IOException: Shutting down > at java.net.URLClassLoader$1.run(URLClassLoader.java:366) > at java.net.URLClassLoader$1.run(URLClassLoader.java:355) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:354) > at java.lang.ClassLoader.loadClass(ClassLoader.java:423) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) > at java.lang.ClassLoader.loadClass(ClassLoader.java:356) > at > org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677) > at > org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546) > at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:500) > at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:104) > at > org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345) > at org.apache.hadoop.hbase.http.InfoServer.(InfoServer.java:77) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:550) > at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:333) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:525) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:217) > at > org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:153) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:213) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:93) > at > 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:978) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:938) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:812) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:750) > at > org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage.setup(TestTimelineReaderWebServicesHBaseStorage.java:87) > {noformat} > The root cause is that the hbase mini server depends on hadoop common's > {{MetricsSer
[jira] [Commented] (YARN-4325) Purge app state from NM state-store should cover more LOG_HANDLING cases
[ https://issues.apache.org/jira/browse/YARN-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274198#comment-15274198 ] Jason Lowe commented on YARN-4325: -- Thanks for the patch! For AppCompletelyDoneTransition it seems a little odd that we fire off a log event and then remove the app that goes with that event. I would expect we would either wait for the log finished/failed event and clean up in the FINISHED state like we normally do, or not send the log event at all since we already know it is failed/done. Is the better fix to have the log aggregators properly return a log-failed event in these cases? > Purge app state from NM state-store should cover more LOG_HANDLING cases > > > Key: YARN-4325 > URL: https://issues.apache.org/jira/browse/YARN-4325 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Attachments: ApplicationImpl.PNG, YARN-4325-v1.1.patch, > YARN-4325-v1.patch, YARN-4325.patch > > > From a long-running cluster, we found tens of thousands of stale apps still > being recovered in NM restart recovery. > After investigating, there are three issues that cause app state to leak in > the NM state-store: > 1. APPLICATION_LOG_HANDLING_FAILED is not handled with removing the app in > NMStateStore. > 2. The APPLICATION_LOG_HANDLING_FAILED event is not sent when the > aggregator's doAppLogAggregation() hits an exception. > 3. Only an Application in FINISHED status that receives APPLICATION_LOG_FINISHED > transitions to remove the app from the NM state store. An Application in another > status - like APPLICATION_RESOURCES_CLEANUP - will ignore the event and later > forget to remove the app from the NM state store even after the app finishes. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4325) Purge app state from NM state-store should cover more LOG_HANDLING cases
[ https://issues.apache.org/jira/browse/YARN-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274087#comment-15274087 ] Junping Du commented on YARN-4325: -- Can someone on the watch list review the patch here? Thanks! > Purge app state from NM state-store should cover more LOG_HANDLING cases > > > Key: YARN-4325 > URL: https://issues.apache.org/jira/browse/YARN-4325 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Attachments: ApplicationImpl.PNG, YARN-4325-v1.1.patch, > YARN-4325-v1.patch, YARN-4325.patch > > > From a long-running cluster, we found tens of thousands of stale apps still > being recovered in NM restart recovery. > After investigating, there are three issues that cause app state to leak in > the NM state-store: > 1. APPLICATION_LOG_HANDLING_FAILED is not handled with removing the app in > NMStateStore. > 2. The APPLICATION_LOG_HANDLING_FAILED event is not sent when the > aggregator's doAppLogAggregation() hits an exception. > 3. Only an Application in FINISHED status that receives APPLICATION_LOG_FINISHED > transitions to remove the app from the NM state store. An Application in another > status - like APPLICATION_RESOURCES_CLEANUP - will ignore the event and later > forget to remove the app from the NM state store even after the app finishes. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4325) Purge app state from NM state-store should cover more LOG_HANDLING cases
[ https://issues.apache.org/jira/browse/YARN-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274045#comment-15274045 ] Hadoop QA commented on YARN-4325: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 52s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 11m 28s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_91. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 11m 48s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 39m 39s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:cf2ee45 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12802662/YARN-4325-v1.1.patch | | JIRA Issue | YARN-4325 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux a233c450cb8c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 2835f14 | | Default Java | 1.7.0_95 | | Multi-JDK ve
[jira] [Commented] (YARN-5020) Fix Documentation for Yarn Capacity Scheduler on Resource Calculator
[ https://issues.apache.org/jira/browse/YARN-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274060#comment-15274060 ] Hadoop QA commented on YARN-5020: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 7m 42s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:cf2ee45 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12802665/YARN-5020.1.patch | | JIRA Issue | YARN-5020 | | Optional Tests | asflicense mvnsite | | uname | Linux bc7071962f0d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 2835f14 | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/11362/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > Fix Documentation for Yarn Capacity Scheduler on Resource Calculator > > > Key: YARN-5020 > URL: https://issues.apache.org/jira/browse/YARN-5020 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jo Desmet >Assignee: Takashi Ohnishi >Priority: Minor > Attachments: YARN-5020.1.patch > > > Documentation refers to 'DefaultResourseCalculator' - which is spelled > incorrectly. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5020) Fix Documentation for Yarn Capacity Scheduler on Resource Calculator
[ https://issues.apache.org/jira/browse/YARN-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takashi Ohnishi updated YARN-5020: -- Attachment: YARN-5020.1.patch > Fix Documentation for Yarn Capacity Scheduler on Resource Calculator > > > Key: YARN-5020 > URL: https://issues.apache.org/jira/browse/YARN-5020 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jo Desmet >Priority: Minor > Attachments: YARN-5020.1.patch > > > Documentation refers to 'DefaultResourseCalculator' - which is spelled > incorrectly. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5020) Fix Documentation for Yarn Capacity Scheduler on Resource Calculator
[ https://issues.apache.org/jira/browse/YARN-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273999#comment-15273999 ] Takashi Ohnishi commented on YARN-5020: --- Hi. I have created a patch for this typo. > Fix Documentation for Yarn Capacity Scheduler on Resource Calculator > > > Key: YARN-5020 > URL: https://issues.apache.org/jira/browse/YARN-5020 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jo Desmet >Priority: Minor > > Documentation refers to 'DefaultResourseCalculator' - which is spelled > incorrectly. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4325) Purge app state from NM state-store should cover more LOG_HANDLING cases
[ https://issues.apache.org/jira/browse/YARN-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-4325: - Attachment: YARN-4325-v1.1.patch Update to v1.1 patch to fix javac, checkstyle and whitespace warnings. > Purge app state from NM state-store should cover more LOG_HANDLING cases > > > Key: YARN-4325 > URL: https://issues.apache.org/jira/browse/YARN-4325 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Attachments: ApplicationImpl.PNG, YARN-4325-v1.1.patch, > YARN-4325-v1.patch, YARN-4325.patch > > > From a long-running cluster, we found tens of thousands of stale apps still > being recovered in NM restart recovery. > After investigating, there are three issues that cause app state to leak in > the NM state-store: > 1. APPLICATION_LOG_HANDLING_FAILED is not handled with removing the app in > NMStateStore. > 2. The APPLICATION_LOG_HANDLING_FAILED event is not sent when the > aggregator's doAppLogAggregation() hits an exception. > 3. Only an Application in FINISHED status that receives APPLICATION_LOG_FINISHED > transitions to remove the app from the NM state store. An Application in another > status - like APPLICATION_RESOURCES_CLEANUP - will ignore the event and later > forget to remove the app from the NM state store even after the app finishes. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5039) Applications ACCEPTED but not starting
[ https://issues.apache.org/jira/browse/YARN-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273984#comment-15273984 ] Jason Lowe commented on YARN-5039: -- bq. scheduler will not assign containers to decommissioning nodes, that could be the reason why your applications stay at ACCEPTED state. When I saw those log messages I immediately thought that was the case, but I couldn't see any of the three completely empty nodes in the list of nodes that supposedly were decommissioning. In addition the debug logs clearly show the nodes are heartbeating in, the nodes page shows the RM thinks the nodes have 256GB available, and as Miles mentioned the nodes are immediately used when the second app's AM finally starts. Therefore I don't think this is related to node decommissioning unless the Amazon node decommissioning logic is very bizarre and somehow tied to when applications start. > Applications ACCEPTED but not starting > -- > > Key: YARN-5039 > URL: https://issues.apache.org/jira/browse/YARN-5039 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.2 >Reporter: Miles Crawford > Attachments: Screen Shot 2016-05-04 at 1.57.19 PM.png, Screen Shot > 2016-05-04 at 2.41.22 PM.png, queue-config.log, > resource-manager-application-starts.log.gz, > yarn-yarn-resourcemanager-ip-10-12-47-144.log.gz > > > Often when we submit applications to an incompletely utilized cluster, they > sit, unable to start for no apparent reason. > There are multiple nodes in the cluster with available resources, but the > resourcemanager logs show that scheduling is being skipped. The scheduling is > skipped because the application itself has reserved the node?
I'm not sure > how to interpret this log output: > {code} > 2016-05-04 20:19:21,315 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler > (ResourceManager Event Processor): Trying to fulfill reservation for > application application_1462291866507_0025 on node: > ip-10-12-43-54.us-west-2.compute.internal:8041 > 2016-05-04 20:19:21,316 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue > (ResourceManager Event Processor): Reserved container > application=application_1462291866507_0025 resource= > queue=default: capacity=1.0, absoluteCapacity=1.0, > usedResources=, usedCapacity=0.7126589, > absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 > usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= vCores:33> cluster= > 2016-05-04 20:19:21,316 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler > (ResourceManager Event Processor): Skipping scheduling since node > ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application > appattempt_1462291866507_0025_01 > 2016-05-04 20:19:22,232 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler > (ResourceManager Event Processor): Trying to fulfill reservation for > application application_1462291866507_0025 on node: > ip-10-12-43-53.us-west-2.compute.internal:8041 > 2016-05-04 20:19:22,232 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue > (ResourceManager Event Processor): Reserved container > application=application_1462291866507_0025 resource= > queue=default: capacity=1.0, absoluteCapacity=1.0, > usedResources=, usedCapacity=0.7126589, > absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 > usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= vCores:33> cluster= > 2016-05-04 20:19:22,232 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler > (ResourceManager Event 
Processor): Skipping scheduling since node > ip-10-12-43-53.us-west-2.compute.internal:8041 is reserved by application > appattempt_1462291866507_0025_01 > 2016-05-04 20:19:22,316 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler > (ResourceManager Event Processor): Trying to fulfill reservation for > application application_1462291866507_0025 on node: > ip-10-12-43-54.us-west-2.compute.internal:8041 > 2016-05-04 20:19:22,316 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue > (ResourceManager Event Processor): Reserved container > application=application_1462291866507_0025 resource= > queue=default: capacity=1.0, absoluteCapacity=1.0, > usedResources=, usedCapacity=0.7126589, > absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 > usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= vCores:33> cluster= > 2016-05-04 20:19:22,316 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler > (ResourceManager Event Processor): Skipping schedu
[jira] [Commented] (YARN-5039) Applications ACCEPTED but not starting
[ https://issues.apache.org/jira/browse/YARN-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273976#comment-15273976 ] Junping Du commented on YARN-5039: -- Hi [~milesc], DecommissioningNodesWatcher is a new class introduced in YARN-4676, which is still in the review process. Which branch is your test based on? > Applications ACCEPTED but not starting > -- > > Key: YARN-5039 > URL: https://issues.apache.org/jira/browse/YARN-5039 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.2 >Reporter: Miles Crawford > Attachments: Screen Shot 2016-05-04 at 1.57.19 PM.png, Screen Shot > 2016-05-04 at 2.41.22 PM.png, queue-config.log, > resource-manager-application-starts.log.gz, > yarn-yarn-resourcemanager-ip-10-12-47-144.log.gz > > > Often when we submit applications to an incompletely utilized cluster, they > sit, unable to start for no apparent reason. > There are multiple nodes in the cluster with available resources, but the > resourcemanager logs show that scheduling is being skipped. The scheduling is > skipped because the application itself has reserved the node?
I'm not sure > how to interpret this log output: > {code} > 2016-05-04 20:19:21,315 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler > (ResourceManager Event Processor): Trying to fulfill reservation for > application application_1462291866507_0025 on node: > ip-10-12-43-54.us-west-2.compute.internal:8041 > 2016-05-04 20:19:21,316 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue > (ResourceManager Event Processor): Reserved container > application=application_1462291866507_0025 resource= > queue=default: capacity=1.0, absoluteCapacity=1.0, > usedResources=, usedCapacity=0.7126589, > absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 > usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= vCores:33> cluster= > 2016-05-04 20:19:21,316 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler > (ResourceManager Event Processor): Skipping scheduling since node > ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application > appattempt_1462291866507_0025_01 > 2016-05-04 20:19:22,232 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler > (ResourceManager Event Processor): Trying to fulfill reservation for > application application_1462291866507_0025 on node: > ip-10-12-43-53.us-west-2.compute.internal:8041 > 2016-05-04 20:19:22,232 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue > (ResourceManager Event Processor): Reserved container > application=application_1462291866507_0025 resource= > queue=default: capacity=1.0, absoluteCapacity=1.0, > usedResources=, usedCapacity=0.7126589, > absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 > usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= vCores:33> cluster= > 2016-05-04 20:19:22,232 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler > (ResourceManager Event 
Processor): Skipping scheduling since node > ip-10-12-43-53.us-west-2.compute.internal:8041 is reserved by application > appattempt_1462291866507_0025_01 > 2016-05-04 20:19:22,316 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler > (ResourceManager Event Processor): Trying to fulfill reservation for > application application_1462291866507_0025 on node: > ip-10-12-43-54.us-west-2.compute.internal:8041 > 2016-05-04 20:19:22,316 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue > (ResourceManager Event Processor): Reserved container > application=application_1462291866507_0025 resource= > queue=default: capacity=1.0, absoluteCapacity=1.0, > usedResources=, usedCapacity=0.7126589, > absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 > usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= vCores:33> cluster= > 2016-05-04 20:19:22,316 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler > (ResourceManager Event Processor): Skipping scheduling since node > ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application > appattempt_1462291866507_0025_01 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5045) hbase unit tests fail due to dependency issues
[ https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273960#comment-15273960 ] Varun Saxena commented on YARN-5045: [~sjlee0], thanks for the patch. The patch looks fine. I ran the test cases and they all passed. One question, though. When I printed the dependency tree for the timelineservice hbase-tests project, I still see some 3.0.0-SNAPSHOT jars being taken. A few should be taken anyway, such as hadoop-yarn-api, hadoop-yarn-common, etc., but some seem to be pulled in unnecessarily/wrongly. For instance, hadoop-common is pulling in the hadoop-auth 3.0.0-SNAPSHOT jar. Shouldn't this be 2.5.1 too? To avoid later surprises, we can set it to 2.5.1 and exclude wherever 3.0.0-SNAPSHOT is being pulled in. However, it is possible hadoop-auth may not be used at runtime. Also, I see some mapreduce 3.0.0-SNAPSHOT jars being pulled in via hbase dependencies. I am not sure how mapreduce jars would be useful in our test flow at runtime. Anyway, should we enforce the 2.5.1 version on them too, or exclude them? I did try excluding the mapreduce jars and setting hadoop-auth to 2.5.1. Tests passed even after these changes.
{noformat}
[INFO]
[INFO] Building Apache Hadoop YARN Timeline Service HBase tests 3.0.0-SNAPSHOT
[INFO]
[INFO]
[INFO] --- maven-dependency-plugin:2.2:tree (default-cli) @ hadoop-yarn-server-timelineservice-hbase-tests ---
[INFO] org.apache.hadoop:hadoop-yarn-server-timelineservice-hbase-tests:jar:3.0.0-SNAPSHOT
[INFO] +- junit:junit:jar:4.11:test
[INFO] |  \- org.hamcrest:hamcrest-core:jar:1.3:test
[INFO] +- org.apache.hadoop:hadoop-yarn-server-timelineservice:jar:3.0.0-SNAPSHOT:test
[INFO] |  +- org.apache.hadoop:hadoop-annotations:jar:3.0.0-SNAPSHOT:test
[INFO] |  |  \- jdk.tools:jdk.tools:jar:1.7:system
[INFO] |  +- com.google.inject:guice:jar:3.0:test
[INFO] |  |  +- javax.inject:javax.inject:jar:1:test
[INFO] |  |  \- aopalliance:aopalliance:jar:1.0:test
[INFO] |  +- javax.servlet:servlet-api:jar:2.5:test
[INFO] |  +- javax.xml.bind:jaxb-api:jar:2.2.2:test
[INFO] |  |  +- javax.xml.stream:stax-api:jar:1.0-2:test
[INFO] |  |  \- javax.activation:activation:jar:1.1:test
[INFO] |  +- commons-cli:commons-cli:jar:1.2:test
[INFO] |  +- commons-lang:commons-lang:jar:2.6:test
[INFO] |  +- commons-logging:commons-logging:jar:1.1.3:test
[INFO] |  +- org.apache.commons:commons-csv:jar:1.0:test
[INFO] |  +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:test
[INFO] |  \- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:test
[INFO] +- org.apache.hadoop:hadoop-common:jar:2.5.1:test
[INFO] |  +- org.apache.commons:commons-math3:jar:3.1.1:test
[INFO] |  +- xmlenc:xmlenc:jar:0.52:test
[INFO] |  +- commons-httpclient:commons-httpclient:jar:3.1:test
[INFO] |  +- commons-codec:commons-codec:jar:1.4:test
[INFO] |  +- commons-net:commons-net:jar:3.1:test
[INFO] |  +- commons-collections:commons-collections:jar:3.2.2:test
[INFO] |  +- org.mortbay.jetty:jetty:jar:6.1.26:test
[INFO] |  +- org.mortbay.jetty:jetty-util:jar:6.1.26:test
[INFO] |  +- com.sun.jersey:jersey-json:jar:1.9:test
[INFO] |  |  \- com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:test
[INFO] |  +- com.sun.jersey:jersey-server:jar:1.9:test
[INFO] |  |  \- asm:asm:jar:3.2:test (version managed from 3.1)
[INFO] |  +- tomcat:jasper-compiler:jar:5.5.23:test
[INFO] |  +- tomcat:jasper-runtime:jar:5.5.23:test
[INFO] |  +- javax.servlet.jsp:jsp-api:jar:2.1:test
[INFO] |  +- commons-el:commons-el:jar:1.0:test
[INFO] |  +- log4j:log4j:jar:1.2.17:test
[INFO] |  +- net.java.dev.jets3t:jets3t:jar:0.9.0:test
[INFO] |  |  +- org.apache.httpcomponents:httpcore:jar:4.2.5:test
[INFO] |  |  \- com.jamesmurty.utils:java-xmlbuilder:jar:0.4:test
[INFO] |  +- commons-configuration:commons-configuration:jar:1.6:test
[INFO] |  |  +- commons-digester:commons-digester:jar:1.8:test
[INFO] |  |  |  \- commons-beanutils:commons-beanutils:jar:1.7.0:test
[INFO] |  |  \- commons-beanutils:commons-beanutils-core:jar:1.8.0:test
[INFO] |  +- org.slf4j:slf4j-api:jar:1.7.10:test
[INFO] |  +- org.slf4j:slf4j-log4j12:jar:1.7.10:test
[INFO] |  +- org.apache.avro:avro:jar:1.7.4:test
[INFO] |  |  +- com.thoughtworks.paranamer:paranamer:jar:2.3:test
[INFO] |  |  \- org.xerial.snappy:snappy-java:jar:1.0.4.1:test
[INFO] |  +- com.google.protobuf:protobuf-java:jar:2.5.0:test
[INFO] |  +- org.apache.hadoop:hadoop-auth:jar:3.0.0-SNAPSHOT:test
[INFO] |  |  +- com.nimbusds:nimbus-jose-jwt:jar:3.9:compile (scope managed from test)
[INFO] |  |  |  +- net.jcip:jcip-annotations:jar:1.0:compile
[INFO] |  |  |  +- net.minidev:json-smart:jar:1.1.1:compile
[INFO] |  |  |  \- commons-io:commons-io:jar:2.4:compile
[INFO] |  |  +- org.apache.directory.server:apacheds-kerberos-codec:jar:2.0.0-M15:test
[INFO] |  |  |  +- org.apache.directory.server:apacheds-
[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking
[ https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273911#comment-15273911 ] Varun Vasudev commented on YARN-4676: - bq. 1. HostsFileReader currently allows multiple hosts per line. When hosts are pure digits, there will be ambiguity with timeout during interpretation. Likely allowing pure digit would requires pure-digit-host starts with a new line. Yep. The requirement for pure-digit hosts to start with a new line doesn't work because there might be users out there who are using the exclude feature (with numeric hostnames) and you'll end up breaking them. You'll have to come up with some other way to get the timeout information. What you could do is add support for either json/xml formatted files based on the filename suffix. Then you should be able to add the information you need and not worry about the numeric hostname issue. bq. 2. -1 means infinite timeout (wait forever until ready). null means no overwrite, use the default timeout. Got it. Thank you for the explanation. bq. 3. there could be large number of hosts to be decommissioned so the single line could be huge. grep a particular host would return a huge line in that case. A mix could be log in a single line for less than N host but otherwise multiple line. That said, I am ok to change to single line. Yeah, but when we're tracking the changes to a node, it's much easier when grepping through the RM log. bq. 7. How about DEFAULT_NM_EXIT_WAIT_MS = 0? So that it could be customized in cases the delay is preferred. I'm still not convinced. From what I understand, the issue is that you have a supervisor script that is constantly restarting the NM if it shuts down. In the case of decommissioned nodes on EMR, this leads to NMs constantly coming up, connecting to the RM and shutting down.
The timeout doesn't fix the problem - it just enforces a delayed shutdown for all errors (even if the node was shut down due to a config problem, for example). How about on exit, you write the reason for the exit to a well known location (like stdout/stderr or a named file)? That way the supervisor script can look at the reason for the exit and make a smarter decision on whether it should restart the NM and how long it should wait before re-starting the NM. bq. 8. The grace period is to give RM server-side a chance to DECOMMISSION the node should timeout reaches. A much smaller period like 2 seconds most likely would be sufficient as NodeManager heartbeat every second during which DECOMMISSIONING node will be re-evaluated and decommissioned if ready or timeout. Sorry, I'm confused here - from my understanding of the code - if the user has asked for a 20 second timeout, you're internally treating that as a 40 second timeout. That's not the expected behaviour. Is my understanding wrong? {quote} 15. I agree that getDecommissioningStatus suggest the call is read-only. Since completed apps need to be take into account when evaluate readiness of the node, getDecommissioningStatus is actually a private method used internally so it could be changed into private checkDecommissioningStatus(nodeId). 22. the call simply returns if within 20 seconds of last call. Currently it lives inside ResourceTrackerService and uses rmContext. Alternatively DecommissioningNodesWatcher could be constructed with rmContext and internally has its own polling thread. Other than not sure yet the code pattern to use for such internal thread, it appears as valid alternative to me. {quote} I'm going to address both of these together because they're related in my mind. I think it's a cleaner solution to put the poll function in its own thread with its own timer than to call it for every node heartbeat.
It does away with checks like last run time; you can make checkDecommissioningStatus(nodeId) a part of the poll function; it consolidates most of the de-commissioning logic instead of spreading it across the ResourceTrackerService and the DecommissioningWatcher; it also lets you increase/decrease the frequency of the poll by making it configurable (in the timer setting) instead of adding hard-coded numbers like 20 seconds in the code. bq. 9. "yarn rmadmin -refreshNodes -g -1" waits forever until the node is ready. "yarn rmadmin -refreshNodes -g" uses default timeout as specified by the configuration key. Thank you for the clarification. bq. 14. Here is an example of the tabular logging. Keeping DECOMMISSIONED node a little longer prevent it from suddenly disappeared from the list after DECOMMISSIONed. Sorry if I'm misunderstanding here. The use case is that you're grepping out the log lines from the decommissioned node watcher to determine when a node got decommissioned, and keeping the node around for 20s longer ensures that the decommissioned state is logged by the DecommissioningWatcher. bq. 16. readDecommissioningT
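To make the dedicated-poll-thread suggestion concrete, it could look roughly like the sketch below. This is a hedged illustration, not the YARN-4676 patch: the class name DecommissioningPoller, the track/isReady/decommission methods, and the use of ScheduledExecutorService are all assumptions made for the sake of the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch of the "own polling thread" alternative discussed
 * above: instead of piggybacking on every node heartbeat, the watcher
 * re-evaluates all DECOMMISSIONING nodes on a configurable timer.
 */
class DecommissioningPoller {
  private final Map<String, Long> decommissioningDeadlines = new ConcurrentHashMap<>();
  private final ScheduledExecutorService timer =
      Executors.newSingleThreadScheduledExecutor();

  /** One timer drives all readiness checks; no per-heartbeat
      "last run time" bookkeeping is needed. */
  public void start(long pollIntervalMs) {
    timer.scheduleWithFixedDelay(this::poll, pollIntervalMs, pollIntervalMs,
        TimeUnit.MILLISECONDS);
  }

  /** Registers a node entering DECOMMISSIONING with its timeout deadline. */
  public void track(String nodeId, long timeoutMs) {
    decommissioningDeadlines.put(nodeId, System.currentTimeMillis() + timeoutMs);
  }

  private void poll() {
    long now = System.currentTimeMillis();
    for (Map.Entry<String, Long> e : decommissioningDeadlines.entrySet()) {
      // Decommission when the node drained (ready) or its deadline passed.
      if (isReady(e.getKey()) || now >= e.getValue()) {
        decommission(e.getKey());
        decommissioningDeadlines.remove(e.getKey());
      }
    }
  }

  // Placeholders for the RM-side logic (checking running containers/apps
  // and firing the actual DECOMMISSION event).
  protected boolean isReady(String nodeId) { return false; }
  protected void decommission(String nodeId) { }

  public void stop() { timer.shutdownNow(); }
}
```

Making the interval a constructor or configuration parameter (rather than a hard-coded 20 seconds) is the point of the design: the poll frequency becomes tunable independently of the heartbeat rate.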
[jira] [Created] (YARN-5052) Update timeline service v2 documentation to capture information about filters
Varun Saxena created YARN-5052: -- Summary: Update timeline service v2 documentation to capture information about filters Key: YARN-5052 URL: https://issues.apache.org/jira/browse/YARN-5052 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Since YARN-4447 has gone in, we can update our documentation to capture information about usage of filters.
[jira] [Commented] (YARN-5051) The RM can't update the Decommissioned Nodes Metric
[ https://issues.apache.org/jira/browse/YARN-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273834#comment-15273834 ] tangshangwen commented on YARN-5051: I think we should handle the NEW state in the updateMetricsForRejoinedNode method, like this:
{code:title=RMNodeImpl.java|borderStyle=solid}
private void updateMetricsForRejoinedNode(NodeState previousNodeState) {
  ClusterMetrics metrics = ClusterMetrics.getMetrics();
  metrics.incrNumActiveNodes();
  switch (previousNodeState) {
  case LOST:
    metrics.decrNumLostNMs();
    break;
  case REBOOTED:
    metrics.decrNumRebootedNMs();
    break;
  case NEW:
  case DECOMMISSIONED:
    metrics.decrDecommisionedNMs();
    break;
  case UNHEALTHY:
    metrics.decrNumUnhealthyNMs();
    break;
  default:
    LOG.debug("Unexpected previous node state");
  }
}
{code}
> The RM can't update the Decommissioned Nodes Metric > --- > > Key: YARN-5051 > URL: https://issues.apache.org/jira/browse/YARN-5051 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.1 >Reporter: tangshangwen >Assignee: tangshangwen > Attachments: rm.png > > > When the RM restart,the RM will refuse the Decommission NodeManager register, > and I put the NM host removed from exclude_mapred_host file, execute the > command > {noformat} > yarn rmadmin -refreshNodes > {noformat} > start nodemanager , the decommissioned nodes num can not update
[jira] [Commented] (YARN-4844) Add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource
[ https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273827#comment-15273827 ] Varun Vasudev commented on YARN-4844: - Thanks for the patch [~leftnoteasy]. Couple of things - # The patch no longer applies after YARN-4390 # Can you please answer [~bibinchundatt]'s questions? The rest of the patch looks good to me. > Add getMemoryLong/getVirtualCoreLong to o.a.h.y.api.records.Resource > > > Key: YARN-4844 > URL: https://issues.apache.org/jira/browse/YARN-4844 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Attachments: YARN-4844.1.patch, YARN-4844.2.patch, YARN-4844.3.patch, > YARN-4844.4.patch, YARN-4844.5.patch > > > We use int32 for memory now, if a cluster has 10k nodes, each node has 210G > memory, we will get a negative total cluster memory. > And another case that easier overflows int32 is: we added all pending > resources of running apps to cluster's total pending resources. If a > problematic app requires too much resources (let's say 1M+ containers, each > of them has 3G containers), int32 will be not enough. > Even if we can cap each app's pending request, we cannot handle the case that > there're many running apps, each of them has capped but still significant > numbers of pending resources. > So we may possibly need to add getMemoryLong/getVirtualCoreLong to > o.a.h.y.api.records.Resource. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
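The int32 overflow described in YARN-4844 is easy to check with a few lines of plain Java. This is an illustration only, using the figures from the issue description (10k nodes with 210 GB each, tracked in MB as YARN does):

```java
public class ResourceOverflowDemo {
  public static void main(String[] args) {
    // 10,000 nodes with 210 GB of memory each, tracked in MB.
    int nodes = 10_000;
    int memoryPerNodeMb = 210 * 1024;        // 215,040 MB per node

    // 2,150,400,000 exceeds Integer.MAX_VALUE (2,147,483,647),
    // so the int product wraps around to a negative number.
    int totalInt = nodes * memoryPerNodeMb;

    // Promoting one operand to long avoids the overflow.
    long totalLong = (long) nodes * memoryPerNodeMb;

    System.out.println(totalInt);            // a negative value
    System.out.println(totalLong);           // the true cluster total in MB
  }
}
```

This is exactly why the issue proposes getMemoryLong/getVirtualCoreLong accessors: the aggregate, not the per-node value, is what overflows.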
[jira] [Commented] (YARN-5051) The RM can't update the Decommissioned Nodes Metric
[ https://issues.apache.org/jira/browse/YARN-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273825#comment-15273825 ] tangshangwen commented on YARN-5051: When the NodeManager starts, it triggers the AddNodeTransition; for a node in the NEW state, the DecommisionedNMs value will not be reduced.
{code:title=RMNodeImpl.java|borderStyle=solid}
public static class AddNodeTransition implements
    SingleArcTransition<RMNodeImpl, RMNodeEvent> {

  @Override
  public void transition(RMNodeImpl rmNode, RMNodeEvent event) {
    // Inform the scheduler
    RMNodeStartedEvent startEvent = (RMNodeStartedEvent) event;
    List<NMContainerStatus> containers = null;

    String host = rmNode.nodeId.getHost();
    if (rmNode.context.getInactiveRMNodes().containsKey(host)) {
      // Old node rejoining
      RMNode previouRMNode = rmNode.context.getInactiveRMNodes().get(host);
      rmNode.context.getInactiveRMNodes().remove(host);
      rmNode.updateMetricsForRejoinedNode(previouRMNode.getState());
    } else {
      ClusterMetrics.getMetrics().incrNumActiveNodes();
{code}
> The RM can't update the Decommissioned Nodes Metric > --- > > Key: YARN-5051 > URL: https://issues.apache.org/jira/browse/YARN-5051 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.1 >Reporter: tangshangwen >Assignee: tangshangwen > Attachments: rm.png > > > When the RM restart,the RM will refuse the Decommission NodeManager register, > and I put the NM host removed from exclude_mapred_host file, execute the > command > {noformat} > yarn rmadmin -refreshNodes > {noformat} > start nodemanager , the decommissioned nodes num can not update
[jira] [Commented] (YARN-5051) The RM can't update the Decommissioned Nodes Metric
[ https://issues.apache.org/jira/browse/YARN-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273819#comment-15273819 ] tangshangwen commented on YARN-5051: The same problem also occurs when the include hosts file is not empty. > The RM can't update the Decommissioned Nodes Metric > --- > > Key: YARN-5051 > URL: https://issues.apache.org/jira/browse/YARN-5051 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.1 >Reporter: tangshangwen >Assignee: tangshangwen > Attachments: rm.png > > > When the RM restart,the RM will refuse the Decommission NodeManager register, > and I put the NM host removed from exclude_mapred_host file, execute the > command > {noformat} > yarn rmadmin -refreshNodes > {noformat} > start nodemanager , the decommissioned nodes num can not update
[jira] [Updated] (YARN-4947) Test timeout is happening for TestRMWebServicesNodes
[ https://issues.apache.org/jira/browse/YARN-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4947: Labels: (was: 2.8-candidate) > Test timeout is happening for TestRMWebServicesNodes > > > Key: YARN-4947 > URL: https://issues.apache.org/jira/browse/YARN-4947 > Project: Hadoop YARN > Issue Type: Test > Components: test >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Fix For: 2.8.0 > > Attachments: 0001-YARN-4947.patch, 0002-YARN-4947.patch, > 0003-YARN-4947.patch, 0004-YARN-4947.patch, 0005-YARN-4947.patch, > 0006-YARN-4947-rebase.patch, 0006-YARN-4947.patch, 0007-YARN-4947.patch, > YARN-4947-branch-2.8.007.patch > > > Testcase timeout for TestRMWebServicesNodes is happening after YARN-4893 > [timeout|https://builds.apache.org/job/PreCommit-YARN-Build/11044/testReport/] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4947) Test timeout is happening for TestRMWebServicesNodes
[ https://issues.apache.org/jira/browse/YARN-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4947: Fix Version/s: (was: 2.9.0) 2.8.0 > Test timeout is happening for TestRMWebServicesNodes > > > Key: YARN-4947 > URL: https://issues.apache.org/jira/browse/YARN-4947 > Project: Hadoop YARN > Issue Type: Test > Components: test >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Fix For: 2.8.0 > > Attachments: 0001-YARN-4947.patch, 0002-YARN-4947.patch, > 0003-YARN-4947.patch, 0004-YARN-4947.patch, 0005-YARN-4947.patch, > 0006-YARN-4947-rebase.patch, 0006-YARN-4947.patch, 0007-YARN-4947.patch, > YARN-4947-branch-2.8.007.patch > > > Testcase timeout for TestRMWebServicesNodes is happening after YARN-4893 > [timeout|https://builds.apache.org/job/PreCommit-YARN-Build/11044/testReport/] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5051) The RM can't update the Decommissioned Nodes Metric
[ https://issues.apache.org/jira/browse/YARN-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273815#comment-15273815 ] tangshangwen commented on YARN-5051: I think we should put the decommissioned node in inactiveRMNodes when it registers:
{code:title=ResourceTrackerService.java|borderStyle=solid}
RMNode rmNode = new RMNodeImpl(nodeId, rmContext, host, cmPort, httpPort,
    resolve(host), capability, nodeManagerVersion);

// Check if this node is a 'valid' node
if (!this.nodesListManager.isValidNode(host)) {
  String message = "Disallowed NodeManager from " + host
      + ", Sending SHUTDOWN signal to the NodeManager.";
  LOG.info(message);
  response.setDiagnosticsMessage(message);
  response.setNodeAction(NodeAction.SHUTDOWN);
  this.rmContext.getInactiveRMNodes().put(rmNode.getNodeID().getHost(), rmNode);
  return response;
}
{code}
> The RM can't update the Decommissioned Nodes Metric > --- > > Key: YARN-5051 > URL: https://issues.apache.org/jira/browse/YARN-5051 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.1 >Reporter: tangshangwen >Assignee: tangshangwen > Attachments: rm.png > > > When the RM restart,the RM will refuse the Decommission NodeManager register, > and I put the NM host removed from exclude_mapred_host file, execute the > command > {noformat} > yarn rmadmin -refreshNodes > {noformat} > start nodemanager , the decommissioned nodes num can not update
[jira] [Commented] (YARN-5051) The RM can't update the Decommissioned Nodes Metric
[ https://issues.apache.org/jira/browse/YARN-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273813#comment-15273813 ] Daniel Templeton commented on YARN-5051: Is your include hosts file empty? > The RM can't update the Decommissioned Nodes Metric > --- > > Key: YARN-5051 > URL: https://issues.apache.org/jira/browse/YARN-5051 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.1 >Reporter: tangshangwen >Assignee: tangshangwen > Attachments: rm.png > > > When the RM restart,the RM will refuse the Decommission NodeManager register, > and I put the NM host removed from exclude_mapred_host file, execute the > command > {noformat} > yarn rmadmin -refreshNodes > {noformat} > start nodemanager , the decommissioned nodes num can not update -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5051) The RM can't update the Decommissioned Nodes Metric
[ https://issues.apache.org/jira/browse/YARN-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tangshangwen updated YARN-5051: --- Description: When the RM restart,the RM will refuse the Decommission NodeManager register, and I put the NM host removed from exclude_mapred_host file, execute the command {noformat} yarn rmadmin -refreshNodes {noformat} start nodemanager , the decommissioned nodes num can not update was: When the RM restart,the RM will refuse the Decommission NodeManager register, and I put the NM host removed from exclude_mapred_host file, execute the command {noformat} yarn rmadmin -refreshNodes {noformat} start nodemanager , the decommissioned nodes can not update > The RM can't update the Decommissioned Nodes Metric > --- > > Key: YARN-5051 > URL: https://issues.apache.org/jira/browse/YARN-5051 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.1 >Reporter: tangshangwen >Assignee: tangshangwen > Attachments: rm.png > > > When the RM restart,the RM will refuse the Decommission NodeManager register, > and I put the NM host removed from exclude_mapred_host file, execute the > command > {noformat} > yarn rmadmin -refreshNodes > {noformat} > start nodemanager , the decommissioned nodes num can not update -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5051) The RM can't update the Decommissioned Nodes Metric
[ https://issues.apache.org/jira/browse/YARN-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tangshangwen updated YARN-5051: --- Description: When the RM restart,the RM will refuse the Decommission NodeManager register, and I put the NM host removed from exclude_mapred_host file, execute the command {noformat} yarn rmadmin -refreshNodes {noformat} start nodemanager , the decommissioned nodes can not update was:When the RM restart,the RM will refuse the Decommission NodeManager register, and I put the NM host removed from exclude_mapred_host file, and start nodemanager , the decommissioned nodes can not update > The RM can't update the Decommissioned Nodes Metric > --- > > Key: YARN-5051 > URL: https://issues.apache.org/jira/browse/YARN-5051 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.1 >Reporter: tangshangwen >Assignee: tangshangwen > Attachments: rm.png > > > When the RM restart,the RM will refuse the Decommission NodeManager register, > and I put the NM host removed from exclude_mapred_host file, execute the > command > {noformat} > yarn rmadmin -refreshNodes > {noformat} > start nodemanager , the decommissioned nodes can not update -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5051) The RM can't update the Decommissioned Nodes Metric
[ https://issues.apache.org/jira/browse/YARN-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tangshangwen updated YARN-5051: --- Attachment: rm.png > The RM can't update the Decommissioned Nodes Metric > --- > > Key: YARN-5051 > URL: https://issues.apache.org/jira/browse/YARN-5051 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.1 >Reporter: tangshangwen >Assignee: tangshangwen > Attachments: rm.png > > > When the RM restart,the RM will refuse the Decommission NodeManager register, > and I put the NM host removed from exclude_mapred_host file, and start > nodemanager , the decommissioned nodes can not update -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5051) The RM can't update the Decommissioned Nodes Metric
tangshangwen created YARN-5051: -- Summary: The RM can't update the Decommissioned Nodes Metric Key: YARN-5051 URL: https://issues.apache.org/jira/browse/YARN-5051 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.1 Reporter: tangshangwen Assignee: tangshangwen When the RM restart,the RM will refuse the Decommission NodeManager register, and I put the NM host removed from exclude_mapred_host file, and start nodemanager , the decommissioned nodes can not update -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4947) Test timeout is happening for TestRMWebServicesNodes
[ https://issues.apache.org/jira/browse/YARN-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273780#comment-15273780 ] Sunil G commented on YARN-4947: --- +1 for the 2.8 patch. It seems YARN-4807 has more dependency on other patches. I think we can get this in. > Test timeout is happening for TestRMWebServicesNodes > > > Key: YARN-4947 > URL: https://issues.apache.org/jira/browse/YARN-4947 > Project: Hadoop YARN > Issue Type: Test > Components: test >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Labels: 2.8-candidate > Fix For: 2.9.0 > > Attachments: 0001-YARN-4947.patch, 0002-YARN-4947.patch, > 0003-YARN-4947.patch, 0004-YARN-4947.patch, 0005-YARN-4947.patch, > 0006-YARN-4947-rebase.patch, 0006-YARN-4947.patch, 0007-YARN-4947.patch, > YARN-4947-branch-2.8.007.patch > > > Testcase timeout for TestRMWebServicesNodes is happening after YARN-4893 > [timeout|https://builds.apache.org/job/PreCommit-YARN-Build/11044/testReport/]
[jira] [Commented] (YARN-5048) DelegationTokenRenewer#skipTokenRenewal may throw NPE
[ https://issues.apache.org/jira/browse/YARN-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273773#comment-15273773 ] Hadoop QA commented on YARN-5048: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} trunk passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | 
{color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 36m 8s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_91. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 36m 17s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 93m 17s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.TestRMRestart | | | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | | JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanag
[jira] [Updated] (YARN-2888) Corrective mechanisms for rebalancing NM container queues
[ https://issues.apache.org/jira/browse/YARN-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-2888: -- Attachment: YARN-2888.006.patch Thank you [~curino] for the detailed review.. Apologies for the delay in addressing the comments.. bq. ..I see you add a few extra conf parameters. I was wondering whether we can come up with a better mechanism to configure policies, other than global conf parameters... If I understand your proposal correctly, I guess you are talking about making the configuration itself 'polymorphic' with respect to the policy.. If so, I totally agree with you, but I guess that deserves its own (umbrella?) JIRA to do it proper justice.. Thoughts? W.r.t. making the parameters passed down in the {{NodeHeartBeatResponse}} more general than "queuLimit" / renaming the {{ContainerQueuingLimt}} to {{ContainerQueueingCommand}}.. Again, great suggestion.. but given that currently we only have a single 'command' being passed down, I did not want to increase the complexity of the patch. If you are fine with it, I can maybe raise a separate JIRA when we have at least one other command that needs to be passed down from the RM. bq. in the .proto it would likely help other devs if you say max_wait_time_in_ms or something like that, which indicates time granularity. Also is int32 always enough? I've added the 'in_ms' suffix.. The upper limit of int32 expressed in ms is 24 days.. Given that this feature is targeted at short-living tasks, I feel we can keep it as int32. bq. Is it reasonable to assume the caller of QueueLimitCalculator.update() will synchronize on topKNodes? I guess so... The QueueLimitCalculator was designed to be a helper class of NodeManagerQueueMonitor, and only the NMQM (since it is package private) can call update, which it does within a synchronized scope. bq. If topKNodes is << than total nodes, you could create a local list...
I apologize for the confusion; the Calculator has to actually go through ALL the nodes, not just the top K.. I have fixed this in the latest patch. With regard to the MEDIAN metric, I initially included it since it is less susceptible to major variations from outliers, but since we have a max and min, I don't think it is required.. I have updated the patch to remove median, along with the rest of your suggestions. Do take a look at the latest patch, and let me know if you are fine with the changes. > Corrective mechanisms for rebalancing NM container queues > - > > Key: YARN-2888 > URL: https://issues.apache.org/jira/browse/YARN-2888 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Arun Suresh > Attachments: YARN-2888-yarn-2877.001.patch, > YARN-2888-yarn-2877.002.patch, YARN-2888.003.patch, YARN-2888.004.patch, > YARN-2888.005.patch, YARN-2888.006.patch > > > Bad queuing decisions by the LocalRMs (e.g., due to the distributed nature of > the scheduling decisions or due to having a stale image of the system) may > lead to an imbalance in the waiting times of the NM container queues. This > can in turn have an impact in job execution times and cluster utilization. > To this end, we introduce corrective mechanisms that may remove (whenever > needed) container requests from overloaded queues, adding them to less-loaded > ones.
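For illustration, the min/max aggregation that remains after dropping MEDIAN can be computed in a single pass over all nodes, whereas a median needs the full sorted distribution; that is one practical argument for removing it. The sketch below uses hypothetical names (QueueLimitSketch, queueStats) and is not the actual YARN-2888 QueueLimitCalculator:

```java
import java.util.Collection;
import java.util.IntSummaryStatistics;

// Illustrative sketch: aggregate queued-container counts across ALL nodes
// (not just the top K) in one pass, yielding the min/max/mean figures the
// RM could turn into a queuing limit sent in heartbeat responses.
class QueueLimitSketch {
  public static IntSummaryStatistics queueStats(Collection<Integer> queuedPerNode) {
    // summaryStatistics() computes count, min, max, sum and average in a
    // single O(n) pass -- no sorting, unlike a median.
    return queuedPerNode.stream()
        .mapToInt(Integer::intValue)
        .summaryStatistics();
  }
}
```

For example, queueStats over per-node queue lengths [3, 0, 7, 2] yields min 0 and max 7, which a rebalancing policy could translate into a per-node queue-length cap.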