[jira] [Commented] (YARN-4524) Cleanup AppSchedulingInfo
[ https://issues.apache.org/jira/browse/YARN-4524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075819#comment-15075819 ] Karthik Kambatla commented on YARN-4524: Thanks for the prompt review, Wangda. Appreciate it. > Cleanup AppSchedulingInfo > - > > Key: YARN-4524 > URL: https://issues.apache.org/jira/browse/YARN-4524 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Fix For: 2.9.0 > > Attachments: yarn-4524-1.patch, yarn-4524-2.patch > > > The AppSchedulingInfo class has become very hard to grok with some pretty > long methods. It needs some cleaning up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4530) LocalizedResource trigger a NPE Cause the NodeManager exit
[ https://issues.apache.org/jira/browse/YARN-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075815#comment-15075815 ] Hadoop QA commented on YARN-4530: - (x) -1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 7m 32s | trunk passed |
| +1 | compile | 0m 24s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 0m 28s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 0m 12s | trunk passed |
| +1 | mvnsite | 0m 29s | trunk passed |
| +1 | mvneclipse | 0m 13s | trunk passed |
| +1 | findbugs | 0m 55s | trunk passed |
| +1 | javadoc | 0m 19s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 0m 22s | trunk passed with JDK v1.7.0_91 |
| +1 | mvninstall | 0m 24s | the patch passed |
| +1 | compile | 0m 21s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 0m 21s | the patch passed |
| +1 | compile | 0m 25s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 0m 25s | the patch passed |
| +1 | checkstyle | 0m 12s | the patch passed |
| +1 | mvnsite | 0m 27s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 0s | the patch passed |
| +1 | javadoc | 0m 15s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 0m 20s | the patch passed with JDK v1.7.0_91 |
| +1 | unit | 8m 34s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 9m 10s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_91. |
| +1 | asflicense | 0m 18s | Patch does not generate ASF License warnings. |
| | | 33m 38s | |
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12780090/YARN-4530.1.patch |
| JIRA Issue | YARN-4530 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 8f9dafc8ed22 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchpr
[jira] [Commented] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075798#comment-15075798 ] Hadoop QA commented on YARN-4304: - (x) -1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
| +1 | mvninstall | 7m 56s | trunk passed |
| +1 | compile | 0m 28s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 0m 30s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 0m 18s | trunk passed |
| +1 | mvnsite | 0m 39s | trunk passed |
| +1 | mvneclipse | 0m 16s | trunk passed |
| +1 | findbugs | 1m 15s | trunk passed |
| +1 | javadoc | 0m 23s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 0m 31s | trunk passed with JDK v1.7.0_91 |
| +1 | mvninstall | 0m 33s | the patch passed |
| +1 | compile | 0m 24s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 0m 24s | the patch passed |
| +1 | compile | 0m 28s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 0m 28s | the patch passed |
| -1 | checkstyle | 0m 18s | Patch generated 24 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager (total was 251, now 263). |
| +1 | mvnsite | 0m 36s | the patch passed |
| +1 | mvneclipse | 0m 13s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 4 line(s) with tabs. |
| +1 | findbugs | 1m 16s | the patch passed |
| +1 | javadoc | 0m 20s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 0m 26s | the patch passed with JDK v1.7.0_91 |
| -1 | unit | 63m 20s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. |
| -1 | unit | 64m 52s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. |
| +1 | asflicense | 0m 19s | Patch does not generate ASF License warnings. |
| | | 146m 48s | |
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12780083/0008-YARN-4304.p
[jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
[ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075797#comment-15075797 ] Varun Saxena commented on YARN-4024: Sorry, I assigned the JIRA to myself by mistake. Assigned it back to [~zhiguohong]. > YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat > -- > > Key: YARN-4024 > URL: https://issues.apache.org/jira/browse/YARN-4024 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Hong Zhiguo > Fix For: 2.8.0 > > Attachments: YARN-4024-draft-v2.patch, YARN-4024-draft-v3.patch, > YARN-4024-draft.patch, YARN-4024-v4.patch, YARN-4024-v5.patch, > YARN-4024-v6.patch, YARN-4024-v7.patch > > > Currently, the YARN RM NodesListManager resolves the IP address every time a node heartbeats. When the DNS server becomes slow, NM heartbeats are blocked and cannot make progress. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
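The direction this issue describes, not resolving a node's IP address on every heartbeat, can be sketched as a small TTL cache in front of the resolver. This is an illustrative sketch under assumed names (`CachingResolver`, `Resolver`), not the actual YARN-4024 patch:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: cache hostname -> IP with a TTL so a slow DNS
// server cannot stall every NM heartbeat. Illustrative only.
public class CachingResolver {
    public interface Resolver { String resolve(String host); }

    private static final class Entry {
        final String ip; final long expiresAt;
        Entry(String ip, long expiresAt) { this.ip = ip; this.expiresAt = expiresAt; }
    }

    private final Resolver backend;
    private final long ttlMillis;
    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    public CachingResolver(Resolver backend, long ttlMillis) {
        this.backend = backend;
        this.ttlMillis = ttlMillis;
    }

    public String resolve(String host, long nowMillis) {
        Entry e = cache.get(host);
        if (e != null && nowMillis < e.expiresAt) {
            return e.ip;                       // fast path: no DNS round trip on heartbeat
        }
        String ip = backend.resolve(host);     // slow path: hit DNS, refresh cache entry
        cache.put(host, new Entry(ip, nowMillis + ttlMillis));
        return ip;
    }

    public static void main(String[] args) {
        final int[] lookups = {0};
        CachingResolver r = new CachingResolver(h -> { lookups[0]++; return "10.0.0.1"; }, 1000);
        r.resolve("nm1.example.com", 0);    // miss: resolves via backend
        r.resolve("nm1.example.com", 500);  // hit: served from cache, no DNS call
        System.out.println("DNS lookups: " + lookups[0]);
    }
}
```

The TTL bounds staleness: a re-IPed node is picked up within one TTL, while the common heartbeat path never blocks on DNS.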
[jira] [Updated] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
[ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4024: --- Assignee: Hong Zhiguo (was: Varun Saxena)
[jira] [Commented] (YARN-4530) LocalizedResource trigger a NPE Cause the NodeManager exit
[ https://issues.apache.org/jira/browse/YARN-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075794#comment-15075794 ] Rohith Sharma K S commented on YARN-4530: - +1 LGTM, pending jenkins > LocalizedResource trigger a NPE Cause the NodeManager exit > -- > > Key: YARN-4530 > URL: https://issues.apache.org/jira/browse/YARN-4530 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.2.0, 2.7.1 >Reporter: tangshangwen > Attachments: YARN-4530.1.patch > > > In our cluster, I found that a failed LocalizedResource download triggered an NPE, causing the NodeManager to shut down. > {noformat} > 2015-12-29 17:18:33,706 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: > Resource > hdfs://ns3:8020/user/username/projects/user_insight/lookalike/oozie/workflow/conf/hive-site.xml > transitioned from DOWNLOADING to FAILED > 2015-12-29 17:18:33,708 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > Downloading public rsrc:{ > hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/user_insight_pig_udf-0.0.1-SNAPSHOT-jar-with-dependencies.jar, > 1451380519635, FILE, null } > 2015-12-29 17:18:33,710 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > Failed to download rsrc { { > hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar, > 1451380519452, FILE, null > },pending,[(container_1451039893865_261670_01_000578)],42332661980495938,DOWNLOADING} > java.io.IOException: Resource > hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar > changed on src filesystem (expected 1451380519452, was 1451380611793 > at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:276) > at 
org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > 2015-12-29 17:18:33,710 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: > Resource > hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar > transitioned from DOWNLOADING to FAILED > 2015-12-29 17:18:33,710 FATAL > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > Error: Shutting down > java.lang.NullPointerException at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:712) > 2015-12-29 17:18:33,710 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > Public cache exiting > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
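The FATAL NPE in the log above comes from the public localizer dereferencing bookkeeping state that has already been cleaned up after the failed download. A defensive sketch of the fix shape, with hypothetical names and not the actual YARN-4530 patch, is to null-check the pending entry before using it so one failed resource cannot take down the whole NodeManager:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: when a download completes, its bookkeeping entry may
// already be gone (e.g. the resource failed and was cleaned up). Blindly
// dereferencing the missing entry is the kind of NPE that killed the NM;
// guarding it keeps the localizer thread alive. Hypothetical names.
public class PublicLocalizerSketch {
    private final Map<String, String> pending = new HashMap<>(); // resource key -> container

    public void register(String key, String container) { pending.put(key, container); }

    /** Returns the container that was notified, or null if the entry had vanished. */
    public String onDownloadCompleted(String key) {
        String container = pending.remove(key);
        if (container == null) {
            // Without this guard the next dereference would throw an NPE
            // and the localizer service would shut down.
            System.err.println("Ignoring completed download with no pending entry: " + key);
            return null;
        }
        System.out.println("Notifying " + container + " for " + key);
        return container;
    }

    public static void main(String[] args) {
        PublicLocalizerSketch l = new PublicLocalizerSketch();
        l.register("hdfs://ns3/lib/a.jar", "container_01");
        l.onDownloadCompleted("hdfs://ns3/lib/a.jar"); // normal path
        l.onDownloadCompleted("hdfs://ns3/lib/a.jar"); // stale completion: survives
    }
}
```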
[jira] [Assigned] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
[ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena reassigned YARN-4024: -- Assignee: Varun Saxena (was: Hong Zhiguo)
[jira] [Commented] (YARN-4528) decreaseContainer Message maybe lost if NM restart
[ https://issues.apache.org/jira/browse/YARN-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075772#comment-15075772 ] sandflee commented on YARN-4528: Since in most cases the container size is not changed, I propose to keep the container-decrease message pending. > decreaseContainer Message maybe lost if NM restart > -- > > Key: YARN-4528 > URL: https://issues.apache.org/jira/browse/YARN-4528 > Project: Hadoop YARN > Issue Type: Bug >Reporter: sandflee > > We may keep the container-decrease message pending until the next heartbeat, or check the resource against rmContainer when the node registers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
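The "keep the decrease message pending" idea above can be sketched as a small ledger on the RM side: each decrease stays outstanding until the NM acknowledges it, and the outstanding set can be re-sent on the next heartbeat or when a restarted NM re-registers. Names here are hypothetical, not the actual fix:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the "pending decrease" proposal: the RM remembers each container
// decrease until the NM acknowledges it, so an NM restart cannot lose the
// message; outstanding decreases are re-sent on heartbeat/re-register.
// Hypothetical names; not the actual YARN-4528 patch.
public class DecreaseMessageLedger {
    // containerId -> target memory (MB) not yet acknowledged by the NM
    private final Map<String, Integer> pending = new HashMap<>();

    public void decreaseContainer(String containerId, int targetMemMb) {
        pending.put(containerId, targetMemMb);
    }

    /** Messages to piggyback on the next heartbeat response or NM register. */
    public List<String> outstanding() {
        List<String> msgs = new ArrayList<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            msgs.add("decrease " + e.getKey() + " to " + e.getValue() + "MB");
        }
        return msgs;
    }

    /** NM confirmed the new size; stop re-sending. */
    public void ack(String containerId) { pending.remove(containerId); }

    public int pendingCount() { return pending.size(); }
}
```

Because the ledger is keyed by container and stores the absolute target size, re-sending after a restart is idempotent.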
[jira] [Updated] (YARN-4530) LocalizedResource trigger a NPE Cause the NodeManager exit
[ https://issues.apache.org/jira/browse/YARN-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tangshangwen updated YARN-4530: --- Attachment: YARN-4530.1.patch I found that 2.7.1 has the same problem, so I submitted a patch.
[jira] [Updated] (YARN-4530) LocalizedResource trigger a NPE Cause the NodeManager exit
[ https://issues.apache.org/jira/browse/YARN-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tangshangwen updated YARN-4530: --- Affects Version/s: 2.7.1
[jira] [Commented] (YARN-4393) TestResourceLocalizationService#testFailedDirsResourceRelease fails intermittently
[ https://issues.apache.org/jira/browse/YARN-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075766#comment-15075766 ] Rohith Sharma K S commented on YARN-4393: - [~ozawa], would you have a look at this, please? Do you have any comments? > TestResourceLocalizationService#testFailedDirsResourceRelease fails > intermittently > -- > > Key: YARN-4393 > URL: https://issues.apache.org/jira/browse/YARN-4393 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.7.1 >Reporter: Varun Saxena >Assignee: Varun Saxena > Fix For: 2.7.3 > > Attachments: YARN-4393.01.patch > > > [~ozawa] pointed out this failure on YARN-4380. > Check > https://issues.apache.org/jira/browse/YARN-4380?focusedCommentId=15023773&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15023773 > {noformat} > Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.518 sec <<< > FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService > testFailedDirsResourceRelease(org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService) > Time elapsed: 0.093 sec <<< FAILURE! > org.mockito.exceptions.verification.junit.ArgumentsAreDifferent: > Argument(s) are different! 
Wanted: > eventHandler.handle( > > ); > -> at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testFailedDirsResourceRelease(TestResourceLocalizationService.java:2632) > Actual invocation has different arguments: > eventHandler.handle( > EventType: APPLICATION_INITED > ); > -> at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:422) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testFailedDirsResourceRelease(TestResourceLocalizationService.java:2632) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3870) Providing raw container request information for fine scheduling
[ https://issues.apache.org/jira/browse/YARN-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075740#comment-15075740 ] Karthik Kambatla commented on YARN-3870: bq. I think this JIRA and YARN-4485 are different: bq. I think we cannot simply update timestamp when new resource request arrives. For example, at T1, AM asks 100 * 1G container; after 2 mins (T2), assume there's no container allocated, AM asks 100 * 1G container, we cannot say the resource request is added at T2. Instead, we should only set new timestamp for incremental asks
For YARN-4485, I was planning on taking this exact approach. While the two JIRAs and their purposes are different, the ability to identify a set of requests that arrived at one point in time requires similar updates to the data structures we use in AppSchedulingInfo.
bq. what do you think of using for id?
Timestamps for IDs might not be a good idea, especially when an AM can restart. Also, there might be merit to differentiating two ResourceRequests (say, at different priorities) received at the same time. Discussed this with [~asuresh] and [~subru] offline. We felt the following changes would help us address multiple JIRAs (as [~xinxianyin] listed):
# Add an ID field to ResourceRequest - this can be a sequence number for each application. On AM restarts, a subsequent attempt could choose to resume from the appropriate sequence number. If the AM doesn't add an ID, the RM could add one. Or, we could have the RM add the IDs and return them to the AM for help with bookkeeping.
# YARN-4485 would likely want to add a timestamp in addition to this. Given the IDs, we likely don't have to do special delta handling.
# In case the number of containers in the existing ResourceRequest increases, the delta is given a new ID. E.g., the app increases a request from 3 containers to 7 containers of the same capability: the first three would have ID '1' and the next four would have ID '2'.
# In case the number of containers corresponding to an existing ResourceRequest decreases, the number of containers is reduced from the largest ID to the smallest ID until the decrease is accounted for. E.g., if an app asks for 3, 7 and 2 containers in subsequent allocate calls, once these calls are processed, the app has 2 containers with ID '1'.
# The resource-request data structure in AppSchedulingInfo will be this {{Map>>>}}. This would help YARN-314 as well. YARN-314 will need a few more changes to fix up the matching in each of the schedulers.
# Note that we will still be expanding a ResourceRequest to node-local, rack-local and ANY requests. These would now be tied with an ID and hence can be updated correctly.
If folks feel this would address all requirements, I could take a stab at the first patch. [~asuresh] and [~subru] have graciously offered to iterate on my prelim patch to fix up any issues in FairScheduler and CapacityScheduler.
> Providing raw container request information for fine scheduling > --- > > Key: YARN-3870 > URL: https://issues.apache.org/jira/browse/YARN-3870 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications, capacityscheduler, fairscheduler, > resourcemanager, scheduler, yarn >Reporter: Lei Guo > > Currently, when AM sends container requests to RM and scheduler, it expands > individual container requests into host/rack/any format. For instance, if I > am asking for container request with preference "host1, host2, host3", > assuming all are in the same rack rack1, instead of sending one raw container > request to RM/Scheduler with raw preference list, it basically expand it to > become 5 different objects with host1, host2, host3, rack1 and any in there. > When scheduler receives information, it basically already lost the raw > request. This is ok for single container request, but it will cause trouble > when dealing with multiple container requests from the same application. 
> Consider this case: > 6 hosts, two racks: > rack1 (host1, host2, host3) rack2 (host4, host5, host6) > When application requests two containers with different data locality > preference: > c1: host1, host2, host4 > c2: host2, host3, host5 > This will end up with following container request list when client sending > request to RM/Scheduler: > host1: 1 instance > host2: 2 instances > host3: 1 instance > host4: 1 instance > host5: 1 instance > rack1: 2 instances > rack2: 2 instances > any: 2 instances > Fundamentally, it is hard for scheduler to make a right judgement without > knowing the raw container request. The situation will get worse when dealing > with affinity and anti-affinity or even gang scheduling etc. > We need some way to provide raw container request information for fine > scheduling purp
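The ID-per-delta bookkeeping proposed in the comment above can be sketched with a small ledger: each increase in numContainers gets a fresh ID, and a decrease is taken from the largest IDs first. The class name is hypothetical; it merely traces the comment's 3 -> 7 -> 2 example:

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of the proposed ID-tagged request bookkeeping: increases get a new
// ID for the delta; decreases shrink from the largest ID down to the
// smallest. Hypothetical class, not an actual AppSchedulingInfo patch.
public class RequestIdLedger {
    private final TreeMap<Integer, Integer> byId = new TreeMap<>(); // id -> count
    private int nextId = 1;
    private int total = 0;

    /** AM re-states the absolute numContainers for this request. */
    public void update(int numContainers) {
        if (numContainers > total) {
            byId.put(nextId++, numContainers - total);  // delta gets a new ID
        } else {
            int toRemove = total - numContainers;
            while (toRemove > 0) {                      // shrink newest IDs first
                Map.Entry<Integer, Integer> last = byId.lastEntry();
                int taken = Math.min(last.getValue(), toRemove);
                if (taken == last.getValue()) byId.remove(last.getKey());
                else byId.put(last.getKey(), last.getValue() - taken);
                toRemove -= taken;
            }
        }
        total = numContainers;
    }

    public Map<Integer, Integer> snapshot() { return new TreeMap<>(byId); }
    public int total() { return total; }

    public static void main(String[] args) {
        RequestIdLedger l = new RequestIdLedger();
        l.update(3);   // {1=3}
        l.update(7);   // {1=3, 2=4}: the delta of 4 gets ID 2
        l.update(2);   // ID 2 drained, ID 1 reduced: {1=2}
        System.out.println(l.snapshot());
    }
}
```

After the 3, 7, 2 sequence, the ledger holds 2 containers under ID '1', matching the outcome described in the comment.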
[jira] [Commented] (YARN-4352) Timeout for tests in TestYarnClient, TestAMRMClient and TestNMClient
[ https://issues.apache.org/jira/browse/YARN-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075737#comment-15075737 ] Rohith Sharma K S commented on YARN-4352: - Thanks for updating the patch, I am +1. And I would request any Hadoop common/hdfs expert to review the patch too > Timeout for tests in TestYarnClient, TestAMRMClient and TestNMClient > > > Key: YARN-4352 > URL: https://issues.apache.org/jira/browse/YARN-4352 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Junping Du >Assignee: Sunil G > Labels: security > Attachments: 0001-YARN-4352.patch, 0002-YARN-4352.patch > > > From > https://builds.apache.org/job/PreCommit-YARN-Build/9661/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client-jdk1.7.0_79.txt, > we can see the tests in TestYarnClient, TestAMRMClient and TestNMClient get > timeout which can be reproduced locally. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first
[ https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075733#comment-15075733 ] Karthik Kambatla commented on YARN-1014: bq. We were actually planning on getting the ContainersMonitor to kill/preempt opportunistic containers. Right. Same goes for YARN-1011 as well. This (configuring OOM Killer priority) is to address those cases where the reactive monitoring is too late to save processes from getting killed by the OOM Killer. > Configure OOM Killer to kill OPPORTUNISTIC containers first > --- > > Key: YARN-1014 > URL: https://issues.apache.org/jira/browse/YARN-1014 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Arun C Murthy >Assignee: Karthik Kambatla > > YARN-2882 introduces the notion of OPPORTUNISTIC containers. These containers > should be killed first should the system run out of memory. > - > Previous description: > Once RM allocates 'speculative containers' we need to get LCE to schedule > them at lower priorities via cgroups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
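The idea of biasing the Linux OOM killer can be sketched as a simple mapping from execution type to an `oom_score_adj` value (the Linux range is -1000..1000; higher values are killed first). The constant chosen and the class names are hypothetical, not from the patch; the actual NM integration would write the value to /proc/&lt;pid&gt;/oom_score_adj or the container's cgroup:

```java
// Illustrative sketch: OPPORTUNISTIC containers get a higher
// oom_score_adj than GUARANTEED ones, so when reactive monitoring is
// too slow the kernel OOM killer still reclaims opportunistic
// containers first.
public class OomPriority {
    public enum ExecutionType { GUARANTEED, OPPORTUNISTIC }

    public static int oomScoreAdj(ExecutionType type) {
        // 800 is an arbitrary illustrative bias, not a value from YARN.
        return type == ExecutionType.OPPORTUNISTIC ? 800 : 0;
    }
}
```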
[jira] [Commented] (YARN-4530) LocalizedResource trigger a NPE Cause the NodeManager exit
[ https://issues.apache.org/jira/browse/YARN-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075711#comment-15075711 ] tangshangwen commented on YARN-4530: I think I can fix it > LocalizedResource trigger a NPE Cause the NodeManager exit > -- > > Key: YARN-4530 > URL: https://issues.apache.org/jira/browse/YARN-4530 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: tangshangwen > > In our cluster, I found that LocalizedResource download failed trigger a NPE > Cause the NodeManager shutdown. > {noformat} > 2015-12-29 17:18:33,706 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: > Resource > hdfs://ns3:8020/user/username/projects/user_insight/lookalike/oozie/workflow/conf/hive-site.xml > transitioned from DOWNLOADING to FAILED > 2015-12-29 17:18:33,708 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > Downloading public rsrc:{ > hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/user_insight_pig_udf-0.0.1-SNAPSHOT-jar-with-dependencies.jar, > 1451380519635, FILE, null } > 2015-12-29 17:18:33,710 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > Failed to download rsrc { { > hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar, > 1451380519452, FILE, null > },pending,[(container_1451039893865_261670_01_000578)],42332661980495938,DOWNLOADING} > java.io.IOException: Resource > hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar > changed on src filesystem (expected 1451380519452, was 1451380611793 > at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:276) > at 
org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > 2015-12-29 17:18:33,710 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: > Resource > hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar > transitioned from DOWNLOADING to FAILED > 2015-12-29 17:18:33,710 FATAL > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > Error: Shutting down > java.lang.NullPointerException at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:712) > 2015-12-29 17:18:33,710 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > Public cache exiting > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2575) Consider creating separate ACLs for Reservation create/update/delete/list ops
[ https://issues.apache.org/jira/browse/YARN-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Po updated YARN-2575: -- Attachment: YARN-2575.v4.patch Thanks Subru, I addressed your comments in this latest patch. The one comment I wasn't sure about was: "We should return the default ACLs, i.e. everyone has access in ReservationSchedulerConfiguration::getReservationAcls" Do you mean that we should make it so that if no users are given in the conf, that we should allow everyone to have access? > Consider creating separate ACLs for Reservation create/update/delete/list ops > - > > Key: YARN-2575 > URL: https://issues.apache.org/jira/browse/YARN-2575 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Subru Krishnan >Assignee: Sean Po > Attachments: YARN-2575.v1.patch, YARN-2575.v2.1.patch, > YARN-2575.v2.patch, YARN-2575.v3.patch, YARN-2575.v4.patch > > > YARN-1051 introduces the ReservationSystem and in the current implementation > anyone who can submit applications can also submit reservations. This JIRA is > to evaluate creating separate ACLs for Reservation create/update/delete ops. > Depends on YARN-4340 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
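The "default ACLs" behavior being discussed could look like the following sketch: when no users are configured for a reservation operation, fall back to the allow-everyone ACL ("*"). The class and method names are hypothetical, not the patch's actual API:

```java
// Sketch of default-allow ACL resolution: a missing or blank
// configuration entry means everyone has access.
public class ReservationAclDefaults {
    public static String aclFor(String configured) {
        return (configured == null || configured.trim().isEmpty())
            ? "*"                      // no users configured: allow everyone
            : configured.trim();       // otherwise use the configured list
    }
}
```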
[jira] [Commented] (YARN-1382) Remove unusableRMNodesConcurrentSet (never used) in NodeListManager to get rid of memory leak
[ https://issues.apache.org/jira/browse/YARN-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075705#comment-15075705 ] Rohith Sharma K S commented on YARN-1382: - thanks [~djp] for committing the patch:-) > Remove unusableRMNodesConcurrentSet (never used) in NodeListManager to get > rid of memory leak > - > > Key: YARN-1382 > URL: https://issues.apache.org/jira/browse/YARN-1382 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.2.0, 2.7.1, 2.6.2 >Reporter: Alejandro Abdelnur >Assignee: Rohith Sharma K S > Fix For: 2.8.0 > > Attachments: 0001-YARN-1382.patch, 0002-YARN-1382.patch, > 0003-YARN-1382.patch > > > If a node is in the unusable nodes set (unusableRMNodesConcurrentSet) and > never comes back, the node will be there forever. > While the leak is not big, it gets aggravated if the NM addresses are > configured with ephemeral ports as when the nodes come back they come back as > new. > Some related details in YARN-1343 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4304: -- Attachment: 0008-YARN-4304.patch Thank you [~leftnoteasy] for the comments. Addressing the first comment in new patch and also attaching screen shots. Kindly help to check the same. > AM max resource configuration per partition to be displayed/updated correctly > in UI and in various partition related metrics > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > Attachments: 0001-YARN-4304.patch, 0002-YARN-4304.patch, > 0003-YARN-4304.patch, 0004-YARN-4304.patch, 0005-YARN-4304.patch, > 0005-YARN-4304.patch, 0006-YARN-4304.patch, 0007-YARN-4304.patch, > 0008-YARN-4304.patch, REST_and_UI.zip > > > As we are supporting per-partition level max AM resource percentage > configuration, UI and various metrics also need to display correct > configurations related to same. > For eg: Current UI still shows am-resource percentage per queue level. This > is to be updated correctly when label config is used. > - Display max-am-percentage per-partition in Scheduler UI (label also) and in > ClusterMetrics page > - Update queue/partition related metrics w.r.t per-partition > am-resource-percentage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4304: -- Attachment: REST_and_UI.zip > AM max resource configuration per partition to be displayed/updated correctly > in UI and in various partition related metrics > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > Attachments: 0001-YARN-4304.patch, 0002-YARN-4304.patch, > 0003-YARN-4304.patch, 0004-YARN-4304.patch, 0005-YARN-4304.patch, > 0005-YARN-4304.patch, 0006-YARN-4304.patch, 0007-YARN-4304.patch, > REST_and_UI.zip > > > As we are supporting per-partition level max AM resource percentage > configuration, UI and various metrics also need to display correct > configurations related to same. > For eg: Current UI still shows am-resource percentage per queue level. This > is to be updated correctly when label config is used. > - Display max-am-percentage per-partition in Scheduler UI (label also) and in > ClusterMetrics page > - Update queue/partition related metrics w.r.t per-partition > am-resource-percentage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4304: -- Attachment: (was: REST_and_UI.zip) > AM max resource configuration per partition to be displayed/updated correctly > in UI and in various partition related metrics > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > Attachments: 0001-YARN-4304.patch, 0002-YARN-4304.patch, > 0003-YARN-4304.patch, 0004-YARN-4304.patch, 0005-YARN-4304.patch, > 0005-YARN-4304.patch, 0006-YARN-4304.patch, 0007-YARN-4304.patch, > REST_and_UI.zip > > > As we are supporting per-partition level max AM resource percentage > configuration, UI and various metrics also need to display correct > configurations related to same. > For eg: Current UI still shows am-resource percentage per queue level. This > is to be updated correctly when label config is used. > - Display max-am-percentage per-partition in Scheduler UI (label also) and in > ClusterMetrics page > - Update queue/partition related metrics w.r.t per-partition > am-resource-percentage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4529) Yarn CLI killing applications in batch
[ https://issues.apache.org/jira/browse/YARN-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075689#comment-15075689 ] Naganarasimha G R commented on YARN-4529: - IMO both are useful and required: if I want to kill all accepted applications, I should not have to first get the list of accepted applications and then pass them one by one (as single or multiple params) to the CLI; and conversely, I should not have to run the kill CLI command multiple times when I have a bunch of application IDs I want to kill. Thoughts? At the same time, I feel the three points I mentioned above can also be considered. > Yarn CLI killing applications in batch > -- > > Key: YARN-4529 > URL: https://issues.apache.org/jira/browse/YARN-4529 > Project: Hadoop YARN > Issue Type: Improvement > Components: applications, client >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: YARN-4529.001.patch > > > We do not have a good way to kill applications conveniently when some apps start > unexpectedly. At present, we have to kill them one by one. We can add some > kill commands that can kill apps in batch, like these: > {code} > -killByAppStates The states of applications that will be killed. > -killByUser Kill running-state applications of a specific > user. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
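The selection logic behind the two proposed flags can be sketched as below. The record shape and method names are hypothetical (the patch's actual classes may differ); the point is simply that each flag is a different filter over the running applications:

```java
import java.util.*;
import java.util.stream.*;

// Sketch of batch-kill selection: -killByAppStates picks applications in
// any of the given states; -killByUser picks the running-state
// applications of one user.
public class BatchKillSelector {
    public static final class App {
        final String id, user, state;
        public App(String id, String user, String state) {
            this.id = id; this.user = user; this.state = state;
        }
    }

    public static List<String> killByAppStates(List<App> apps, Set<String> states) {
        return apps.stream().filter(a -> states.contains(a.state))
                   .map(a -> a.id).collect(Collectors.toList());
    }

    public static List<String> killByUser(List<App> apps, String user) {
        return apps.stream()
                   .filter(a -> a.user.equals(user) && a.state.equals("RUNNING"))
                   .map(a -> a.id).collect(Collectors.toList());
    }
}
```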
[jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
[ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075626#comment-15075626 ] Hong Zhiguo commented on YARN-4024: --- Thanks for your good point. Yes I can do it. Should I reopen this issue and post a new patch? > YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat > -- > > Key: YARN-4024 > URL: https://issues.apache.org/jira/browse/YARN-4024 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Hong Zhiguo > Fix For: 2.8.0 > > Attachments: YARN-4024-draft-v2.patch, YARN-4024-draft-v3.patch, > YARN-4024-draft.patch, YARN-4024-v4.patch, YARN-4024-v5.patch, > YARN-4024-v6.patch, YARN-4024-v7.patch > > > Currently, YARN RM NodesListManager will resolve IP address every time when > node doing heartbeat. When DNS server becomes slow, NM heartbeat will be > blocked and cannot make progress. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
[ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075619#comment-15075619 ] Ming Ma commented on YARN-4024: --- Thanks for the good improvement [~leftnoteasy], [~zhiguohong], [~sunilg], [~adhoot]! For cache timeout interval, should we change the semantics of -1 as "cache forever" and 0 as "no cache" to be more consistent with JVM setting "networkaddress.cache.ttl"? > YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat > -- > > Key: YARN-4024 > URL: https://issues.apache.org/jira/browse/YARN-4024 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Hong Zhiguo > Fix For: 2.8.0 > > Attachments: YARN-4024-draft-v2.patch, YARN-4024-draft-v3.patch, > YARN-4024-draft.patch, YARN-4024-v4.patch, YARN-4024-v5.patch, > YARN-4024-v6.patch, YARN-4024-v7.patch > > > Currently, YARN RM NodesListManager will resolve IP address every time when > node doing heartbeat. When DNS server becomes slow, NM heartbeat will be > blocked and cannot make progress. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
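The TTL semantics Ming Ma suggests, mirroring the JVM's networkaddress.cache.ttl convention, can be sketched as follows: -1 caches forever, 0 disables caching, and a positive value caches for that many milliseconds. Class and method names are illustrative, not the YARN patch's:

```java
// Minimal resolved-hostname cache honoring networkaddress.cache.ttl-style
// semantics. The clock is passed in so the behavior is deterministic.
public class ResolvedHostCache {
    private final long ttlMillis;          // -1 = forever, 0 = never, >0 = TTL
    private String cachedIp;
    private long resolvedAt;

    public ResolvedHostCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    /** Returns a cached IP while still valid, otherwise re-resolves via the callback. */
    public String resolve(String host,
                          java.util.function.Function<String, String> resolver,
                          long nowMillis) {
        boolean valid = cachedIp != null
            && (ttlMillis < 0 || (ttlMillis > 0 && nowMillis - resolvedAt < ttlMillis));
        if (!valid) {                      // ttl == 0 always falls through here
            cachedIp = resolver.apply(host);
            resolvedAt = nowMillis;
        }
        return cachedIp;
    }
}
```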
[jira] [Commented] (YARN-4195) Support of node-labels in the ReservationSystem "Plan"
[ https://issues.apache.org/jira/browse/YARN-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075607#comment-15075607 ] Subru Krishnan commented on YARN-4195: -- Thanks [~curino] for the patch. I looked at it (excluding the parts that will be addressed by YARN-4476 and YARN-4523) and had a few thoughts on the API changes: * Can we use _Resource_ instead of in _RMNodeLabel_ *PlanEdit::setTotalCapacity*? * We should deprecate _getTotalCommittedResources_ from *PlanView* in favor of _getAvailableResources_ based on the enhancements that were made in YARN-4358 * I feel we should have a single API for _getTotalCapacity, getEarliestStartTime, getLastEndTime_ etc in *PlanView* which takes in a node label. We could have a reserved keyword say ALL or * to specify that we want it to be aggregated across all labels (inc NO_LABEL) Other than that, please find minor comments below: * In *AbstractReservationSystem::initializePlan* use the preinitialized _UTCClock_ instead of creating one for every _Plan_ * The check for _user_ in *InMemoryPlan::incrementAllocation* can be made outside the for loops * The check for _node labels_ in *InMemoryPlan::incrementAllocation* can be made outside the inner for loop * Looks like there are minor formatting issues in *PlanView* > Support of node-labels in the ReservationSystem "Plan" > -- > > Key: YARN-4195 > URL: https://issues.apache.org/jira/browse/YARN-4195 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-4195.patch > > > As part of YARN-4193 we need to enhance the InMemoryPlan (and related > classes) to track the per-label available resources, as well as the per-label > reservation-allocations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
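The single-API suggestion above (one capacity getter taking a node label, with a reserved keyword for the all-labels aggregate) could look like this sketch. The class name, the "*" keyword, and the memory-only resource model are illustrative assumptions, not the actual PlanView interface:

```java
import java.util.*;

// Sketch: one getTotalCapacity(label) where "*" means "aggregate across
// all labels, including NO_LABEL" (modeled here as the empty string).
public class LabeledPlanView {
    private final Map<String, Long> memByLabel = new HashMap<>();

    public void setCapacity(String label, long memMb) {
        memByLabel.put(label, memMb);
    }

    public long getTotalCapacity(String label) {
        if ("*".equals(label)) {
            return memByLabel.values().stream().mapToLong(Long::longValue).sum();
        }
        return memByLabel.getOrDefault(label, 0L);
    }
}
```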
[jira] [Commented] (YARN-2885) Create AMRMProxy request interceptor for distributed scheduling decisions for queueable containers
[ https://issues.apache.org/jira/browse/YARN-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075600#comment-15075600 ] Arun Suresh commented on YARN-2885: --- Thanks for the review [~leftnoteasy], bq. Do you have real use case that distributed scheduler needs to set different properties such as DIST_SCHEDULING_MIN_MEMORY.. We had included this so that we can centrally (via the RM) control the max/min capability of a container that is allocatable via Distributed Scheduling.. which could be different from the Scheduler min/max. bq. First constructor of ApplicationMasterService, should use name .. Good catch.. will fix that bq. You can add a isDistributedSchedulingEnabled method to YarnConfiguration.. Actually, I had thought about that.. but if you notice, we are actually querying the configuration object only once in the serviceinit of the {{NodeManager}} and the value is passed on to all the other required classes via constructor etc.. So I did not find much value in adding that the YarnConfiguration. If you are ok with the rest of the patch and agree with the approach... I shall clean it up a bit more with proper javadocs etc. > Create AMRMProxy request interceptor for distributed scheduling decisions for > queueable containers > -- > > Key: YARN-2885 > URL: https://issues.apache.org/jira/browse/YARN-2885 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Arun Suresh > Attachments: YARN-2885-yarn-2877.001.patch, > YARN-2885-yarn-2877.002.patch, YARN-2885-yarn-2877.full-2.patch, > YARN-2885-yarn-2877.full-3.patch, YARN-2885-yarn-2877.full.patch, > YARN-2885-yarn-2877.v4.patch, YARN-2885_api_changes.patch > > > We propose to add a Local ResourceManager (LocalRM) to the NM in order to > support distributed scheduling decisions. > Architecturally we leverage the RMProxy, introduced in YARN-2884. 
> The LocalRM makes distributed decisions for queueable container requests. > Guaranteed-start requests are still handled by the central RM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4495) add a way to tell AM container increase/decrease request is invalid
[ https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075570#comment-15075570 ] sandflee commented on YARN-4495: thanks [~wangda], hoping for more suggestions > add a way to tell AM container increase/decrease request is invalid > --- > > Key: YARN-4495 > URL: https://issues.apache.org/jira/browse/YARN-4495 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: sandflee > Attachments: YARN-4495.01.patch > > > Currently the RM may pass an InvalidResourceRequestException to the AM or just ignore the > change request; the former will bring AMRMClientAsync down, and the latter > will leave the AM waiting for the reply. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4510) Fix SLS startup failure caused by NPE
[ https://issues.apache.org/jira/browse/YARN-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075530#comment-15075530 ] Hudson commented on YARN-4510: -- FAILURE: Integrated in Hadoop-trunk-Commit #9041 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9041/]) YARN-4510. Fix SLS startup failure caused by NPE. (Bibin A Chundatt via (wangda: rev a9594c61bb8ca9e61e367988d3012a4615026090) * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java * hadoop-yarn-project/CHANGES.txt > Fix SLS startup failure caused by NPE > - > > Key: YARN-4510 > URL: https://issues.apache.org/jira/browse/YARN-4510 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Fix For: 2.8.0 > > Attachments: 0001-YARN-4510.patch, 0002-YARN-4510.patch > > > Configure Fair scheduler in yarn site > Start SLS check cluster apps page > {noformat} > 15/12/27 07:40:22 INFO scheduler.SchedulerNode: Assigned container > container_1451182067412_0002_01_000258 of capacity on > host a2117.smile.com:2, which has 10 containers, > used and available after allocation > 15/12/27 07:40:22 ERROR webapp.Dispatcher: error handling URI: /cluster/apps > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getApplicationAttempt(AbstractYarnScheduler.java:299) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppAttemptBlock.getBlacklistedNodes(RMAppAttemptBlock.java:260) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppsBlock.renderData(RMAppsBlock.java:99) > at > org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:140) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > 
org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:235) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock$Block.subView(HtmlBlock.java:43) > at org.apache.hadoop.yarn.webapp.hamlet.Hamlet._(Hamlet.java:30347) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.AppsBlockWithMetrics.render(AppsBlockWithMetrics.java:30) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:235) > at > org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) > at > org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117) > at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845) > at > org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71) > at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82) > at org.apache.hadoop.yarn.webapp.Dispatcher.render(Dispatcher.java:197) > at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:156) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) > at > com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) > at > com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) > at > com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) > {noformat} > Confgure Capacity scheduler and SLS start up > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1038) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1319) > at > org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:252) > at > 
org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:82) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:691) > at java.lang.Thread.run(Thread.java:745) > {noformat} > SLS failed to start -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4524) Cleanup AppSchedulingInfo
[ https://issues.apache.org/jira/browse/YARN-4524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075529#comment-15075529 ] Hudson commented on YARN-4524: -- FAILURE: Integrated in Hadoop-trunk-Commit #9041 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9041/]) YARN-4524. Cleanup AppSchedulingInfo. (Karthik Kambatla via wangda) (wangda: rev 4e4b3a8465a8433e78e015cb1ce7e0dc1ebeb523) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java > Cleanup AppSchedulingInfo > - > > Key: YARN-4524 > URL: https://issues.apache.org/jira/browse/YARN-4524 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Fix For: 2.9.0 > > Attachments: yarn-4524-1.patch, yarn-4524-2.patch > > > The AppSchedulingInfo class has become very hard to grok with some pretty > long methods. It needs some cleaning up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4522) Queue acl can be checked at app submission
[ https://issues.apache.org/jira/browse/YARN-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075525#comment-15075525 ] Hudson commented on YARN-4522: -- FAILURE: Integrated in Hadoop-trunk-Commit #9040 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9040/]) YARN-4522. Queue acl can be checked at app submission. (Jian He via (wangda: rev 8310b2e9ff3d6804bad703c4c15458b0dfeeb4af) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesAppsModification.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/AMSimulator.java > Queue acl can be checked at app submission > -- > > Key: YARN-4522 > URL: https://issues.apache.org/jira/browse/YARN-4522 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Fix For: 2.9.0 > > Attachments: YARN-4522.1.patch, 
YARN-4522.2.patch, YARN-4522.3.patch, > YARN-4522.4.patch > > > Queue acl check is currently asynchronously done at > CapacityScheduler#addApplication, this could be done right at submission. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4510) Fix SLS startup failure caused by NPE
[ https://issues.apache.org/jira/browse/YARN-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4510: - Summary: Fix SLS startup failure caused by NPE (was: SLS startup failure and webpage broken) > Fix SLS startup failure caused by NPE > - > > Key: YARN-4510 > URL: https://issues.apache.org/jira/browse/YARN-4510 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: 0001-YARN-4510.patch, 0002-YARN-4510.patch > > > Configure Fair scheduler in yarn site > Start SLS check cluster apps page > {noformat} > 15/12/27 07:40:22 INFO scheduler.SchedulerNode: Assigned container > container_1451182067412_0002_01_000258 of capacity on > host a2117.smile.com:2, which has 10 containers, > used and available after allocation > 15/12/27 07:40:22 ERROR webapp.Dispatcher: error handling URI: /cluster/apps > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getApplicationAttempt(AbstractYarnScheduler.java:299) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppAttemptBlock.getBlacklistedNodes(RMAppAttemptBlock.java:260) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppsBlock.renderData(RMAppsBlock.java:99) > at > org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:140) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:235) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock$Block.subView(HtmlBlock.java:43) > at org.apache.hadoop.yarn.webapp.hamlet.Hamlet._(Hamlet.java:30347) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.AppsBlockWithMetrics.render(AppsBlockWithMetrics.java:30) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > 
org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:235) > at > org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) > at > org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117) > at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845) > at > org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71) > at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82) > at org.apache.hadoop.yarn.webapp.Dispatcher.render(Dispatcher.java:197) > at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:156) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) > at > com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) > at > com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) > at > com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) > {noformat} > Confgure Capacity scheduler and SLS start up > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1038) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1319) > at > org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:252) > at > org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:82) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:691) > at java.lang.Thread.run(Thread.java:745) > {noformat} > SLS failed to start -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4516) [YARN-3368] Use em-table to better render tables
[ https://issues.apache.org/jira/browse/YARN-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075509#comment-15075509 ] Li Lu commented on YARN-4516: - [~Sreenath] Never mind, I figured out how to feed models to an em-table by myself. I'll try to integrate em-table as a part of our existing tables in components, starting with the flow table in timeline v2. > [YARN-3368] Use em-table to better render tables > > > Key: YARN-4516 > URL: https://issues.apache.org/jira/browse/YARN-4516 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Li Lu > > Currently we're using DataTables, which isn't integrated with Ember.js very well. > Instead we can use em-table (see https://github.com/sreenaths/em-table/wiki, > which was created for the Tez UI). It supports features such as selectable > columns, pagination, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4522) Queue acl can be checked at app submission
[ https://issues.apache.org/jira/browse/YARN-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075505#comment-15075505 ] Hadoop QA commented on YARN-4522: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 30s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 53s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 37s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 1s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 42s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 46s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 46s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 44s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 1s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 20s {color} | {color:red} hadoop-yarn-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 15s {color} | {color:red} hadoop-yarn-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 15s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 15s {color} | {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s {color} | {color:green} hadoop-sls in the patch passed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 31s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 18s {color} | {color:red} hadoop-yarn-client in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s {color} | {color:green} hadoop-sls in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 178m 4
[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications
[ https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075452#comment-15075452 ] Jian He commented on YARN-4479: --- Sorry, I missed this part: the recovered app needs to be respected first only for the LeafQueue#pendingOrderingPolicy, right? For LeafQueue#orderingPolicy, this is not needed. bq. Reference test case TestRMRestart#testRMRestartAppRunningAMFailed I don't understand how this test case is related. > Retrospect app-priority in pendingOrderingPolicy during recovering > applications > --- > > Key: YARN-4479 > URL: https://issues.apache.org/jira/browse/YARN-4479 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, > 0003-YARN-4479.patch > > > Currently, the same ordering policy is used for pending applications and active > applications. When priority is configured for applications, during > recovery the high-priority application gets activated first, even if a > low-priority job was submitted earlier and was in the running state. > This causes the low-priority job to starve after recovery -- This message was sent by Atlassian JIRA (v6.3.4#6332)
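The ordering being discussed, recovered applications activated before priority comparison kicks in, can be sketched with a comparator. All class and field names below are illustrative assumptions, not the actual LeafQueue code:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of a pending-ordering policy that respects recovered apps first
// (preserving their pre-restart running state) and only then compares
// priority. Names here are assumptions for illustration.
class PendingApp {
    final String id;
    final int priority;      // higher value = higher priority
    final boolean recovered; // was running before RM restart

    PendingApp(String id, int priority, boolean recovered) {
        this.id = id;
        this.priority = priority;
        this.recovered = recovered;
    }
}

class RecoveryAwareOrdering {
    // Recovered apps sort first (false < true for !recovered),
    // then descending priority breaks ties among non-recovered apps.
    static final Comparator<PendingApp> PENDING_ORDER =
        Comparator.comparing((PendingApp a) -> !a.recovered)
                  .thenComparing(a -> -a.priority);

    static List<String> activationOrder(List<PendingApp> apps) {
        List<PendingApp> sorted = new ArrayList<>(apps);
        sorted.sort(PENDING_ORDER);
        List<String> ids = new ArrayList<>();
        for (PendingApp a : sorted) ids.add(a.id);
        return ids;
    }
}
```

Per the comment above, such a policy would apply only to the pending ordering used during activation, not to the ordering of already-active applications.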
[jira] [Commented] (YARN-4172) Extend DominantResourceCalculator to account for all resources
[ https://issues.apache.org/jira/browse/YARN-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075384#comment-15075384 ] Wangda Tan commented on YARN-4172: -- Hi [~vvasudev], Patch generally looks good. Is there any issue or concern with the force push? I can help you do the force push if you want. > Extend DominantResourceCalculator to account for all resources > -- > > Key: YARN-4172 > URL: https://issues.apache.org/jira/browse/YARN-4172 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4172-YARN-3926.001.patch, > YARN-4172-YARN-3926.002.patch > > > Now that support for multiple resources is present in the resource class, we > need to modify DominantResourceCalculator to account for the new resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4495) add a way to tell AM container increase/decrease request is invalid
[ https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075380#comment-15075380 ] Wangda Tan commented on YARN-4495: -- Thanks for the investigation, [~sandflee]. I feel we should make this JIRA more generically helpful: - Currently, any issue with a regular resource request becomes an InvalidResourceRequestException, but the AM doesn't know which resource request failed. The same applies to change-container resource requests. - Some resource requests / change-container resource requests could be rejected after they are added to the scheduler. (For example, a change to a queue's accessible-node-labels could turn an originally valid resource request invalid.) - Maybe we can add a "RejectedResourceRequest" proto to AllocateResponse that contains the list of rejected regular resource requests and the list of rejected increase/decrease container requests. Since this is an API change, please expect that more time/discussion is required to settle it down. Set target version to 2.9. > add a way to tell AM container increase/decrease request is invalid > --- > > Key: YARN-4495 > URL: https://issues.apache.org/jira/browse/YARN-4495 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: sandflee > Attachments: YARN-4495.01.patch > > > Now the RM may pass an InvalidResourceRequestException to the AM or just ignore the > change request; the former will bring AMRMClientAsync down, and the latter > will leave the AM waiting for the reply. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
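The proposal above, returning rejected requests in the allocate response instead of throwing back at the AM, could take roughly the following shape. Everything here is a hypothetical sketch of the suggested API; RejectedResourceRequest and these accessors do not exist in YARN at this point:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical shape of the proposed addition: the RM records rejections
// and returns them in the response, so the AM can react per request rather
// than dying on an InvalidResourceRequestException or waiting forever.
class RejectedResourceRequest {
    final String requestId; // identifies which request was rejected
    final String reason;    // why the scheduler rejected it

    RejectedResourceRequest(String requestId, String reason) {
        this.requestId = requestId;
        this.reason = reason;
    }
}

class AllocateResponseSketch {
    private final List<RejectedResourceRequest> rejected = new ArrayList<>();

    void reject(String requestId, String reason) {
        rejected.add(new RejectedResourceRequest(requestId, reason));
    }

    // The AM inspects this list on each allocate heartbeat.
    List<RejectedResourceRequest> getRejectedResourceRequests() {
        return Collections.unmodifiableList(rejected);
    }
}
```

This also covers the second bullet: a request that became invalid after being added to the scheduler (e.g. after a label change) can be reported on a later heartbeat rather than only at submission time.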
[jira] [Updated] (YARN-4495) add a way to tell AM container increase/decrease request is invalid
[ https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4495: - Target Version/s: 2.9.0 > add a way to tell AM container increase/decrease request is invalid > --- > > Key: YARN-4495 > URL: https://issues.apache.org/jira/browse/YARN-4495 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: sandflee > Attachments: YARN-4495.01.patch > > > Now the RM may pass an InvalidResourceRequestException to the AM or just ignore the > change request; the former will bring AMRMClientAsync down, and the latter > will leave the AM waiting for the reply. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075367#comment-15075367 ] Wangda Tan commented on YARN-4304: -- Hi [~sunilg], Thanks for replying, makes sense to me. Is there any update for the REST response? Could you please upload a new REST response/screenshot if there are any changes to them? > AM max resource configuration per partition to be displayed/updated correctly > in UI and in various partition related metrics > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > Attachments: 0001-YARN-4304.patch, 0002-YARN-4304.patch, > 0003-YARN-4304.patch, 0004-YARN-4304.patch, 0005-YARN-4304.patch, > 0005-YARN-4304.patch, 0006-YARN-4304.patch, 0007-YARN-4304.patch, > REST_and_UI.zip > > > As we are supporting per-partition level max AM resource percentage > configuration, UI and various metrics also need to display the correct > configurations related to the same. > For eg: Current UI still shows am-resource percentage at the queue level. This > is to be updated correctly when label config is used. > - Display max-am-percentage per-partition in Scheduler UI (label also) and in > ClusterMetrics page > - Update queue/partition related metrics w.r.t per-partition > am-resource-percentage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4522) Queue acl can be checked at app submission
[ https://issues.apache.org/jira/browse/YARN-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-4522: -- Attachment: YARN-4522.4.patch > Queue acl can be checked at app submission > -- > > Key: YARN-4522 > URL: https://issues.apache.org/jira/browse/YARN-4522 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-4522.1.patch, YARN-4522.2.patch, YARN-4522.3.patch, > YARN-4522.4.patch > > > Queue acl check is currently asynchronously done at > CapacityScheduler#addApplication, this could be done right at submission. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4510) SLS startup failure and webpage broken
[ https://issues.apache.org/jira/browse/YARN-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075330#comment-15075330 ] Wangda Tan commented on YARN-4510: -- Looks good, +1, thanks [~bibinchundatt]. > SLS startup failure and webpage broken > -- > > Key: YARN-4510 > URL: https://issues.apache.org/jira/browse/YARN-4510 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: 0001-YARN-4510.patch, 0002-YARN-4510.patch > > > Configure Fair scheduler in yarn site > Start SLS check cluster apps page > {noformat} > 15/12/27 07:40:22 INFO scheduler.SchedulerNode: Assigned container > container_1451182067412_0002_01_000258 of capacity on > host a2117.smile.com:2, which has 10 containers, > used and available after allocation > 15/12/27 07:40:22 ERROR webapp.Dispatcher: error handling URI: /cluster/apps > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getApplicationAttempt(AbstractYarnScheduler.java:299) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppAttemptBlock.getBlacklistedNodes(RMAppAttemptBlock.java:260) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppsBlock.renderData(RMAppsBlock.java:99) > at > org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:140) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:235) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock$Block.subView(HtmlBlock.java:43) > at org.apache.hadoop.yarn.webapp.hamlet.Hamlet._(Hamlet.java:30347) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.AppsBlockWithMetrics.render(AppsBlockWithMetrics.java:30) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > 
org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:235) > at > org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) > at > org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117) > at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845) > at > org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71) > at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82) > at org.apache.hadoop.yarn.webapp.Dispatcher.render(Dispatcher.java:197) > at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:156) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) > at > com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) > at > com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) > at > com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) > {noformat} > Configure Capacity scheduler and SLS start up > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1038) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1319) > at > org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:252) > at > org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:82) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:691) > at java.lang.Thread.run(Thread.java:745) > {noformat} > SLS failed to start -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4522) Queue acl can be checked at app submission
[ https://issues.apache.org/jira/browse/YARN-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075331#comment-15075331 ] Hadoop QA commented on YARN-4522: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 25s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 22s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 22s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 25s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 25s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 29s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 27s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 24s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 26s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 16m 59s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12780030/YARN-4522.3.patch | | JIRA Issue | YARN-4522 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 21e1a9739062 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT
[jira] [Commented] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first
[ https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075329#comment-15075329 ] Arun Suresh commented on YARN-1014: --- Thanks for taking this up [~kasha].. I'd think this would be generally useful in trunk, but w.r.t YARN-2877, we were actually planning on getting the ContainersMonitor to kill/preempt opportunistic containers. This way, we can probably have more control over which containers are to be killed. [~kkaranasos], thoughts ? > Configure OOM Killer to kill OPPORTUNISTIC containers first > --- > > Key: YARN-1014 > URL: https://issues.apache.org/jira/browse/YARN-1014 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Arun C Murthy >Assignee: Karthik Kambatla > > YARN-2882 introduces the notion of OPPORTUNISTIC containers. These containers > should be killed first should the system run out of memory. > - > Previous description: > Once RM allocates 'speculative containers' we need to get LCE to schedule > them at lower priorities via cgroups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4526) Make SystemClock singleton so AppSchedulingInfo could use it
[ https://issues.apache.org/jira/browse/YARN-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075322#comment-15075322 ] Arun Suresh commented on YARN-4526: --- Agreed, it does seem wasteful... Thanks for the patch [~kasha].. I was just wondering though: instead of passing the clock instance in the constructor and setting it to a field, why not do away with the {{this.clock}} field itself and just use {{SystemClock.getInstance()}} whenever you need it? I understand the constructor injection was probably put there for improved testability (I haven't seen it used though, but I might be missing something). But wouldn't a {{SystemClock.setInstance(clock)}} with a *\@VisibleForTesting* tag, which you set per test case, serve the same purpose ? > Make SystemClock singleton so AppSchedulingInfo could use it > > > Key: YARN-4526 > URL: https://issues.apache.org/jira/browse/YARN-4526 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-4526-1.patch > > > To track the time a request is received, we need to get current system time. > For better testability of this, we are likely better off using a Clock > instance that uses SystemClock by default. Instead of creating umpteen > instances of SystemClock, we should just reuse the same instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
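The singleton-plus-test-override pattern discussed in the comment above can be sketched as follows. Method names mirror the comment (getInstance/setInstance); the real SystemClock in Hadoop may look different:

```java
// Minimal sketch of a singleton clock with a test-only override hook, as
// proposed in the comment above. Illustrative only, not the actual Hadoop
// SystemClock implementation.
class SingletonClock {
    private static SingletonClock instance = new SingletonClock();

    static SingletonClock getInstance() { return instance; }

    // @VisibleForTesting-style hook: a test installs a fake clock once, and
    // every caller of getInstance() picks it up without constructor injection.
    static void setInstance(SingletonClock clock) { instance = clock; }

    long getTime() { return System.currentTimeMillis(); }
}

// A fake clock a test case could install via setInstance().
class FixedClock extends SingletonClock {
    @Override long getTime() { return 42L; }
}
```

The trade-off versus constructor injection is global mutable state: a test that forgets to restore the original instance can leak its fake clock into later tests, so a per-test-case reset matters.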
[jira] [Commented] (YARN-4522) Queue acl can be checked at app submission
[ https://issues.apache.org/jira/browse/YARN-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075304#comment-15075304 ] Sunil G commented on YARN-4522: --- Thanks [~jianhe]. Latest patch looks fine. > Queue acl can be checked at app submission > -- > > Key: YARN-4522 > URL: https://issues.apache.org/jira/browse/YARN-4522 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-4522.1.patch, YARN-4522.2.patch, YARN-4522.3.patch > > > Queue acl check is currently asynchronously done at > CapacityScheduler#addApplication, this could be done right at submission. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4522) Queue acl can be checked at app submission
[ https://issues.apache.org/jira/browse/YARN-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-4522: -- Attachment: YARN-4522.3.patch > Queue acl can be checked at app submission > -- > > Key: YARN-4522 > URL: https://issues.apache.org/jira/browse/YARN-4522 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-4522.1.patch, YARN-4522.2.patch, YARN-4522.3.patch > > > Queue acl check is currently asynchronously done at > CapacityScheduler#addApplication, this could be done right at submission. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4522) Queue acl can be checked at app submission
[ https://issues.apache.org/jira/browse/YARN-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075296#comment-15075296 ] Jian He commented on YARN-4522: --- Hi [~sunilg], thanks for your review ! bq. Earlier in logs we could see AccessControlException, is this intentional? I changed it to throw AccessControlException, which is more consistent with the remaining methods in ClientRMService. bq. earlier we were storing APP_REJECTED cases in StateStore Earlier, the app would be stored in the RM's memory and stored as failed; now the app will not be present in the RM's memory because the call fails upfront. This is what validateAndCreateResourceRequest does too. > Queue acl can be checked at app submission > -- > > Key: YARN-4522 > URL: https://issues.apache.org/jira/browse/YARN-4522 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-4522.1.patch, YARN-4522.2.patch > > > Queue acl check is currently asynchronously done at > CapacityScheduler#addApplication, this could be done right at submission. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
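The behavior change described above, rejecting synchronously at submission so the app never reaches the RM's memory or the state store, can be sketched in a simplified standalone form. Class and method names here are illustrative, not the actual ClientRMService/QueueACLsManager code:

```java
import java.util.Map;
import java.util.Set;

// Simplified sketch of a synchronous queue-ACL check at submission time.
// Throwing here means the app is rejected before it is ever stored, unlike
// the earlier asynchronous check that stored the app and then failed it.
class QueueAclChecker {
    private final Map<String, Set<String>> submitAcls; // queue -> allowed users

    QueueAclChecker(Map<String, Set<String>> submitAcls) {
        this.submitAcls = submitAcls;
    }

    // Stand-in for an access-control exception thrown back to the client.
    void checkSubmit(String user, String queue) {
        Set<String> allowed = submitAcls.get(queue);
        if (allowed == null || !allowed.contains(user)) {
            throw new SecurityException(
                "User " + user + " cannot submit applications to queue " + queue);
        }
    }
}
```

A submission path would call checkSubmit() first and only create/store the application object if no exception was thrown, which is the same shape as the existing validateAndCreateResourceRequest check mentioned above.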
[jira] [Commented] (YARN-4497) RM might fail to restart when recovering apps whose attempts are missing
[ https://issues.apache.org/jira/browse/YARN-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075257#comment-15075257 ] Hadoop QA commented on YARN-4497: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 15s {color} | {color:red} Patch generated 1 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager (total was 241, now 241). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 42s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 13s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 146m 12s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | JDK v1.7.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12779998/YARN-4497.01
[jira] [Commented] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075232#comment-15075232 ] Sunil G commented on YARN-4304: --- Hi [~leftnoteasy] Thank you very much for sharing the comments. I have a few doubts on the same. bq.Could you update user am-limit (in User.resourceUsage) as well when computing user am-limit at activateApplications? We are already doing this in activateApplications as below. {noformat} user.getResourceUsage().incAMUsed(partitionName, application.getAMResource(partitionName)); user.getResourceUsage().setAMLimit(partitionName, userAMLimit); {noformat} Now if we need to do this in {{getUserAMResourceLimitPerPartition}}, we need to pass the username as well. This user name comes from {{application}}: {code} // Check user am resource limit User user = getUser(application.getUser()); {code} Hence: 1. We cannot pre-compute the user am-limit like we have done for the am-limit before the pendingOrderingPolicy loop in {{activateApplications}}. 2. Since we compute this limit every time, we store it in {{user.getResourceUsage()}}, and I thought of reusing this. However, there can be zero, one, or multiple users for one queue, so getting the correct user is not really predictable for all getters (we do not supply any user name in getAMUserLimit/Partition now; even if we take the first user, there can be cases with zero users). Thoughts? bq. CapacitySchedulerPage, Instead of getAMResourceLimit() , Shouldn't you use getAMResourceLimit(partition)? {code} PartitionResourcesInfo resourceUsages = lqinfo.getResources().getPartitionResourceUsageInfo(label); // Get UserInfo from first user to calculate AM Resource Limit per user. ResourceInfo userAMResourceLimit = null; ArrayList usersList = lqinfo.getUsers().getUsersList(); if (usersList.isEmpty()) { // If no users are present, consider AM Limit for that queue. 
userAMResourceLimit = resourceUsages.getAMResourceLimit(); } {code} Here {{resourceUsages}} is already fetched for the specific label. Hence I think we do not need a per-label am-limit. > AM max resource configuration per partition to be displayed/updated correctly > in UI and in various partition related metrics > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > Attachments: 0001-YARN-4304.patch, 0002-YARN-4304.patch, > 0003-YARN-4304.patch, 0004-YARN-4304.patch, 0005-YARN-4304.patch, > 0005-YARN-4304.patch, 0006-YARN-4304.patch, 0007-YARN-4304.patch, > REST_and_UI.zip > > > As we are supporting per-partition max AM resource percentage > configuration, the UI and various metrics also need to display the correct > configurations related to the same. > For example, the current UI still shows the am-resource percentage at the queue level. This > is to be updated correctly when label config is used. > - Display max-am-percentage per-partition in Scheduler UI (label also) and in > ClusterMetrics page > - Update queue/partition related metrics w.r.t per-partition > am-resource-percentage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
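The flow Sunil describes, computing the per-user AM limit inside the activation loop (where the user is only known from the application) and caching it on the user's resource usage, can be sketched roughly as follows. All class and method names here are simplified stand-ins for the CapacityScheduler code, not the actual implementation:

```java
import java.util.HashMap;
import java.util.Map;

public class UserAmLimitSketch {
    // Stand-ins for the per-user state kept in user.getResourceUsage().
    static Map<String, Long> userAmLimit = new HashMap<>();  // setAMLimit(...)
    static Map<String, Long> userAmUsed = new HashMap<>();   // incAMUsed(...)

    // Called per pending application inside the activation loop; the user
    // name is only available here, taken from the application itself.
    static boolean tryActivate(String user, long amResource, long computedUserAmLimit) {
        // Cache the freshly computed limit so UI/metrics getters can reuse it.
        userAmLimit.put(user, computedUserAmLimit);
        long used = userAmUsed.getOrDefault(user, 0L);
        if (used + amResource > computedUserAmLimit) {
            return false;  // would exceed the user's AM limit: keep pending
        }
        userAmUsed.put(user, used + amResource);
        return true;
    }
}
```

This also illustrates why a getter with no user argument is unreliable: the cache is keyed by user, and a queue may have zero, one, or many users.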
[jira] [Commented] (YARN-4529) Yarn CLI killing applications in batch
[ https://issues.apache.org/jira/browse/YARN-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075228#comment-15075228 ] Hadoop QA commented on YARN-4529: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 9s {color} | {color:red} Patch generated 4 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client (total was 15, now 18). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 13s {color} | {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_66. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 23s {color} | {color:red} hadoop-yarn-client in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 142m 33s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_66 Failed junit tests | hadoop.yarn.client.TestGetGroups | | JDK v1.8.0_66 Timed out junit tests | org.apache.hadoop.yarn.client.cli.TestYarnCLI | | | org.apache.hadoop.yarn.client.api.impl.TestYarnClient | | | org.apache.hadoop.yarn.client.api.impl.TestAMRMClient | | | org.apache.hadoop.yarn.client.api.impl.TestNMClient | | JDK v1.7.0_91 Failed junit tests | hadoop.yarn.client.TestGetGroups | | JDK v1.7.0_91 Timed out junit tests | org.apache.hadoop.yarn.client.cli.TestYarnCLI | | | org.apache.hadoop.yarn.client.api.impl.Te
[jira] [Commented] (YARN-4029) Update LogAggregationStatus to store on finish
[ https://issues.apache.org/jira/browse/YARN-4029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075192#comment-15075192 ] Hadoop QA commented on YARN-4029: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 3s {color} | {color:red} YARN-4029 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12765996/0003-YARN-4029.patch | | JIRA Issue | YARN-4029 | | Powered by | Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10128/console | This message was automatically generated. > Update LogAggregationStatus to store on finish > -- > > Key: YARN-4029 > URL: https://issues.apache.org/jira/browse/YARN-4029 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: 0001-YARN-4029.patch, 0002-YARN-4029.patch, > 0003-YARN-4029.patch, Image.jpg > > > Currently the log aggregation status is not getting updated to Store. When RM > is restarted will show NOT_START. > Steps to reproduce > > 1.Submit mapreduce application > 2.Wait for completion > 3.Once application is completed switch RM > *Log Aggregation Status* are changing > *Log Aggregation Status* from SUCCESS to NOT_START -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4522) Queue acl can be checked at app submission
[ https://issues.apache.org/jira/browse/YARN-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075160#comment-15075160 ] Sunil G commented on YARN-4522: --- Also one more point: earlier we were storing APP_REJECTED cases in the StateStore. Please correct me if I am wrong. So now this will not be happening, correct? > Queue acl can be checked at app submission > -- > > Key: YARN-4522 > URL: https://issues.apache.org/jira/browse/YARN-4522 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-4522.1.patch, YARN-4522.2.patch > > > Queue acl check is currently done asynchronously at > CapacityScheduler#addApplication; this could be done right at submission. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4522) Queue acl can be checked at app submission
[ https://issues.apache.org/jira/browse/YARN-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075153#comment-15075153 ] Sunil G commented on YARN-4522: --- Hi [~jianhe] With this patch, we now throw back {{YarnException}}. Earlier in the logs we could see {{AccessControlException}}. Is this intentional? > Queue acl can be checked at app submission > -- > > Key: YARN-4522 > URL: https://issues.apache.org/jira/browse/YARN-4522 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Jian He > Attachments: YARN-4522.1.patch, YARN-4522.2.patch > > > Queue acl check is currently done asynchronously at > CapacityScheduler#addApplication; this could be done right at submission. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4029) Update LogAggregationStatus to store on finish
[ https://issues.apache.org/jira/browse/YARN-4029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075151#comment-15075151 ] Rohith Sharma K S commented on YARN-4029: - Thanks [~bibinchundatt] for providing the patch; overall the patch looks good. Some nits: # LogAggStatusProto values can be defined without the *LOGAGG_* prefix so that the PBImpl need not have logic to add the prefix and strip it. > Update LogAggregationStatus to store on finish > -- > > Key: YARN-4029 > URL: https://issues.apache.org/jira/browse/YARN-4029 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: 0001-YARN-4029.patch, 0002-YARN-4029.patch, > 0003-YARN-4029.patch, Image.jpg > > > Currently the log aggregation status is not getting updated to the Store. When the RM > is restarted, it will show NOT_START. > Steps to reproduce > > 1. Submit a MapReduce application > 2. Wait for completion > 3. Once the application is completed, switch the RM > The *Log Aggregation Status* changes > from SUCCESS to NOT_START -- This message was sent by Atlassian JIRA (v6.3.4#6332)
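Rohith's nit about the *LOGAGG_* prefix refers to conversion logic along the lines of the following sketch. The enum values here are illustrative, not the exact proto definitions; renaming the proto values to match the API enum would let the PBImpl use a plain valueOf with no prefix handling:

```java
public class LogAggStatusConversion {
    // Assumed proto-side values, carrying the LOGAGG_ prefix.
    enum LogAggregationStatusProto { LOGAGG_NOT_START, LOGAGG_RUNNING, LOGAGG_SUCCEEDED, LOGAGG_FAILED }
    // API-side values, without the prefix.
    enum LogAggregationStatus { NOT_START, RUNNING, SUCCEEDED, FAILED }

    static final String PREFIX = "LOGAGG_";

    // The prefix juggling the review comment suggests eliminating:
    static LogAggregationStatusProto toProto(LogAggregationStatus s) {
        return LogAggregationStatusProto.valueOf(PREFIX + s.name());
    }

    static LogAggregationStatus fromProto(LogAggregationStatusProto p) {
        return LogAggregationStatus.valueOf(p.name().substring(PREFIX.length()));
    }
}
```

With matching value names on both sides, each method collapses to a single valueOf(name()) call.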
[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications
[ https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075149#comment-15075149 ] Sunil G commented on YARN-4479: --- Sorry, I didn't mean FairScheduler; I was trying to refer to {{FairOrderingPolicy}}. > Retrospect app-priority in pendingOrderingPolicy during recovering > applications > --- > > Key: YARN-4479 > URL: https://issues.apache.org/jira/browse/YARN-4479 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, > 0003-YARN-4479.patch > > > Currently, the same ordering policy is used for pending applications and active > applications. When priority is configured for applications, during > recovery high-priority applications get activated first. It is possible that a > low-priority job was submitted and in running state. > This causes the low-priority job to starve after recovery -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4527) Possible thread leak if TimelineClient.start() get called multiple times.
[ https://issues.apache.org/jira/browse/YARN-4527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075148#comment-15075148 ] Hadoop QA commented on YARN-4527: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 46s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | 
{color:green} javadoc {color} | {color:green} 0m 38s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 3s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 16s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 34m 39s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/1277/YARN-4527.patch | | JIRA Issue | YARN-4527 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 8609c60b0014 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personalit
[jira] [Commented] (YARN-1382) Remove unusableRMNodesConcurrentSet (never used) in NodeListManager to get rid of memory leak
[ https://issues.apache.org/jira/browse/YARN-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075139#comment-15075139 ] Hudson commented on YARN-1382: -- FAILURE: Integrated in Hadoop-trunk-Commit #9038 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9038/]) YARN-1382. Remove unusableRMNodesConcurrentSet (never used) in (junping_du: rev 223ce323bb81463ec5c5ac7316738370d4a47366) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java * hadoop-yarn-project/CHANGES.txt > Remove unusableRMNodesConcurrentSet (never used) in NodeListManager to get > rid of memory leak > - > > Key: YARN-1382 > URL: https://issues.apache.org/jira/browse/YARN-1382 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.2.0, 2.7.1, 2.6.2 >Reporter: Alejandro Abdelnur >Assignee: Rohith Sharma K S > Fix For: 2.8.0 > > Attachments: 0001-YARN-1382.patch, 0002-YARN-1382.patch, > 0003-YARN-1382.patch > > > If a node is in the unusable nodes set (unusableRMNodesConcurrentSet) and > never comes back, the node will be there forever. > While the leak is not big, it gets aggravated if the NM addresses are > configured with ephemeral ports as when the nodes come back they come back as > new. > Some related details in YARN-1343 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4506) Application was killed by a resourcemanager, In the JobHistory Can't see the job detail
[ https://issues.apache.org/jira/browse/YARN-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075140#comment-15075140 ] tangshangwen commented on YARN-4506: Ok, I'll try to fix it > Application was killed by a resourcemanager, In the JobHistory Can't see the > job detail > --- > > Key: YARN-4506 > URL: https://issues.apache.org/jira/browse/YARN-4506 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: tangshangwen > Attachments: am.rar > > > 2015-12-15 03:08:54,073 INFO [Thread-1] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster received a > signal. Signaling RMCommunicator and JobHistoryEventHandler. > 2015-12-15 03:08:54,073 INFO [Thread-1] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: RMCommunicator > notified that iSignalled is: true > 2015-12-15 03:08:54,073 INFO [Thread-1] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify RMCommunicator > isAMLastRetry: true > 2015-12-15 03:08:54,073 INFO [Thread-1] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: RMCommunicator > notified that shouldUnregistered is: true > 2015-12-15 03:08:54,073 INFO [Thread-1] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify JHEH isAMLastRetry: > true > 2015-12-15 03:08:54,073 INFO [Thread-1] > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: > JobHistoryEventHandler notified that forceJobCompletion is true > 2015-12-15 03:08:54,074 INFO [Thread-1] > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopping > JobHistoryEventHandler. Size of the outstanding queue size is 0 > 2015-12-15 03:08:54,074 INFO [eventHandlingThread] > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: EventQueue > take interrupted. Returning > 2015-12-15 03:08:54,078 WARN [Thread-1] > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Found jobId > job_1449835724839_219910 to have not been closed. 
Will close -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications
[ https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075138#comment-15075138 ] Sunil G commented on YARN-4479: --- Hi [~rohithsharma] This new fix will also introduce RecoveryComparator to FairOrderingPolicy. Is it needed? I think it can be tracked separately, after checking whether the same problem will arise with FairScheduler. > Retrospect app-priority in pendingOrderingPolicy during recovering > applications > --- > > Key: YARN-4479 > URL: https://issues.apache.org/jira/browse/YARN-4479 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, > 0003-YARN-4479.patch > > > Currently, the same ordering policy is used for pending applications and active > applications. When priority is configured for applications, during > recovery high-priority applications get activated first. It is possible that a > low-priority job was submitted and in running state. > This causes the low-priority job to starve after recovery -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4506) Application was killed by a resourcemanager, In the JobHistory Can't see the job detail
[ https://issues.apache.org/jira/browse/YARN-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075135#comment-15075135 ] Junping Du commented on YARN-4506: -- 2.2 is too old. It is highly possible that we could already fix this issue in recent releases. Please check if latest release: 2.6.3 or 2.7.1 have the same issue. If not, let's resolve this JIRA as cannot reproduce. > Application was killed by a resourcemanager, In the JobHistory Can't see the > job detail > --- > > Key: YARN-4506 > URL: https://issues.apache.org/jira/browse/YARN-4506 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: tangshangwen > Attachments: am.rar > > > 2015-12-15 03:08:54,073 INFO [Thread-1] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: MRAppMaster received a > signal. Signaling RMCommunicator and JobHistoryEventHandler. > 2015-12-15 03:08:54,073 INFO [Thread-1] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: RMCommunicator > notified that iSignalled is: true > 2015-12-15 03:08:54,073 INFO [Thread-1] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify RMCommunicator > isAMLastRetry: true > 2015-12-15 03:08:54,073 INFO [Thread-1] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: RMCommunicator > notified that shouldUnregistered is: true > 2015-12-15 03:08:54,073 INFO [Thread-1] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify JHEH isAMLastRetry: > true > 2015-12-15 03:08:54,073 INFO [Thread-1] > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: > JobHistoryEventHandler notified that forceJobCompletion is true > 2015-12-15 03:08:54,074 INFO [Thread-1] > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopping > JobHistoryEventHandler. Size of the outstanding queue size is 0 > 2015-12-15 03:08:54,074 INFO [eventHandlingThread] > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: EventQueue > take interrupted. 
Returning > 2015-12-15 03:08:54,078 WARN [Thread-1] > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Found jobId > job_1449835724839_219910 to have not been closed. Will close -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4526) Make SystemClock singleton so AppSchedulingInfo could use it
[ https://issues.apache.org/jira/browse/YARN-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075133#comment-15075133 ] Sunil G commented on YARN-4526: --- Hi [~ka...@cloudera.com] YARN-4403 introduced MonotonicClock and it's available in util. So could we use that instead of SystemClock itself in ControlledClock? MAPREDUCE-6562 was trying to change MRApp to use MonotonicClock instead of SystemClock. > Make SystemClock singleton so AppSchedulingInfo could use it > > > Key: YARN-4526 > URL: https://issues.apache.org/jira/browse/YARN-4526 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-4526-1.patch > > > To track the time a request is received, we need to get the current system time. > For better testability of this, we are likely better off using a Clock > instance that uses SystemClock by default. Instead of creating umpteen > instances of SystemClock, we should just reuse the same instance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
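The singleton proposed in this issue is essentially the following pattern. This is a minimal sketch; the method names follow the discussion, not necessarily the committed patch:

```java
public class SystemClock {
    // One shared instance, created eagerly at class-load time.
    private static final SystemClock INSTANCE = new SystemClock();

    private SystemClock() {
        // Private constructor: callers cannot create extra instances.
    }

    public static SystemClock getInstance() {
        return INSTANCE;
    }

    public long getTime() {
        return System.currentTimeMillis();
    }
}
```

If ControlledClock were switched to wrap MonotonicClock as Sunil suggests, the same singleton shape would apply there as well.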
[jira] [Commented] (YARN-4529) Yarn CLI killing applications in batch
[ https://issues.apache.org/jira/browse/YARN-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075127#comment-15075127 ] Sunil G commented on YARN-4529: --- Possibly a duplicate of YARN-4371. [~linyiqun], could you please check that once? I have provided a patch there which kills applications in batch. > Yarn CLI killing applications in batch > -- > > Key: YARN-4529 > URL: https://issues.apache.org/jira/browse/YARN-4529 > Project: Hadoop YARN > Issue Type: Improvement > Components: applications, client >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: YARN-4529.001.patch > > > We do not have a good way to kill applications conveniently when some apps are > started unexpectedly. At present, we have to kill them one by one. We can add some > kill commands that can kill apps in batch, like these: > {code} > -killByAppStates The states of applications that will be killed. > -killByUser Kill running-state applications of a specific > user. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
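The proposed flags boil down to selecting application IDs by state or by user before issuing the kills. A rough sketch of that selection logic follows; AppInfo and its fields are hypothetical stand-ins for YARN's ApplicationReport, not the actual client code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class BatchKillSelection {
    static class AppInfo {
        final String id, user, state;
        AppInfo(String id, String user, String state) {
            this.id = id; this.user = user; this.state = state;
        }
    }

    // -killByAppStates: pick apps in any of the given states.
    static List<String> selectByStates(List<AppInfo> apps, Set<String> states) {
        List<String> ids = new ArrayList<>();
        for (AppInfo a : apps) {
            if (states.contains(a.state)) {
                ids.add(a.id);
            }
        }
        return ids;
    }

    // -killByUser: pick running-state apps of the given user.
    static List<String> selectByUser(List<AppInfo> apps, String user) {
        List<String> ids = new ArrayList<>();
        for (AppInfo a : apps) {
            if (a.user.equals(user) && a.state.equals("RUNNING")) {
                ids.add(a.id);
            }
        }
        return ids;
    }
}
```

In the real CLI, each selected ID would then be passed to the existing kill path, so the batch options reuse the one-by-one kill mechanism rather than replacing it.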
[jira] [Updated] (YARN-1382) Remove unusableRMNodesConcurrentSet (never used) in NodeListManager to get rid of memory leak
[ https://issues.apache.org/jira/browse/YARN-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-1382: - Hadoop Flags: Reviewed > Remove unusableRMNodesConcurrentSet (never used) in NodeListManager to get > rid of memory leak > - > > Key: YARN-1382 > URL: https://issues.apache.org/jira/browse/YARN-1382 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.2.0, 2.7.1, 2.6.2 >Reporter: Alejandro Abdelnur >Assignee: Rohith Sharma K S > Fix For: 2.8.0 > > Attachments: 0001-YARN-1382.patch, 0002-YARN-1382.patch, > 0003-YARN-1382.patch > > > If a node is in the unusable nodes set (unusableRMNodesConcurrentSet) and > never comes back, the node will be there forever. > While the leak is not big, it gets aggravated if the NM addresses are > configured with ephemeral ports as when the nodes come back they come back as > new. > Some related details in YARN-1343 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1382) Remove unusableRMNodesConcurrentSet (never used) in NodeListManager to get rid of memory leak
[ https://issues.apache.org/jira/browse/YARN-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-1382: - Summary: Remove unusableRMNodesConcurrentSet (never used) in NodeListManager to get rid of memory leak (was: NodeListManager has a memory leak, unusableRMNodesConcurrentSet is never purged) > Remove unusableRMNodesConcurrentSet (never used) in NodeListManager to get > rid of memory leak > - > > Key: YARN-1382 > URL: https://issues.apache.org/jira/browse/YARN-1382 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.2.0, 2.7.1, 2.6.2 >Reporter: Alejandro Abdelnur >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-1382.patch, 0002-YARN-1382.patch, > 0003-YARN-1382.patch > > > If a node is in the unusable nodes set (unusableRMNodesConcurrentSet) and > never comes back, the node will be there forever. > While the leak is not big, it gets aggravated if the NM addresses are > configured with ephemeral ports as when the nodes come back they come back as > new. > Some related details in YARN-1343 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4497) RM might fail to restart when recovering apps whose attempts are missing
[ https://issues.apache.org/jira/browse/YARN-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075118#comment-15075118 ] Jun Gong commented on YARN-4497: The patch deals with two cases: 1. The attempt is missing. When recovering an attempt, remove the attempt from app.attempts if we could not find the corresponding ApplicationAttemptStateData in the RMStateStore. If no ApplicationAttemptStateData is found, it means the corresponding AM was never launched (the AM is launched only after receiving the event *RMAppAttemptEventType.ATTEMPT_NEW_SAVED*, so we must not have received that event). 2. The attempt's final state is missing (the RM failed to store it). When recovering these attempts, we set their state to FAILED (or any other final state, or add a state UNKNOWN if needed), so the attempt can handle the event *RMAppAttemptEventType.RECOVER* properly. > RM might fail to restart when recovering apps whose attempts are missing > > > Key: YARN-4497 > URL: https://issues.apache.org/jira/browse/YARN-4497 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jun Gong >Assignee: Jun Gong > Attachments: YARN-4497.01.patch > > > Found the following problem while discussing YARN-3480. > If the RM fails to store some attempts in the RMStateStore, there will be missing > attempts in the RMStateStore. For the case of storing attempt1, attempt2 and > attempt3, the RM successfully stored attempt1 and attempt3 but failed to store > attempt2. When the RM restarts, in *RMAppImpl#recover* we recover attempts one > by one; for this case, we will recover attempt1, then attempt2. When > recovering attempt2, we call > *((RMAppAttemptImpl)this.currentAttempt).recover(state)*; it will first look for > its ApplicationAttemptStateData but will not find it, and an assertion error occurs > at *assert attemptState != null* (*RMAppAttemptImpl#recover*, line 880). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
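The two cases above can be sketched as follows. The names and the flat state map are illustrative only; the real patch works on RMStateStore's ApplicationAttemptStateData objects:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class AttemptRecoverySketch {
    // storedState maps attempt id -> stored final state; a null value means
    // the attempt data exists but its final state was never stored.
    static List<String> recover(List<Integer> attemptIds, Map<Integer, String> storedState) {
        List<String> recovered = new ArrayList<>();
        for (int id : attemptIds) {
            if (!storedState.containsKey(id)) {
                // Case 1: no ApplicationAttemptStateData at all -> the AM was
                // never launched; drop the attempt instead of hitting the assert.
                continue;
            }
            String finalState = storedState.get(id);
            if (finalState == null) {
                // Case 2: attempt data exists but the final state is missing
                // -> recover it as FAILED so RECOVER can be handled cleanly.
                finalState = "FAILED";
            }
            recovered.add("attempt" + id + "=" + finalState);
        }
        return recovered;
    }
}
```

For the JIRA's example (attempt2's data lost), this recovers attempt1 and attempt3 and silently drops attempt2, instead of failing the whole RM restart.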
[jira] [Updated] (YARN-4527) Possible thread leak if TimelineClient.start() get called multiple times.
[ https://issues.apache.org/jira/browse/YARN-4527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-4527: - Attachment: YARN-4527.patch Uploaded a quick patch to fix it. It is quite straightforward, so no unit test is needed. > Possible thread leak if TimelineClient.start() get called multiple times. > - > > Key: YARN-4527 > URL: https://issues.apache.org/jira/browse/YARN-4527 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.8.0 >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-4527.patch > > > Since YARN-4234, TimelineClient's start and stop create a TimelineWriter > according to the configuration. serviceStart() will create a new TimelineWriter > instance every time, which will then spawn several timer threads. If start() gets > called multiple times on one TimelineClient for some reason (an application bug, > or intentionally in some cases), the spawned timer threads will leak. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
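The direction of such a fix can be sketched with an idempotent start guard (a hypothetical, simplified illustration; the names and the actual patch for YARN-4527 may differ):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of an idempotent start guard: the writer (and the
// timer threads it would spawn) is created at most once, so repeated
// start() calls cannot leak threads. Names are illustrative only.
class IdempotentTimelineClient {
  private final AtomicBoolean started = new AtomicBoolean(false);
  private int writersCreated = 0;  // stands in for TimelineWriter instances

  void start() {
    // compareAndSet makes every call after the first a no-op
    if (!started.compareAndSet(false, true)) {
      return;
    }
    writersCreated++;  // real code would construct the TimelineWriter here
  }

  int writersCreated() {
    return writersCreated;
  }

  public static void main(String[] args) {
    IdempotentTimelineClient c = new IdempotentTimelineClient();
    c.start();
    c.start();  // would have leaked a second writer without the guard
    System.out.println(c.writersCreated()); // 1
  }
}
```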
[jira] [Commented] (YARN-4516) [YARN-3368] Use em-table to better render tables
[ https://issues.apache.org/jira/browse/YARN-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075107#comment-15075107 ] Sreenath Somarajapuram commented on YARN-4516: -- You can pass any Ember array as rows to the table. Please share a patch so I can get a better idea of the current implementation. Will add a sample to the dummy app soon. > [YARN-3368] Use em-table to better render tables > > > Key: YARN-4516 > URL: https://issues.apache.org/jira/browse/YARN-4516 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Li Lu > > Currently we're using DataTables, which isn't integrated with Ember.js very well. > Instead we can use em-table (see https://github.com/sreenaths/em-table/wiki, > which was created for the Tez UI). It supports features such as selectable > columns, pagination, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4497) RM might fail to restart when recovering apps whose attempts are missing
[ https://issues.apache.org/jira/browse/YARN-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-4497: --- Attachment: YARN-4497.01.patch > RM might fail to restart when recovering apps whose attempts are missing > > > Key: YARN-4497 > URL: https://issues.apache.org/jira/browse/YARN-4497 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jun Gong >Assignee: Jun Gong > Attachments: YARN-4497.01.patch > > > Found the following problem while discussing YARN-3480. > If RM fails to store some attempts in RMStateStore, there will be missing > attempts in RMStateStore. For the case of storing attempt1, attempt2 and > attempt3, RM successfully stored attempt1 and attempt3, but failed to store > attempt2. When RM restarts, in *RMAppImpl#recover*, we recover attempts one > by one; for this case, we will recover attempt1, then attempt2. When > recovering attempt2, we call > *((RMAppAttemptImpl)this.currentAttempt).recover(state)*, which will first look up > its ApplicationAttemptStateData; it will not find it, and an error occurs > at *assert attemptState != null* (*RMAppAttemptImpl#recover*, line 880). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4530) LocalizedResource trigger a NPE Cause the NodeManager exit
[ https://issues.apache.org/jira/browse/YARN-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075106#comment-15075106 ] tangshangwen commented on YARN-4530: This will happen when assoc is null and completed.get() throws an ExecutionException, right?
{code:title=ResourceLocalizationService.java|borderStyle=solid}
try {
  Future<Path> completed = queue.take();
  LocalizerResourceRequestEvent assoc = pending.remove(completed);
  try {
    Path local = completed.get();
    if (null == assoc) {
      LOG.error("Localized unkonwn resource to " + completed);
      // TODO delete
      return;
    }
    LocalResourceRequest key = assoc.getResource().getRequest();
    publicRsrc.handle(new ResourceLocalizedEvent(key, local,
        FileUtil.getDU(new File(local.toUri()))));
    assoc.getResource().unlock();
  } catch (ExecutionException e) {
    LOG.info("Failed to download rsrc " + assoc.getResource(),
        e.getCause());
    LocalResourceRequest req = assoc.getResource().getRequest();
    publicRsrc.handle(new ResourceFailedLocalizationEvent(req,
        e.getMessage()));
    assoc.getResource().unlock();
  } catch (CancellationException e) {
    // ignore; shutting down
  }
{code}
> LocalizedResource trigger a NPE Cause the NodeManager exit > -- > > Key: YARN-4530 > URL: https://issues.apache.org/jira/browse/YARN-4530 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: tangshangwen > > In our cluster, I found that a failed LocalizedResource download triggered an NPE, > causing the NodeManager to shut down. 
> {noformat} > 2015-12-29 17:18:33,706 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: > Resource > hdfs://ns3:8020/user/username/projects/user_insight/lookalike/oozie/workflow/conf/hive-site.xml > transitioned from DOWNLOADING to FAILED > 2015-12-29 17:18:33,708 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > Downloading public rsrc:{ > hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/user_insight_pig_udf-0.0.1-SNAPSHOT-jar-with-dependencies.jar, > 1451380519635, FILE, null } > 2015-12-29 17:18:33,710 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > Failed to download rsrc { { > hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar, > 1451380519452, FILE, null > },pending,[(container_1451039893865_261670_01_000578)],42332661980495938,DOWNLOADING} > java.io.IOException: Resource > hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar > changed on src filesystem (expected 1451380519452, was 1451380611793 > at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:276) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > 2015-12-29 17:18:33,710 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: > Resource > 
hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar > transitioned from DOWNLOADING to FAILED > 2015-12-29 17:18:33,710 FATAL > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > Error: Shutting down > java.lang.NullPointerException at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:712) > 2015-12-29 17:18:33,710 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: > Public cache exiting > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
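The crash path described in the comment above can be reproduced in miniature: completed.get() throws before the null check on assoc runs, and the ExecutionException handler dereferences assoc unconditionally. A guarded handler might look like the following (a hypothetical, simplified sketch with plain types, not the actual ResourceLocalizationService code):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical, simplified reproduction of the NPE path. The null check on
// assoc inside the catch block is the kind of guard the report suggests;
// plain String/Object types stand in for the real localizer classes.
class LocalizerNpeSketch {
  static String handle(Future<String> completed, Object assoc) {
    try {
      String local = completed.get();
      if (assoc == null) {
        return "unknown-resource";  // original code logs and returns here
      }
      return "localized " + local;
    } catch (ExecutionException e) {
      if (assoc == null) {
        // without this guard, assoc.getResource() would throw the NPE
        return "failed-unknown";
      }
      return "failed: " + e.getCause().getMessage();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return "interrupted";
    }
  }

  public static void main(String[] args) {
    CompletableFuture<String> failed = new CompletableFuture<>();
    failed.completeExceptionally(
        new java.io.IOException("changed on src filesystem"));
    // assoc == null plus a failed download: the combination from the log
    System.out.println(handle(failed, null)); // failed-unknown
  }
}
```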
[jira] [Created] (YARN-4530) LocalizedResource trigger a NPE Cause the NodeManager exit
tangshangwen created YARN-4530: -- Summary: LocalizedResource trigger a NPE Cause the NodeManager exit Key: YARN-4530 URL: https://issues.apache.org/jira/browse/YARN-4530 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.2.0 Reporter: tangshangwen In our cluster, I found that LocalizedResource download failed trigger a NPE Cause the NodeManager shutdown. {noformat} 2015-12-29 17:18:33,706 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://ns3:8020/user/username/projects/user_insight/lookalike/oozie/workflow/conf/hive-site.xml transitioned from DOWNLOADING to FAILED 2015-12-29 17:18:33,708 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/user_insight_pig_udf-0.0.1-SNAPSHOT-jar-with-dependencies.jar, 1451380519635, FILE, null } 2015-12-29 17:18:33,710 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Failed to download rsrc { { hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar, 1451380519452, FILE, null },pending,[(container_1451039893865_261670_01_000578)],42332661980495938,DOWNLOADING} java.io.IOException: Resource hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar changed on src filesystem (expected 1451380519452, was 1451380611793 at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:276) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) 2015-12-29 17:18:33,710 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource hdfs://ns3/user/username/projects/user_insight/lookalike/oozie/workflow/lib/unilever_support_udf-0.0.1-SNAPSHOT.jar transitioned from DOWNLOADING to FAILED 2015-12-29 17:18:33,710 FATAL org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Error: Shutting down java.lang.NullPointerException at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:712) 2015-12-29 17:18:33,710 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4529) Yarn CLI killing applications in batch
[ https://issues.apache.org/jira/browse/YARN-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075088#comment-15075088 ] Naganarasimha G R commented on YARN-4529: - Hi [~linyiqun], Seems like a useful feature, but a few thoughts/queries: * This is done as part of the application CLI, but a given user might not have rights to all apps, so would it be better to keep this as part of the admin CLI? * Is it good to fetch all the app reports to the client and then kill them there, or would it be better to do it on the server side, so that REST can also benefit from it? * Would it be good to supply a queue name and kill all the applications of that queue and its children? > Yarn CLI killing applications in batch > -- > > Key: YARN-4529 > URL: https://issues.apache.org/jira/browse/YARN-4529 > Project: Hadoop YARN > Issue Type: Improvement > Components: applications, client >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: YARN-4529.001.patch > > > We do not have a good way to kill applications conveniently when some > apps are started unexpectedly. At present, we have to kill them one by one. We can add > kill commands that can kill apps in batch, like these: > {code} > -killByAppStates The states of applications that will be killed. > -killByUser Kill running-state applications of a specific > user. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4529) Yarn CLI killing applications in batch
[ https://issues.apache.org/jira/browse/YARN-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Yiqun updated YARN-4529: Attachment: YARN-4529.001.patch > Yarn CLI killing applications in batch > -- > > Key: YARN-4529 > URL: https://issues.apache.org/jira/browse/YARN-4529 > Project: Hadoop YARN > Issue Type: Improvement > Components: applications, client >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: YARN-4529.001.patch > > > We do not have a good way to kill applications conveniently when some > apps are started unexpectedly. At present, we have to kill them one by one. We can add > kill commands that can kill apps in batch, like these: > {code} > -killByAppStates The states of applications that will be killed. > -killByUser Kill running-state applications of a specific > user. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4529) Yarn CLI killing applications in batch
Lin Yiqun created YARN-4529: --- Summary: Yarn CLI killing applications in batch Key: YARN-4529 URL: https://issues.apache.org/jira/browse/YARN-4529 Project: Hadoop YARN Issue Type: Improvement Components: applications, client Affects Versions: 2.7.1 Reporter: Lin Yiqun Assignee: Lin Yiqun We do not have a good way to kill applications conveniently when some apps are started unexpectedly. At present, we have to kill them one by one. We can add kill commands that can kill apps in batch, like these: {code} -killByAppStates The states of applications that will be killed. -killByUser Kill running-state applications of a specific user. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
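The filtering the proposed flags imply can be sketched as follows (hypothetical, simplified types; a real patch would work against ApplicationReport and YarnApplicationState through the YARN client API):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of the proposed batch-kill filtering: select the
// applications a -killByUser flag would target. A plain record stands in
// for ApplicationReport.
class BatchKillSketch {
  static final class App {
    final String id, user, state;
    App(String id, String user, String state) {
      this.id = id; this.user = user; this.state = state;
    }
  }

  // -killByUser: this user's RUNNING applications
  static List<String> killByUser(List<App> apps, String user) {
    return apps.stream()
        .filter(a -> a.user.equals(user) && a.state.equals("RUNNING"))
        .map(a -> a.id)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<App> apps = Arrays.asList(
        new App("app_1", "alice", "RUNNING"),
        new App("app_2", "bob", "RUNNING"),
        new App("app_3", "alice", "FINISHED"));
    System.out.println(killByUser(apps, "alice")); // [app_1]
  }
}
```

A -killByAppStates filter would be the same shape with a predicate on the state field instead of the user.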
[jira] [Updated] (YARN-4528) decreaseContainer Message maybe lost if NM restart
[ https://issues.apache.org/jira/browse/YARN-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sandflee updated YARN-4528: --- Summary: decreaseContainer Message maybe lost if NM restart (was: decreaseConainer Message maybe lost if NM restart) > decreaseContainer Message maybe lost if NM restart > -- > > Key: YARN-4528 > URL: https://issues.apache.org/jira/browse/YARN-4528 > Project: Hadoop YARN > Issue Type: Bug >Reporter: sandflee > > We may pend the container decrease msg until the next heartbeat, or check the > resource against rmContainer when the node registers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4528) decreaseConainer Message maybe lost if NM restart
sandflee created YARN-4528: -- Summary: decreaseConainer Message maybe lost if NM restart Key: YARN-4528 URL: https://issues.apache.org/jira/browse/YARN-4528 Project: Hadoop YARN Issue Type: Bug Reporter: sandflee We may pend the container decrease msg until the next heartbeat, or check the resource against rmContainer when the node registers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4527) Possible thread leak if TimelineClient.start() get called multiple times.
Junping Du created YARN-4527: Summary: Possible thread leak if TimelineClient.start() get called multiple times. Key: YARN-4527 URL: https://issues.apache.org/jira/browse/YARN-4527 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.8.0 Reporter: Junping Du Assignee: Junping Du Since YARN-4234, TimelineClient's start and stop create a TimelineWriter according to the configuration. serviceStart() will create a new TimelineWriter instance every time, which will then spawn several timer threads. If start() gets called multiple times on one TimelineClient for some reason (an application bug, or intentionally in some cases), the spawned timer threads will leak. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications
[ https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075069#comment-15075069 ] Hadoop QA commented on YARN-4479: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 16s {color} | {color:red} Patch generated 7 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager (total was 274, now 275). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 18s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager introduced 1 new FindBugs issues. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 50s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 34s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 145m 43s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | | org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.PriorityComparator implements Comparator but not Serializable At PriorityComparator.java:Serializable At PriorityComparator.java:[lines 26-34] | | JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.
[jira] [Commented] (YARN-4495) add a way to tell AM container increase/decrease request is invalid
[ https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075043#comment-15075043 ] sandflee commented on YARN-4495: Hi [~leftnoteasy], I did a simple test throwing an exception with the applicationId in ApplicationMasterService#allocate, and could not get the applicationId in the RPC client. Reading the code, I found that the exception is put in the RPC header with only the exception class name and stack trace info.
{code:title=RpcHeader.proto}
message RpcResponseHeaderProto {
  required uint32 callId = 1; // callId used in Request
  required RpcStatusProto status = 2;
  optional uint32 serverIpcVersionNum = 3; // Sent if success or fail
  optional string exceptionClassName = 4; // if request fails
  optional string errorMsg = 5; // if request fails, often contains strack trace
  optional RpcErrorCodeProto errorDetail = 6; // in case of error
  optional bytes clientId = 7; // Globally unique client ID
  optional sint32 retryCount = 8 [default = -1];
}
{code}
> add a way to tell AM container increase/decrease request is invalid > --- > > Key: YARN-4495 > URL: https://issues.apache.org/jira/browse/YARN-4495 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: sandflee > Attachments: YARN-4495.01.patch > > > Currently the RM may pass an InvalidResourceRequestException to the AM or just ignore the > change request; the former will bring AMRMClientAsync down, and the latter > will leave the AM waiting for the reply. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
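The constraint the test above exposes can be illustrated in miniature: only the exception class name and errorMsg survive the RPC response header, so any context the AM needs, such as the applicationId, would have to be encoded into the message itself and parsed back out on the client side. A hypothetical sketch of that workaround (illustrative names, not the actual YARN-4495 patch):

```java
// Hypothetical illustration: encode the applicationId into the exception
// message on the server side, and recover it from errorMsg on the client.
class RpcErrorSketch {
  // server side: embed the id in the message before throwing
  static RuntimeException invalidChangeRequest(String appId, String reason) {
    return new IllegalArgumentException("appId=" + appId + ": " + reason);
  }

  // client side: recover the id from the errorMsg string
  static String appIdFromErrorMsg(String errorMsg) {
    if (errorMsg == null || !errorMsg.startsWith("appId=")) {
      return null;
    }
    int colon = errorMsg.indexOf(':');
    return colon > 0 ? errorMsg.substring("appId=".length(), colon) : null;
  }

  public static void main(String[] args) {
    RuntimeException e =
        invalidChangeRequest("application_1451039893865_261670", "target > max");
    System.out.println(appIdFromErrorMsg(e.getMessage()));
    // application_1451039893865_261670
  }
}
```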
[jira] [Commented] (YARN-1382) NodeListManager has a memory leak, unusableRMNodesConcurrentSet is never purged
[ https://issues.apache.org/jira/browse/YARN-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075017#comment-15075017 ] Junping Du commented on YARN-1382: -- +1. Latest patch LGTM. Committing it in. > NodeListManager has a memory leak, unusableRMNodesConcurrentSet is never > purged > --- > > Key: YARN-1382 > URL: https://issues.apache.org/jira/browse/YARN-1382 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.2.0, 2.7.1, 2.6.2 >Reporter: Alejandro Abdelnur >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-1382.patch, 0002-YARN-1382.patch, > 0003-YARN-1382.patch > > > If a node is in the unusable nodes set (unusableRMNodesConcurrentSet) and > never comes back, the node will be there forever. > While the leak is not big, it gets aggravated if the NM addresses are > configured with ephemeral ports as when the nodes come back they come back as > new. > Some related details in YARN-1343 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
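The leak described above can be sketched as follows (hypothetical method names; the actual NodeListManager patch may differ): the unusable set must also be purged when a node is removed for good, not only when the same node id rejoins, which never happens when NM addresses use ephemeral ports.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the leak and its fix: purge the unusable set on
// permanent node removal, not only on rejoin of the same node id.
class UnusableNodesSketch {
  private final Set<String> unusable = ConcurrentHashMap.newKeySet();

  void markUnusable(String nodeId) { unusable.add(nodeId); }
  void nodeRejoined(String nodeId) { unusable.remove(nodeId); }
  // the fix: also purge on permanent removal/expiry of the node
  void nodeRemoved(String nodeId)  { unusable.remove(nodeId); }

  int unusableCount() { return unusable.size(); }

  public static void main(String[] args) {
    UnusableNodesSketch m = new UnusableNodesSketch();
    m.markUnusable("host1:1234");
    // With ephemeral ports the node comes back as e.g. host1:5678, so the
    // old entry would linger forever unless removal purges it:
    m.nodeRemoved("host1:1234");
    System.out.println(m.unusableCount()); // 0
  }
}
```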
[jira] [Updated] (YARN-4232) TopCLI console support for HA mode
[ https://issues.apache.org/jira/browse/YARN-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4232: --- Issue Type: Improvement (was: Bug) Summary: TopCLI console support for HA mode (was: TopCLI console shows exceptions for help command) > TopCLI console support for HA mode > -- > > Key: YARN-4232 > URL: https://issues.apache.org/jira/browse/YARN-4232 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Minor > Attachments: 0001-YARN-4232.patch > > > *Steps to reproduce* > Start Top command in YARN in HA mode > ./yarn top > {noformat} > usage: yarn top > -cols Number of columns on the terminal > -delay The refresh delay(in seconds), default is 3 seconds > -help Print usage; for help while the tool is running press 'h' > + Enter > -queuesComma separated list of queues to restrict applications > -rows Number of rows on the terminal > -types Comma separated list of types to restrict applications, > case sensitive(though the display is lower case) > -users Comma separated list of users to restrict applications > {noformat} > Execute *for help while the tool is running press 'h' + Enter* while top > tool is running > Exception is thrown in console continuously > {noformat} > 15/10/07 14:59:28 ERROR cli.TopCLI: Could not fetch RM start time > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345) > at > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204) > at > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > at java.net.Socket.connect(Socket.java:589) > at java.net.Socket.connect(Socket.java:538) > at sun.net.NetworkClient.doConnect(NetworkClient.java:180) > at 
sun.net.www.http.HttpClient.openServer(HttpClient.java:432) > at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) > at sun.net.www.http.HttpClient.(HttpClient.java:211) > at sun.net.www.http.HttpClient.New(HttpClient.java:308) > at sun.net.www.http.HttpClient.New(HttpClient.java:326) > at > sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1168) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1104) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:998) > at > sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:932) > at > org.apache.hadoop.yarn.client.cli.TopCLI.getRMStartTime(TopCLI.java:742) > at org.apache.hadoop.yarn.client.cli.TopCLI.run(TopCLI.java:467) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at org.apache.hadoop.yarn.client.cli.TopCLI.main(TopCLI.java:420) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4232) TopCLI console shows exceptions for help command
[ https://issues.apache.org/jira/browse/YARN-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075005#comment-15075005 ] Bibin A Chundatt commented on YARN-4232: [~vvasudev] Will add support for HA mode. I will update the description and change the issue type to improvement. > TopCLI console shows exceptions for help command > - > > Key: YARN-4232 > URL: https://issues.apache.org/jira/browse/YARN-4232 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Minor > Attachments: 0001-YARN-4232.patch > > > *Steps to reproduce* > Start Top command in YARN in HA mode > ./yarn top > {noformat} > usage: yarn top > -cols Number of columns on the terminal > -delay The refresh delay(in seconds), default is 3 seconds > -help Print usage; for help while the tool is running press 'h' > + Enter > -queues Comma separated list of queues to restrict applications > -rows Number of rows on the terminal > -types Comma separated list of types to restrict applications, > case sensitive(though the display is lower case) > -users Comma separated list of users to restrict applications > {noformat} > Execute *for help while the tool is running press 'h' + Enter* while top > tool is running > Exception is thrown in console continuously > {noformat} > 15/10/07 14:59:28 ERROR cli.TopCLI: Could not fetch RM start time > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345) > at > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204) > at > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > at java.net.Socket.connect(Socket.java:589) > at java.net.Socket.connect(Socket.java:538) > at sun.net.NetworkClient.doConnect(NetworkClient.java:180) > at 
sun.net.www.http.HttpClient.openServer(HttpClient.java:432) > at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) > at sun.net.www.http.HttpClient.(HttpClient.java:211) > at sun.net.www.http.HttpClient.New(HttpClient.java:308) > at sun.net.www.http.HttpClient.New(HttpClient.java:326) > at > sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1168) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1104) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:998) > at > sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:932) > at > org.apache.hadoop.yarn.client.cli.TopCLI.getRMStartTime(TopCLI.java:742) > at org.apache.hadoop.yarn.client.cli.TopCLI.run(TopCLI.java:467) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at org.apache.hadoop.yarn.client.cli.TopCLI.main(TopCLI.java:420) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications
[ https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4479: Attachment: 0003-YARN-4479.patch Attaching the updated patch, kindly review. > Retrospect app-priority in pendingOrderingPolicy during recovering > applications > --- > > Key: YARN-4479 > URL: https://issues.apache.org/jira/browse/YARN-4479 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch, > 0003-YARN-4479.patch > > > Currently, the same ordering policy is used for pending applications and active > applications. When priority is configured for applications, high-priority > applications get activated first during recovery. It is possible that a low-priority > job was submitted earlier and was in the running state. > This causes the low-priority job to starve after recovery -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4510) SLS startup failure and webpage broken
[ https://issues.apache.org/jira/browse/YARN-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074963#comment-15074963 ] Bibin A Chundatt commented on YARN-4510: Attached patch after fixing the checkstyle issues > SLS startup failure and webpage broken > -- > > Key: YARN-4510 > URL: https://issues.apache.org/jira/browse/YARN-4510 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: 0001-YARN-4510.patch, 0002-YARN-4510.patch > > > Configure Fair scheduler in yarn site > Start SLS check cluster apps page > {noformat} > 15/12/27 07:40:22 INFO scheduler.SchedulerNode: Assigned container > container_1451182067412_0002_01_000258 of capacity on > host a2117.smile.com:2, which has 10 containers, > used and available after allocation > 15/12/27 07:40:22 ERROR webapp.Dispatcher: error handling URI: /cluster/apps > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getApplicationAttempt(AbstractYarnScheduler.java:299) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppAttemptBlock.getBlacklistedNodes(RMAppAttemptBlock.java:260) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppsBlock.renderData(RMAppsBlock.java:99) > at > org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:140) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:235) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock$Block.subView(HtmlBlock.java:43) > at org.apache.hadoop.yarn.webapp.hamlet.Hamlet._(Hamlet.java:30347) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.AppsBlockWithMetrics.render(AppsBlockWithMetrics.java:30) > at > org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) > at > 
org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) > at org.apache.hadoop.yarn.webapp.View.render(View.java:235) > at > org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) > at > org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117) > at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845) > at > org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71) > at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82) > at org.apache.hadoop.yarn.webapp.Dispatcher.render(Dispatcher.java:197) > at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:156) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) > at > com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) > at > com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) > at > com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) > {noformat} > Configure Capacity scheduler and SLS start up > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1038) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1319) > at > org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:252) > at > org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:82) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:691) > at java.lang.Thread.run(Thread.java:745) > {noformat} > SLS failed to start -- This message was sent by Atlassian JIRA (v6.3.4#6332)
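The first trace above shows the web UI NPE-ing inside RMAppAttemptBlock.getBlacklistedNodes after AbstractYarnScheduler.getApplicationAttempt returned for an attempt the scheduler does not track. A minimal sketch of the defensive pattern such a trace points at, assuming the problem is an unguarded null lookup (class and field names here are illustrative stand-ins, not the actual YARN types):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a web block should tolerate attempts the scheduler
// no longer (or never) tracked instead of dereferencing a null lookup.
class BlacklistLookup {
    // Stand-in for the scheduler's attempt table; illustrative only.
    static final Map<String, Set<String>> attempts = new HashMap<>();

    // Returns the blacklisted nodes for an attempt, or an empty set when the
    // scheduler has no record of it -- the missing guard behind the NPE.
    static Set<String> getBlacklistedNodes(String attemptId) {
        Set<String> nodes = attempts.get(attemptId); // may be null
        return nodes == null ? Collections.<String>emptySet() : nodes;
    }
}
```

With this shape the apps page renders an empty blacklist for unknown attempts instead of taking down the whole /cluster/apps view.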
[jira] [Commented] (YARN-4510) SLS startup failure and webpage broken
[ https://issues.apache.org/jira/browse/YARN-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074928#comment-15074928 ] Hadoop QA commented on YARN-4510: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 9s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | 
{color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 11s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 48s {color} | {color:green} hadoop-sls in the patch passed with JDK v1.8.0_66. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s {color} | {color:green} hadoop-sls in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 15m 1s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12779947/0002-YARN-4510.patch | | JIRA Issue | YARN-4510 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux a0935db85b05 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh
[jira] [Updated] (YARN-4510) SLS startup failure and webpage broken
[ https://issues.apache.org/jira/browse/YARN-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4510: --- Attachment: 0002-YARN-4510.patch
[jira] [Commented] (YARN-4510) SLS startup failure and webpage broken
[ https://issues.apache.org/jira/browse/YARN-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074894#comment-15074894 ] Hadoop QA commented on YARN-4510: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 45s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 9s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | 
{color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 10s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 9s {color} | {color:red} Patch generated 2 new checkstyle issues in hadoop-tools/hadoop-sls (total was 41, now 43). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 48s {color} | {color:green} hadoop-sls in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s {color} | {color:green} hadoop-sls in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 28m 43s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12779938/0001-YARN-4510.patch | | JIRA Issue | YARN-4510 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux bcc86c2b60a3 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personali
[jira] [Commented] (YARN-1382) NodeListManager has a memory leak, unusableRMNodesConcurrentSet is never purged
[ https://issues.apache.org/jira/browse/YARN-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074892#comment-15074892 ] Hadoop QA commented on YARN-1382: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 39s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | 
{color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 10s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 29s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 137m 5s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | JDK v1.7.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.TestClientRMTokens | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12779913/0003-YARN-1382.pa
[jira] [Commented] (YARN-4232) TopCLI console shows exceptions for help command
[ https://issues.apache.org/jira/browse/YARN-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074891#comment-15074891 ] Varun Vasudev commented on YARN-4232: - Adding support for HA is probably the right thing to do. Good catch [~djp] > TopCLI console shows exceptions for help command > - > > Key: YARN-4232 > URL: https://issues.apache.org/jira/browse/YARN-4232 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Minor > Attachments: 0001-YARN-4232.patch > > > *Steps to reproduce* > Start Top command in YARN in HA mode > ./yarn top > {noformat} > usage: yarn top > -cols Number of columns on the terminal > -delay The refresh delay(in seconds), default is 3 seconds > -help Print usage; for help while the tool is running press 'h' > + Enter > -queuesComma separated list of queues to restrict applications > -rows Number of rows on the terminal > -types Comma separated list of types to restrict applications, > case sensitive(though the display is lower case) > -users Comma separated list of users to restrict applications > {noformat} > Execute *for help while the tool is running press 'h' + Enter* while top > tool is running > Exception is thrown in console continuously > {noformat} > 15/10/07 14:59:28 ERROR cli.TopCLI: Could not fetch RM start time > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345) > at > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204) > at > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > at java.net.Socket.connect(Socket.java:589) > at java.net.Socket.connect(Socket.java:538) > at sun.net.NetworkClient.doConnect(NetworkClient.java:180) > at 
sun.net.www.http.HttpClient.openServer(HttpClient.java:432) > at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) > at sun.net.www.http.HttpClient.<init>(HttpClient.java:211) > at sun.net.www.http.HttpClient.New(HttpClient.java:308) > at sun.net.www.http.HttpClient.New(HttpClient.java:326) > at > sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1168) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1104) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:998) > at > sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:932) > at > org.apache.hadoop.yarn.client.cli.TopCLI.getRMStartTime(TopCLI.java:742) > at org.apache.hadoop.yarn.client.cli.TopCLI.run(TopCLI.java:467) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at org.apache.hadoop.yarn.client.cli.TopCLI.main(TopCLI.java:420) > {noformat}
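The HA support suggested for TopCLI above amounts to trying each configured ResourceManager in turn rather than hard-coding one web address. A hedged sketch of that retry shape, assuming the address list would come from the HA rm-id configuration and the fetcher is a stand-in for the HTTP call (both are illustrative, not the actual patch):

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Illustrative sketch of an HA-aware getRMStartTime: try each candidate RM
// web address and return the first answer, instead of failing loudly when
// the single configured RM happens to be the standby.
class HaStartTimeFetcher {
    static Optional<Long> getRMStartTime(List<String> rmAddresses,
                                         Function<String, Long> fetch) {
        for (String addr : rmAddresses) {
            try {
                return Optional.of(fetch.apply(addr)); // first reachable RM wins
            } catch (RuntimeException connectionRefused) {
                // Standby or down RM: fall through to the next candidate.
                // (The real code would catch the connect IOException.)
            }
        }
        return Optional.empty(); // no RM reachable; caller can degrade gracefully
    }
}
```

Returning an empty Optional lets the tool show "N/A" for uptime instead of spamming the console with stack traces on every refresh.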
[jira] [Updated] (YARN-4510) SLS startup failure and webpage broken
[ https://issues.apache.org/jira/browse/YARN-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4510: --- Attachment: 0001-YARN-4510.patch Attaching patch fixing the SLS startup failure and the application page. The startup issue looks related to YARN-1651.
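The CapacityScheduler NPE reported in this issue, at nodeUpdate, is consistent with a heartbeat arriving for a node the scheduler has no record of (as can happen during SLS startup). A sketch of the defensive guard such a crash suggests, under that assumption; the class, map, and method here are hypothetical stand-ins, not the YARN code:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: skip heartbeats from unknown nodes instead of letting
// a null lookup kill the scheduler's event-dispatcher thread.
class NodeUpdateGuard {
    // Stand-in for the scheduler's registered-node table.
    static final Map<String, Object> nodes = new HashMap<>();

    // Returns true when the heartbeat was applied, false when the node is
    // unknown -- the log-and-skip alternative to the NPE above.
    static boolean nodeUpdate(String nodeId) {
        Object node = nodes.get(nodeId); // may be null for unregistered nodes
        if (node == null) {
            return false; // would log a warning in the real scheduler
        }
        // ... process the heartbeat against 'node' ...
        return true;
    }
}
```

Because the event processor runs on a single dispatcher thread, tolerating the unknown node keeps one stray heartbeat from bringing down the whole ResourceManager.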
[jira] [Commented] (YARN-3933) Race condition when calling AbstractYarnScheduler.completedContainer.
[ https://issues.apache.org/jira/browse/YARN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074819#comment-15074819 ] Hadoop QA commented on YARN-3933: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 52s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 46s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 17s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 144m 34s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | JDK v1.7.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.TestClientRMTokens | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12779909/YARN-3933.002.patch | | JIRA Issue | YARN-3933 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs
[jira] [Updated] (YARN-4510) SLS startup failure and webpage broken
[ https://issues.apache.org/jira/browse/YARN-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4510:
---
Priority: Critical  (was: Major)
Description:
Configure the Fair scheduler in yarn-site, start SLS, and check the cluster apps page:
{noformat}
15/12/27 07:40:22 INFO scheduler.SchedulerNode: Assigned container container_1451182067412_0002_01_000258 of capacity on host a2117.smile.com:2, which has 10 containers, used and available after allocation
15/12/27 07:40:22 ERROR webapp.Dispatcher: error handling URI: /cluster/apps
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getApplicationAttempt(AbstractYarnScheduler.java:299)
    at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppAttemptBlock.getBlacklistedNodes(RMAppAttemptBlock.java:260)
    at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppsBlock.renderData(RMAppsBlock.java:99)
    at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:140)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
    at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock$Block.subView(HtmlBlock.java:43)
    at org.apache.hadoop.yarn.webapp.hamlet.Hamlet._(Hamlet.java:30347)
    at org.apache.hadoop.yarn.server.resourcemanager.webapp.AppsBlockWithMetrics.render(AppsBlockWithMetrics.java:30)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
    at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
    at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
    at org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
    at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845)
    at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
    at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
    at org.apache.hadoop.yarn.webapp.Dispatcher.render(Dispatcher.java:197)
    at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:156)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
    at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
    at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
{noformat}
Configure the Capacity scheduler and start SLS:
{noformat}
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1038)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1319)
    at org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:252)
    at org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:82)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:691)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
SLS failed to start.

was:
Configure, start SLS, and check the cluster apps page:
{noformat}
15/12/27 07:40:22 INFO scheduler.SchedulerNode: Assigned container container_1451182067412_0002_01_000258 of capacity on host a2117.smile.com:2, which has 10 containers, used and available after allocation
15/12/27 07:40:22 ERROR webapp.Dispatcher: error handling URI: /cluster/apps
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getApplicationAttempt(AbstractYarnScheduler.java:299)
    at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppAttemptBlock.getBlacklistedNodes(RMAppAttemptBlock.java:260)
    at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppsBlock.renderData(RMAppsBlock.java:99)
    at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:140)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
    at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock$Block.subView(HtmlBlock.java:43)
    at org.apache.hadoop.yarn.webapp.hamlet.Hamlet._(Hamlet.java:30347)
    at org.apache.hadoop.yarn.server.resourcemanager.webapp.AppsBlockWithMetrics.render(AppsBlockWithMetrics.java:30)
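Both traces in this issue boil down to the same pattern: a lookup for state the scheduler no longer (or never) tracked returns null, and the caller dereferences it. The snippet below is a minimal, self-contained sketch of the web-app case and a defensive guard; the class and method names are simplified stand-ins, not the actual YARN APIs.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the failure mode: the web page asks the scheduler for an
// attempt it does not track, gets null back, and dereferencing the null
// throws NullPointerException. Simplified stand-in classes only.
class SchedulerSketch {
    private final Map<String, String> attempts = new HashMap<>();

    void addAttempt(String attemptId, String blacklistedNode) {
        attempts.put(attemptId, blacklistedNode);
    }

    String getApplicationAttempt(String attemptId) {
        return attempts.get(attemptId); // may be null for unknown/removed apps
    }

    // Unguarded rendering path: throws NPE when the attempt is missing.
    static int blacklistSizeUnsafe(SchedulerSketch s, String attemptId) {
        return s.getApplicationAttempt(attemptId).length(); // NPE if null
    }

    // Guarded variant: render an empty result instead of crashing the page.
    static int blacklistSizeSafe(SchedulerSketch s, String attemptId) {
        String attempt = s.getApplicationAttempt(attemptId);
        return attempt == null ? 0 : attempt.length();
    }
}
```

In the guarded variant the page would render a placeholder for the missing attempt rather than let the NPE propagate up through the Dispatcher, which is the symptom shown in the trace above.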
[jira] [Updated] (YARN-4510) SLS clusterapps and scheduler page broken
[ https://issues.apache.org/jira/browse/YARN-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4510:
---
Description:
Configure, start SLS, and check the cluster apps page:
{noformat}
15/12/27 07:40:22 INFO scheduler.SchedulerNode: Assigned container container_1451182067412_0002_01_000258 of capacity on host a2117.smile.com:2, which has 10 containers, used and available after allocation
15/12/27 07:40:22 ERROR webapp.Dispatcher: error handling URI: /cluster/apps
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getApplicationAttempt(AbstractYarnScheduler.java:299)
    at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppAttemptBlock.getBlacklistedNodes(RMAppAttemptBlock.java:260)
    at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppsBlock.renderData(RMAppsBlock.java:99)
    at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:140)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
    at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock$Block.subView(HtmlBlock.java:43)
    at org.apache.hadoop.yarn.webapp.hamlet.Hamlet._(Hamlet.java:30347)
    at org.apache.hadoop.yarn.server.resourcemanager.webapp.AppsBlockWithMetrics.render(AppsBlockWithMetrics.java:30)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
    at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
    at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
    at org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
    at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845)
    at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
    at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
    at org.apache.hadoop.yarn.webapp.Dispatcher.render(Dispatcher.java:197)
    at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:156)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
    at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
    at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
{noformat}

was:
Start SLS and check the cluster apps page:
{noformat}
15/12/27 07:40:22 INFO scheduler.SchedulerNode: Assigned container container_1451182067412_0002_01_000258 of capacity on host a2117.smile.com:2, which has 10 containers, used and available after allocation
15/12/27 07:40:22 ERROR webapp.Dispatcher: error handling URI: /cluster/apps
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getApplicationAttempt(AbstractYarnScheduler.java:299)
    at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppAttemptBlock.getBlacklistedNodes(RMAppAttemptBlock.java:260)
    at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMAppsBlock.renderData(RMAppsBlock.java:99)
    at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:140)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
    at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock$Block.subView(HtmlBlock.java:43)
    at org.apache.hadoop.yarn.webapp.hamlet.Hamlet._(Hamlet.java:30347)
    at org.apache.hadoop.yarn.server.resourcemanager.webapp.AppsBlockWithMetrics.render(AppsBlockWithMetrics.java:30)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
    at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
    at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
    at org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
    at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845)
    at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
    at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
    at org.apache.hadoop.yarn.webapp.Dispatcher.render(Dispatcher.java:197)
    at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:156)
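The CapacityScheduler.nodeUpdate NPE reported earlier in this issue has the same shape on the node side: an update event arrives for a node the scheduler has no record of (possible when SLS wires up its own node registrations), and the missing map entry is dereferenced. A minimal, self-contained sketch with a guard; all names here are simplified stand-ins, not the real CapacityScheduler or SLS classes.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a node-update handler that looks up per-node state in a map.
// If the node was never registered, the lookup returns null and the
// unguarded path throws NullPointerException on unboxing.
class NodeUpdateSketch {
    private final Map<String, Integer> nodeContainers = new HashMap<>();

    void registerNode(String nodeId, int containers) {
        nodeContainers.put(nodeId, containers);
    }

    // Unguarded handler: NPEs on an unknown node (null Integer unboxed).
    int handleUpdateUnsafe(String nodeId) {
        return nodeContainers.get(nodeId) + 1;
    }

    // Guarded handler: ignore (and in real code, log) unknown nodes.
    int handleUpdateSafe(String nodeId) {
        Integer count = nodeContainers.get(nodeId);
        if (count == null) {
            return -1; // caller treats -1 as "update ignored"
        }
        return count + 1;
    }
}
```

The guarded variant keeps the event-dispatcher thread alive instead of letting the NPE kill SLS startup, which is the symptom the issue describes.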
[jira] [Commented] (YARN-4526) Make SystemClock singleton so AppSchedulingInfo could use it
[ https://issues.apache.org/jira/browse/YARN-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074790#comment-15074790 ] Hadoop QA commented on YARN-4526:
-
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 25 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 22s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 54s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 7s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 9s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 51s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 14s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 10s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 1s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 1s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 7s {color} | {color:red} Patch generated 2 new checkstyle issues in root (total was 578, now 580). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 8s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 4s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 13s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 56s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 45s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 20s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 1s {color} | {color:red} hadoop-mapreduce-client-app in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 38s {color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m 42s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 20s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_91. {color} |
[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications
[ https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074779#comment-15074779 ] Rohith Sharma K S commented on YARN-4479:
-
I think we can just rely on the isAppRecovering flag, which should be sufficient. The existing code in RMAppAttemptImpl can stay as-is (without the patch). Only FAILED attempts are added to the scheduler, and they are removed in the very next event.
> Retrospect app-priority in pendingOrderingPolicy during recovering
> applications
> ---
>
> Key: YARN-4479
> URL: https://issues.apache.org/jira/browse/YARN-4479
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: api, resourcemanager
> Reporter: Rohith Sharma K S
> Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4479.patch, 0002-YARN-4479.patch
>
>
> Currently, the same ordering policy is used for pending applications and active applications. When priority is configured for applications, high-priority applications get activated first during recovery, even though a low-priority job may have been submitted earlier and already been running. This can starve the low-priority job after recovery.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
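The starvation the description calls out, where recovered apps are re-activated purely by priority and leapfrog a low-priority app that was already running before the restart, can be sketched with a toy ordering policy. AppInfo and both comparators below are illustrative stand-ins, not the actual YARN OrderingPolicy API.

```java
import java.util.Comparator;

// Toy model of the recovery-activation ordering discussed above.
class AppInfo {
    final String id;
    final int priority;        // higher value = higher priority
    final long submitTime;
    final boolean wasRunning;  // true if the app was RUNNING before restart

    AppInfo(String id, int priority, long submitTime, boolean wasRunning) {
        this.id = id;
        this.priority = priority;
        this.submitTime = submitTime;
        this.wasRunning = wasRunning;
    }

    // Priority-only policy: the behavior the JIRA flags as problematic
    // during recovery (highest priority first, then submit time).
    static final Comparator<AppInfo> PRIORITY_ONLY =
        Comparator.comparingInt((AppInfo a) -> -a.priority)
                  .thenComparingLong(a -> a.submitTime);

    // Recovery-aware policy: previously running apps re-activate first,
    // then the usual priority order applies to the remaining pending apps.
    static final Comparator<AppInfo> RECOVERY_AWARE =
        Comparator.comparing((AppInfo a) -> !a.wasRunning)
                  .thenComparing(PRIORITY_ONLY);
}
```

Under PRIORITY_ONLY, a freshly submitted high-priority app sorts ahead of a low-priority app that was already running; under RECOVERY_AWARE the running app keeps its slot, which is the outcome the discussion is aiming for.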