[jira] [Commented] (YARN-1986) After upgrade from 2.2.0 to 2.4.0, NPE on first job start.
[ https://issues.apache.org/jira/browse/YARN-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992511#comment-13992511 ]

Hadoop QA commented on YARN-1986:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12643895/YARN-1986-3.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3716//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3716//console

This message is automatically generated.

After upgrade from 2.2.0 to 2.4.0, NPE on first job start.
----------------------------------------------------------

                Key: YARN-1986
                URL: https://issues.apache.org/jira/browse/YARN-1986
            Project: Hadoop YARN
         Issue Type: Bug
   Affects Versions: 2.4.0
           Reporter: Jon Bringhurst
           Assignee: Hong Zhiguo
        Attachments: YARN-1986-2.patch, YARN-1986-3.patch, YARN-1986-testcase.patch, YARN-1986.patch

After upgrade from 2.2.0 to 2.4.0, NPE on first job start. After RM was restarted, the job runs without a problem.

{noformat}
19:11:13,441 FATAL ResourceManager:600 - Error in handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:462)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:714)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:743)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:104)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
    at java.lang.Thread.run(Thread.java:744)
19:11:13,443 INFO ResourceManager:604 - Exiting, bbye..
{noformat}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
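The trace shows the scheduler's event-dispatcher thread dying on an unguarded dereference, which takes down the whole RM. A minimal, hypothetical Java sketch of that failure mode and the defensive guard (class, method, and field names are invented for illustration; this is not the FifoScheduler source):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a NODE_UPDATE-style event triggers a lookup of
// per-application scheduling state. An application left over from the
// pre-upgrade RM has no entry, so an unguarded dereference would throw
// NullPointerException and kill the dispatcher thread. A null guard
// lets the event be skipped instead.
public class NullGuardSketch {
    static final Map<String, Integer> pendingByAttempt = new HashMap<>();

    // Returns the number of containers assigned; 0 for unknown attempts.
    static int assignContainers(String appAttemptId) {
        Integer pending = pendingByAttempt.get(appAttemptId);
        if (pending == null) {
            return 0; // unknown application: skip instead of dereferencing null
        }
        return pending;
    }

    public static void main(String[] args) {
        pendingByAttempt.put("appattempt_001", 3);
        System.out.println(assignContainers("appattempt_001")); // 3
        System.out.println(assignContainers("appattempt_999")); // 0, no NPE
    }
}
```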
[jira] [Commented] (YARN-2018) TestClientRMService.testTokenRenewalWrongUser fails after HADOOP-10562
[ https://issues.apache.org/jira/browse/YARN-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992797#comment-13992797 ]

Hudson commented on YARN-2018:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #1751 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1751/])
YARN-2018. TestClientRMService.testTokenRenewalWrongUser fails after HADOOP-10562. (Contributed by Ming Ma) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1592783)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java

TestClientRMService.testTokenRenewalWrongUser fails after HADOOP-10562
----------------------------------------------------------------------

                Key: YARN-2018
                URL: https://issues.apache.org/jira/browse/YARN-2018
            Project: Hadoop YARN
         Issue Type: Test
   Affects Versions: 2.5.0
           Reporter: Tsuyoshi OZAWA
           Assignee: Ming Ma
        Attachments: YARN-2018.patch

The test failure is observed on YARN-1945 and YARN-1861.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (YARN-1813) Better error message for yarn logs when permission denied
[ https://issues.apache.org/jira/browse/YARN-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993513#comment-13993513 ]

Hadoop QA commented on YARN-1813:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12642758/YARN-1813.3.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3726//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3726//console

This message is automatically generated.

Better error message for yarn logs when permission denied
---------------------------------------------------------

                Key: YARN-1813
                URL: https://issues.apache.org/jira/browse/YARN-1813
            Project: Hadoop YARN
         Issue Type: Improvement
   Affects Versions: 2.3.0
           Reporter: Andrew Wang
           Assignee: Tsuyoshi OZAWA
           Priority: Minor
        Attachments: YARN-1813.1.patch, YARN-1813.2.patch, YARN-1813.2.patch, YARN-1813.3.patch

I ran some MR jobs as the hdfs user, and then forgot to sudo -u when grabbing the logs. yarn logs prints an error message like the following:

{noformat}
[andrew.wang@a2402 ~]$ yarn logs -applicationId application_1394482121761_0010
14/03/10 16:05:10 INFO client.RMProxy: Connecting to ResourceManager at a2402.halxg.cloudera.com/10.20.212.10:8032
Logs not available at /tmp/logs/andrew.wang/logs/application_1394482121761_0010
Log aggregation has not completed or is not enabled.
{noformat}

It'd be nicer if it said Permission denied or AccessControlException or something like that instead, since that's the real issue.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
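The requested improvement amounts to distinguishing "permission denied" from "directory missing" instead of collapsing both into one generic message. A hypothetical sketch of that distinction using plain java.nio rather than the Hadoop FileSystem API (class and messages are invented; the real fix would catch Hadoop's AccessControlException in the log-reader path):

```java
import java.io.IOException;
import java.nio.file.AccessDeniedException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch: map the concrete I/O failure to a specific user-facing message,
// so a permissions problem is not reported as "log aggregation not enabled".
public class LogDirMessage {
    static String describe(String dir) {
        try (DirectoryStream<Path> s = Files.newDirectoryStream(Paths.get(dir))) {
            return "Logs found at " + dir;
        } catch (AccessDeniedException e) {
            return "Permission denied: " + dir; // the real cause in this report
        } catch (NoSuchFileException e) {
            return "Log aggregation has not completed or is not enabled.";
        } catch (IOException e) {
            return "Error reading " + dir + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A path that does not exist falls into the "not aggregated" branch.
        System.out.println(describe("/definitely-missing/logs/app_0010"));
    }
}
```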
[jira] [Commented] (YARN-1813) Better error message for yarn logs when permission denied
[ https://issues.apache.org/jira/browse/YARN-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993503#comment-13993503 ]

Junping Du commented on YARN-1813:
----------------------------------

Kick off Jenkins test manually.

Better error message for yarn logs when permission denied
---------------------------------------------------------

                Key: YARN-1813
                URL: https://issues.apache.org/jira/browse/YARN-1813
            Project: Hadoop YARN
         Issue Type: Improvement
   Affects Versions: 2.3.0
           Reporter: Andrew Wang
           Assignee: Tsuyoshi OZAWA
           Priority: Minor
        Attachments: YARN-1813.1.patch, YARN-1813.2.patch, YARN-1813.2.patch, YARN-1813.3.patch

I ran some MR jobs as the hdfs user, and then forgot to sudo -u when grabbing the logs. yarn logs prints an error message like the following:

{noformat}
[andrew.wang@a2402 ~]$ yarn logs -applicationId application_1394482121761_0010
14/03/10 16:05:10 INFO client.RMProxy: Connecting to ResourceManager at a2402.halxg.cloudera.com/10.20.212.10:8032
Logs not available at /tmp/logs/andrew.wang/logs/application_1394482121761_0010
Log aggregation has not completed or is not enabled.
{noformat}

It'd be nicer if it said Permission denied or AccessControlException or something like that instead, since that's the real issue.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (YARN-666) [Umbrella] Support rolling upgrades in YARN
[ https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993506#comment-13993506 ]

Junping Du commented on YARN-666:
---------------------------------

Link to two related JIRAs - work preserving during RM and NM restart.

[Umbrella] Support rolling upgrades in YARN
-------------------------------------------

                Key: YARN-666
                URL: https://issues.apache.org/jira/browse/YARN-666
            Project: Hadoop YARN
         Issue Type: Improvement
   Affects Versions: 2.0.4-alpha
           Reporter: Siddharth Seth
        Attachments: YARN_Rolling_Upgrades.pdf, YARN_Rolling_Upgrades_v2.pdf

Jira to track changes required in YARN to allow rolling upgrades, including documentation and possible upgrade routes.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (YARN-1701) Improve default paths of timeline store and generic history store
[ https://issues.apache.org/jira/browse/YARN-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993435#comment-13993435 ]

Hudson commented on YARN-1701:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #5601 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5601/])
YARN-1701. Improved default paths of the timeline store and the generic history store. Contributed by Tsuyoshi Ozawa. (zjshen: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1593481)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml

Improve default paths of timeline store and generic history store
-----------------------------------------------------------------

                Key: YARN-1701
                URL: https://issues.apache.org/jira/browse/YARN-1701
            Project: Hadoop YARN
         Issue Type: Sub-task
   Affects Versions: 2.4.0
           Reporter: Gera Shegalov
           Assignee: Tsuyoshi OZAWA
            Fix For: 2.4.1
        Attachments: YARN-1701.3.patch, YARN-1701.v01.patch, YARN-1701.v02.patch

When I enable AHS via yarn.ahs.enabled, the app history is still not visible in AHS webUI. This is due to NullApplicationHistoryStore as yarn.resourcemanager.history-writer.class. It would be good to have just one key to enable basic functionality. yarn.ahs.fs-history-store.uri uses {code}${hadoop.log.dir}{code}, which is a local file system location. However, FileSystemApplicationHistoryStore uses DFS by default.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
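For operators who prefer explicit settings over the defaults this patch changes, a sketch of yarn-site.xml overrides for the two stores involved. The property names below are from the 2.4-era yarn-default.xml as best as can be determined; verify them against the yarn-default.xml shipped with your release before relying on them, and treat the paths as placeholders.

```xml
<!-- Illustrative yarn-site.xml overrides; property names and values are
     assumptions to be checked against your release's yarn-default.xml. -->
<property>
  <name>yarn.timeline-service.leveldb-timeline-store.path</name>
  <value>/var/lib/hadoop-yarn/timeline</value>
</property>
<property>
  <name>yarn.timeline-service.generic-application-history.fs-history-store.uri</name>
  <value>hdfs:///yarn/generic-history</value>
</property>
```

Pinning these explicitly sidesteps the ${hadoop.log.dir} ambiguity the issue describes, since the history store writes to DFS while the interpolated default pointed at a local filesystem path.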
[jira] [Updated] (YARN-1986) After upgrade from 2.2.0 to 2.4.0, NPE on first job start.
[ https://issues.apache.org/jira/browse/YARN-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-1986:
------------------------------------------
            Priority: Critical  (was: Major)
    Target Version/s: 2.5.0

After upgrade from 2.2.0 to 2.4.0, NPE on first job start.
----------------------------------------------------------

                Key: YARN-1986
                URL: https://issues.apache.org/jira/browse/YARN-1986
            Project: Hadoop YARN
         Issue Type: Bug
   Affects Versions: 2.4.0
           Reporter: Jon Bringhurst
           Assignee: Hong Zhiguo
           Priority: Critical
        Attachments: YARN-1986-2.patch, YARN-1986-3.patch, YARN-1986-testcase.patch, YARN-1986.patch

After upgrade from 2.2.0 to 2.4.0, NPE on first job start. -After RM was restarted, the job runs without a problem.-

{noformat}
19:11:13,441 FATAL ResourceManager:600 - Error in handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.assignContainers(FifoScheduler.java:462)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.nodeUpdate(FifoScheduler.java:714)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:743)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.handle(FifoScheduler.java:104)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
    at java.lang.Thread.run(Thread.java:744)
19:11:13,443 INFO ResourceManager:604 - Exiting, bbye..
{noformat}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (YARN-1982) Rename the daemon name to timelineserver
[ https://issues.apache.org/jira/browse/YARN-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-1982:
------------------------------------------
    Issue Type: Sub-task  (was: Bug)
        Parent: YARN-1530

Rename the daemon name to timelineserver
----------------------------------------

                Key: YARN-1982
                URL: https://issues.apache.org/jira/browse/YARN-1982
            Project: Hadoop YARN
         Issue Type: Sub-task
   Affects Versions: 3.0.0, 2.4.0
           Reporter: Zhijie Shen
           Assignee: Zhijie Shen
             Labels: cli
        Attachments: YARN-1982.1.patch

Nowadays, it's confusing that we call the new component timeline server, but we use
{code}
yarn historyserver
yarn-daemon.sh start historyserver
{code}
to start the daemon. Before the confusion keeps being propagated, we'd better to modify command line asap.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (YARN-1982) Rename the daemon name to timelineserver
[ https://issues.apache.org/jira/browse/YARN-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994333#comment-13994333 ]

Vinod Kumar Vavilapalli commented on YARN-1982:
-----------------------------------------------

Okay, looks fine to me too. Checking this in.

Rename the daemon name to timelineserver
----------------------------------------

                Key: YARN-1982
                URL: https://issues.apache.org/jira/browse/YARN-1982
            Project: Hadoop YARN
         Issue Type: Sub-task
   Affects Versions: 3.0.0, 2.4.0
           Reporter: Zhijie Shen
           Assignee: Zhijie Shen
             Labels: cli
        Attachments: YARN-1982.1.patch

Nowadays, it's confusing that we call the new component timeline server, but we use
{code}
yarn historyserver
yarn-daemon.sh start historyserver
{code}
to start the daemon. Before the confusion keeps being propagated, we'd better to modify command line asap.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (YARN-1868) YARN status web ui does not show correctly in IE 11
[ https://issues.apache.org/jira/browse/YARN-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-1868:
------------------------------------------
    Target Version/s: 2.5.0

YARN status web ui does not show correctly in IE 11
---------------------------------------------------

                Key: YARN-1868
                URL: https://issues.apache.org/jira/browse/YARN-1868
            Project: Hadoop YARN
         Issue Type: Bug
         Components: webapp
   Affects Versions: 3.0.0
           Reporter: Chuan Liu
           Assignee: Chuan Liu
        Attachments: YARN-1868.1.patch, YARN-1868.2.patch, YARN-1868.patch, YARN_status.png

The YARN status web ui does not show correctly in IE 11. The drop down menu for app entries are not shown. Also the navigation menu displays incorrectly.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Created] (YARN-2043) Rename internal names to being Timeline Service instead of application history
Vinod Kumar Vavilapalli created YARN-2043:
---------------------------------------------

             Summary: Rename internal names to being Timeline Service instead of application history
                 Key: YARN-2043
                 URL: https://issues.apache.org/jira/browse/YARN-2043
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: Vinod Kumar Vavilapalli

Like package and class names. In line with YARN-2033, YARN-1982 etc.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (YARN-1864) Fair Scheduler Dynamic Hierarchical User Queues
[ https://issues.apache.org/jira/browse/YARN-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992832#comment-13992832 ]

Hudson commented on YARN-1864:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1777 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1777/])
YARN-1864. Add missing file FSQueueType.java (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1593191)
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueType.java
YARN-1864. Fair Scheduler Dynamic Hierarchical User Queues (Ashwin Shankar via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1593190)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestQueueManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestQueuePlacementPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm

Fair Scheduler Dynamic Hierarchical User Queues
-----------------------------------------------

                Key: YARN-1864
                URL: https://issues.apache.org/jira/browse/YARN-1864
            Project: Hadoop YARN
         Issue Type: New Feature
         Components: scheduler
           Reporter: Ashwin Shankar
           Assignee: Ashwin Shankar
             Labels: scheduler
            Fix For: 2.5.0
        Attachments: YARN-1864-v1.txt, YARN-1864-v2.txt, YARN-1864-v3.txt, YARN-1864-v4.txt, YARN-1864-v5.txt, YARN-1864-v6.txt, YARN-1864-v6.txt

In Fair Scheduler, we want to be able to create user queues under any parent queue in the hierarchy. For eg. say user1 submits a job to a parent queue called root.allUserQueues, we want to be able to create a new queue called root.allUserQueues.user1 and run user1's job in it. Any further jobs submitted by this user to root.allUserQueues will be run in this newly created root.allUserQueues.user1. This is very similar to the 'user-as-default' feature in Fair Scheduler which creates user queues under root queue. But we want the ability to create user queues under ANY parent queue.

Why do we want this?
1. Preemption: these dynamically created user queues can preempt each other if their fair share is not met. So there is fairness among users. User queues can also preempt other non-user leaf queues as well if below fair share.
2. Allocation to user queues: we want all the user queries (adhoc) to consume only a fraction of resources in the shared cluster. By creating this feature, we could do that by giving a fair share to the parent user queue which is then redistributed to all the dynamically created user queues.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
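For orientation, a sketch of what the allocation file for the scenario above could look like. The nestedUserQueue placement rule and nesting syntax are taken from the FairScheduler documentation this patch updates, but the queue name and rule combination here are illustrative assumptions, not a configuration confirmed by the JIRA:

```xml
<?xml version="1.0"?>
<allocations>
  <!-- Parent queue under which per-user queues are created on demand. -->
  <queue name="allUserQueues" type="parent">
    <weight>1.0</weight>
  </queue>
  <queuePlacementPolicy>
    <!-- If a user submits to root.allUserQueues, run the job in
         root.allUserQueues.<user>, creating that queue if needed. -->
    <rule name="nestedUserQueue">
      <rule name="specified"/>
    </rule>
    <rule name="default"/>
  </queuePlacementPolicy>
</allocations>
```

Giving root.allUserQueues its own weight then caps the aggregate share of all ad-hoc user queues, which is redistributed fairly among them as described in point 2.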
[jira] [Updated] (YARN-2011) Typo in TestLeafQueue
[ https://issues.apache.org/jira/browse/YARN-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chen He updated YARN-2011:
--------------------------
    Attachment: YARN-2011-v2.patch

patch updated.

Typo in TestLeafQueue
---------------------

                Key: YARN-2011
                URL: https://issues.apache.org/jira/browse/YARN-2011
            Project: Hadoop YARN
         Issue Type: Test
   Affects Versions: 2.4.0
           Reporter: Chen He
           Assignee: Chen He
           Priority: Trivial
        Attachments: YARN-2011-v2.patch, YARN-2011.patch

a.assignContainers(clusterResource, node_0);
assertEquals(2*GB, a.getUsedResources().getMemory());
assertEquals(2*GB, app_0.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_1.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // User limit = 2G
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // User limit = 2G
// Again one to user_0 since he hasn't exceeded user limit yet
a.assignContainers(clusterResource, node_0);
assertEquals(3*GB, a.getUsedResources().getMemory());
assertEquals(2*GB, app_0.getCurrentConsumption().getMemory());
assertEquals(1*GB, app_1.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // 3G - 2G
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // 3G - 2G

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (YARN-1962) Timeline server is enabled by default
[ https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994351#comment-13994351 ]

Vinod Kumar Vavilapalli commented on YARN-1962:
-----------------------------------------------

This will have to change again as soon as we want to call Timeline Service stable. But the patch is okay for the short term. I had reviewed all the previous patches for timeline-service and this patch looks fine to me. Checking this in.

Timeline server is enabled by default
-------------------------------------

                Key: YARN-1962
                URL: https://issues.apache.org/jira/browse/YARN-1962
            Project: Hadoop YARN
         Issue Type: Sub-task
   Affects Versions: 2.4.0
           Reporter: Mohammad Kamrul Islam
           Assignee: Mohammad Kamrul Islam
        Attachments: YARN-1962.1.patch, YARN-1962.2.patch

Since Timeline server is not matured and secured yet, enabling it by default might create some confusion. We were playing with 2.4.0 and found a lot of exceptions for the distributed shell example related to a connection refused error. Btw, we didn't run TS because it is not secured yet, although it is possible to explicitly turn it off through yarn-site config. In my opinion, this extra change for this new service is not worthy at this point. This JIRA is to turn it off by default. If there is an agreement, I can put a simple patch about this.

{noformat}
14/04/17 23:24:33 ERROR impl.TimelineClientImpl: Failed to get the response from the timeline server.
com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: Connection refused
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
    at com.sun.jersey.api.client.Client.handle(Client.java:648)
    at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
    at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
    at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
    at sun.net.www.http.HttpClient.in
14/04/17 23:24:33 ERROR impl.TimelineClientImpl: Failed to get the response from the timeline server.
com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: Connection refused
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
    at com.sun.jersey.api.client.Client.handle(Client.java:648)
    at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
    at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
    at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at
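Until the default flips, the explicit off switch the reporter alludes to ("it is possible to explicitly turn it off through yarn-site config") is the yarn.timeline-service.enabled property in yarn-site.xml; a minimal sketch:

```xml
<!-- yarn-site.xml: disable the timeline service explicitly so clients such
     as the distributed shell AM do not attempt to post entities to it. -->
<property>
  <name>yarn.timeline-service.enabled</name>
  <value>false</value>
</property>
```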
[jira] [Commented] (YARN-2010) RM can't transition to active if it can't recover an app attempt
[ https://issues.apache.org/jira/browse/YARN-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993159#comment-13993159 ]

Hadoop QA commented on YARN-2010:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12643991/yarn-2010-2.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3722//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3722//console

This message is automatically generated.

RM can't transition to active if it can't recover an app attempt
----------------------------------------------------------------

                Key: YARN-2010
                URL: https://issues.apache.org/jira/browse/YARN-2010
            Project: Hadoop YARN
         Issue Type: Bug
         Components: resourcemanager
   Affects Versions: 2.3.0
           Reporter: bc Wong
           Assignee: Rohith
           Priority: Critical
        Attachments: YARN-2010.1.patch, YARN-2010.patch, yarn-2010-2.patch

If the RM fails to recover an app attempt, it won't come up. We should make it more resilient. Specifically, the underlying error is that the app was submitted before Kerberos security got turned on. Makes sense for the app to fail in this case. But YARN should still start.

{noformat}
2014-04-11 11:56:37,216 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
    at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:118)
    at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:804)
    at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:274)
    at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:116)
    ... 4 more
Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.hadoop.yarn.exceptions.YarnException: java.lang.IllegalArgumentException: Missing argument
    at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:204)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:811)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:842)
    at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:265)
    ... 5 more
Caused by: org.apache.hadoop.yarn.exceptions.YarnException: java.lang.IllegalArgumentException: Missing argument
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:372)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:273)
    at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:406)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1000)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:462)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    ... 8 more
Caused by: java.lang.IllegalArgumentException:
[jira] [Commented] (YARN-2011) Typo in TestLeafQueue
[ https://issues.apache.org/jira/browse/YARN-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994359#comment-13994359 ]

Chen He commented on YARN-2011:
-------------------------------

I am happy to. Thank you for the update [~djp].

Typo in TestLeafQueue
---------------------

                Key: YARN-2011
                URL: https://issues.apache.org/jira/browse/YARN-2011
            Project: Hadoop YARN
         Issue Type: Test
   Affects Versions: 2.4.0
           Reporter: Chen He
           Assignee: Chen He
           Priority: Trivial
        Attachments: YARN-2011.patch

a.assignContainers(clusterResource, node_0);
assertEquals(2*GB, a.getUsedResources().getMemory());
assertEquals(2*GB, app_0.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_1.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // User limit = 2G
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // User limit = 2G
// Again one to user_0 since he hasn't exceeded user limit yet
a.assignContainers(clusterResource, node_0);
assertEquals(3*GB, a.getUsedResources().getMemory());
assertEquals(2*GB, app_0.getCurrentConsumption().getMemory());
assertEquals(1*GB, app_1.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // 3G - 2G
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // 3G - 2G

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (YARN-2027) YARN ignores host-specific resource requests
[ https://issues.apache.org/jira/browse/YARN-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993156#comment-13993156 ] Chris Riccomini commented on YARN-2027: --- So, running this request with memMb=3584, cpuCores=1, containers=32: {code} protected def requestContainers(memMb: Int, cpuCores: Int, containers: Int) { info(Requesting %d container(s) with %dmb of memory format (containers, memMb)) val capability = Records.newRecord(classOf[Resource]) val priority = Records.newRecord(classOf[Priority]) priority.setPriority(0) capability.setMemory(memMb) capability.setVirtualCores(cpuCores) def getHosts = { val hosts = getNextRoundRobinHosts System.err.println(hosts.toList) hosts } (0 until containers).foreach(idx = amClient.addContainerRequest(new ContainerRequest(capability, getHosts, List(/default-rack).toArray[String], priority, false))) } {code} Prints this in the AM logs: {noformat} List(eat1-app857, eat1-app873, eat1-app880) List(eat1-app854, eat1-app864, eat1-app872) List(eat1-app852, eat1-app873, eat1-app880) List(eat1-app854, eat1-app880, eat1-app867) List(eat1-app875, eat1-app852, eat1-app873) List(eat1-app875, eat1-app852, eat1-app872) List(eat1-app873, eat1-app859, eat1-app880) List(eat1-app854, eat1-app873, eat1-app864) List(eat1-app852, eat1-app874, eat1-app875) List(eat1-app864, eat1-app859, eat1-app880) List(eat1-app874, eat1-app872, eat1-app875) List(eat1-app874, eat1-app873, eat1-app864) List(eat1-app873, eat1-app859, eat1-app858) List(eat1-app874, eat1-app873, eat1-app854) List(eat1-app867, eat1-app880, eat1-app872) List(eat1-app859, eat1-app875, eat1-app880) List(eat1-app875, eat1-app872, eat1-app864) List(eat1-app875, eat1-app867, eat1-app852) List(eat1-app857, eat1-app852, eat1-app867) List(eat1-app872, eat1-app854, eat1-app858) List(eat1-app852, eat1-app872, eat1-app858) List(eat1-app880, eat1-app873, eat1-app857) List(eat1-app859, eat1-app871, eat1-app874) List(eat1-app880, eat1-app874, eat1-app865) 
List(eat1-app867, eat1-app873, eat1-app875)
List(eat1-app857, eat1-app858, eat1-app852)
List(eat1-app857, eat1-app867, eat1-app873)
List(eat1-app857, eat1-app871, eat1-app854)
List(eat1-app874, eat1-app865, eat1-app873)
List(eat1-app852, eat1-app880, eat1-app858)
List(eat1-app875, eat1-app873, eat1-app871)
List(eat1-app854, eat1-app880, eat1-app865)
{noformat}
With DEBUG logging in the RM logs (with no other job on the grid), I see:
{noformat}
21:18:02,958 DEBUG AppSchedulingInfo:135 - update: application=application_1399581102453_0003 request={Priority: 0, Capability: memory:3584, vCores:1, # Containers: 32, Location: *, Relax Locality: false}
21:18:02,958 DEBUG ActiveUsersManager:68 - User my-job-name added to activeUsers, currently: 1
21:18:02,959 DEBUG CapacityScheduler:704 - allocate: post-update
21:18:02,959 DEBUG SchedulerApplicationAttempt:328 - showRequests: application=application_1399581102453_0003 headRoom=memory:736256, vCores:0 currentConsumption=1024
21:18:02,959 DEBUG SchedulerApplicationAttempt:332 - showRequests: application=application_1399581102453_0003 request={Priority: 0, Capability: memory:3584, vCores:1, # Containers: 9, Location: eat1-app875, Relax Locality: true}
21:18:02,959 DEBUG SchedulerApplicationAttempt:332 - showRequests: application=application_1399581102453_0003 request={Priority: 0, Capability: memory:3584, vCores:1, # Containers: 6, Location: eat1-app857, Relax Locality: true}
21:18:02,959 DEBUG SchedulerApplicationAttempt:332 - showRequests: application=application_1399581102453_0003 request={Priority: 0, Capability: memory:3584, vCores:1, # Containers: 11, Location: eat1-app880, Relax Locality: true}
21:18:02,959 DEBUG SchedulerApplicationAttempt:332 - showRequests: application=application_1399581102453_0003 request={Priority: 0, Capability: memory:3584, vCores:1, # Containers: 7, Location: eat1-app854, Relax Locality: true}
21:18:02,959 DEBUG SchedulerApplicationAttempt:332 - showRequests: application=application_1399581102453_0003 request={Priority: 0, Capability: memory:3584, vCores:1, # Containers: 32, Location: /default-rack, Relax Locality: true}
21:18:02,959 DEBUG SchedulerApplicationAttempt:332 - showRequests: application=application_1399581102453_0003 request={Priority: 0, Capability: memory:3584, vCores:1, # Containers: 5, Location: eat1-app858, Relax Locality: true}
21:18:02,959 DEBUG SchedulerApplicationAttempt:332 - showRequests: application=application_1399581102453_0003 request={Priority: 0, Capability: memory:3584, vCores:1, # Containers: 32, Location: *, Relax Locality: false}
21:18:02,959 DEBUG SchedulerApplicationAttempt:332 - showRequests: application=application_1399581102453_0003 request={Priority: 0, Capability: memory:3584, vCores:1, # Containers: 7, Location: eat1-app874, Relax Locality: true}
21:18:02,959 DEBUG
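The request table in the debug log can be pictured with a small, self-contained sketch (plain Java with hypothetical names, not YARN's actual scheduler code): the off-switch {{*}} entry carries the relax-locality flag, and when it is false, a container may only be placed on a node that has its own explicit request entry.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal model of the per-priority request table shown in the RM debug log.
// Keys are locations ("*", "/default-rack", or a hostname); values are
// container counts. With relaxLocality=false on the "*" entry, an allocation
// on a node is only legal if that node has its own explicit entry.
class LocalityModel {
    private final Map<String, Integer> requests = new HashMap<>();
    private boolean relaxLocality = true;

    void addRequest(String location, int containers, boolean relax) {
        requests.merge(location, containers, Integer::sum);
        if (location.equals("*")) {
            relaxLocality = relax; // the ANY entry carries the relax-locality flag
        }
    }

    // Would the scheduler be allowed to place a container on this node?
    boolean canAllocateOn(String host) {
        if (requests.containsKey(host)) {
            return true;      // a node-local request exists for this host
        }
        return relaxLocality; // otherwise only legal when locality is relaxed
    }
}
```

Under this model the log above is self-consistent: hosts such as eat1-app875 have their own entries and are eligible, while any other node is excluded because the {{*}} request was submitted with Relax Locality: false.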
[jira] [Commented] (YARN-1918) Typo in description and error message for 'yarn.resourcemanager.cluster-id'
[ https://issues.apache.org/jira/browse/YARN-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992712#comment-13992712 ] Hadoop QA commented on YARN-1918: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12643921/YARN-1918.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3720//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3720//console This message is automatically generated. Typo in description and error message for 'yarn.resourcemanager.cluster-id' --- Key: YARN-1918 URL: https://issues.apache.org/jira/browse/YARN-1918 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.3.0 Reporter: Devaraj K Assignee: Anandha L Ranganathan Priority: Trivial Labels: newbie Attachments: YARN-1918.1.patch 1. 
In yarn-default.xml:
{code:xml}
<property>
  <description>Name of the cluster. In a HA setting, this is used to ensure the RM participates in leader election fo this cluster and ensures it does not affect other clusters</description>
  <name>yarn.resourcemanager.cluster-id</name>
  <!--<value>yarn-cluster</value>-->
</property>
{code}
Here the line 'election fo this cluster and ensures it does not affect' should be replaced with 'election for this cluster and ensures it does not affect'. 2.
{code:xml}
org.apache.hadoop.HadoopIllegalArgumentException: Configuration doesn't specifyyarn.resourcemanager.cluster-id
	at org.apache.hadoop.yarn.conf.YarnConfiguration.getClusterId(YarnConfiguration.java:1336)
{code}
In the above exception message, a space is missing between the message text and the configuration name. -- This message was sent by Atlassian JIRA (v6.2#6252)
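The second fix amounts to inserting the missing space when the exception message is assembled. A minimal sketch of the corrected concatenation (hypothetical helper, not the actual YarnConfiguration code):

```java
// Builds the error message with a space between the text and the property
// name. Before the fix the two strings were concatenated with no separator,
// yielding "...doesn't specifyyarn.resourcemanager.cluster-id".
class ClusterIdMessage {
    static final String RM_CLUSTER_ID = "yarn.resourcemanager.cluster-id";

    static String missingClusterIdMessage() {
        return "Configuration doesn't specify " + RM_CLUSTER_ID;
    }
}
```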
[jira] [Commented] (YARN-2031) YARN Proxy model doesn't support REST APIs in AMs
[ https://issues.apache.org/jira/browse/YARN-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992762#comment-13992762 ] Steve Loughran commented on YARN-2031: -- We are currently working around this by not redirecting the portion of our web site under the path {{/ws}}: [SliderAmIpFilter|https://svn.apache.org/repos/asf/incubator/slider/trunk/slider-core/src/main/java/org/apache/slider/server/appmaster/web/SliderAmIpFilter.java]. Other AMs will hit the same problem, and we should have a standard fix. YARN Proxy model doesn't support REST APIs in AMs - Key: YARN-2031 URL: https://issues.apache.org/jira/browse/YARN-2031 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Steve Loughran AMs can't support REST APIs because:
# the AM filter redirects all requests to the proxy with a 302 response (not 307)
# the proxy doesn't forward PUT/POST/DELETE verbs
Either the AM filter needs to return 307 and the proxy needs to forward those verbs, or the AM filter should not filter the REST part of the web site. -- This message was sent by Atlassian JIRA (v6.2#6252)
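The Slider workaround linked above boils down to a path-based exemption in the AM IP filter. A minimal sketch of that idea (hypothetical names; see SliderAmIpFilter for the real code):

```java
// Decide whether a request path should be redirected to the RM proxy.
// Paths under an exempted prefix (e.g. the REST API at /ws) are served
// directly by the AM, so PUT/POST/DELETE verbs are not lost in a 302.
class AmPathExemption {
    private final String exemptPrefix;

    AmPathExemption(String exemptPrefix) {
        this.exemptPrefix = exemptPrefix;
    }

    boolean shouldRedirect(String path) {
        return !path.startsWith(exemptPrefix);
    }
}
```

A standard fix would presumably make such a prefix configurable (or fix the proxy itself), rather than every AM reimplementing this filter.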
[jira] [Commented] (YARN-766) TestNodeManagerShutdown in branch-2 should use Shell to form the output path and a format issue in trunk
[ https://issues.apache.org/jira/browse/YARN-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994372#comment-13994372 ] Hudson commented on YARN-766: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1778 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1778/]) YARN-766. TestNodeManagerShutdown in branch-2 should use Shell to form the output path and a format issue in trunk. (Contributed by Siddharth Seth) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1593660) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerShutdown.java TestNodeManagerShutdown in branch-2 should use Shell to form the output path and a format issue in trunk Key: YARN-766 URL: https://issues.apache.org/jira/browse/YARN-766 Project: Hadoop YARN Issue Type: Test Affects Versions: 2.1.0-beta Reporter: Siddharth Seth Assignee: Siddharth Seth Priority: Minor Attachments: YARN-766.branch-2.txt, YARN-766.trunk.txt, YARN-766.txt File scriptFile = new File(tmpDir, "scriptFile.sh"); should be replaced with File scriptFile = Shell.appendScriptExtension(tmpDir, "scriptFile"); to match trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
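The point of Shell.appendScriptExtension is platform awareness; conceptually it behaves like the stand-in below (a hedged sketch, not Hadoop's actual implementation, with the OS name passed in explicitly so the behavior is visible):

```java
import java.io.File;

// Appends the platform-appropriate script extension, in the spirit of
// Hadoop's Shell.appendScriptExtension: .cmd on Windows, .sh elsewhere.
class ScriptExt {
    static boolean isWindows(String osName) {
        return osName.toLowerCase().startsWith("windows");
    }

    static File appendScriptExtension(File dir, String baseName, String osName) {
        return new File(dir, baseName + (isWindows(osName) ? ".cmd" : ".sh"));
    }
}
```

Hard-coding "scriptFile.sh" in the test is what broke it on Windows; deriving the extension keeps the test portable.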
[jira] [Updated] (YARN-1515) Ability to dump the container threads and stop the containers in a single RPC
[ https://issues.apache.org/jira/browse/YARN-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated YARN-1515: Attachment: YARN-1515.v07.patch v07 addressing Jason's review. Thanks! Ability to dump the container threads and stop the containers in a single RPC - Key: YARN-1515 URL: https://issues.apache.org/jira/browse/YARN-1515 Project: Hadoop YARN Issue Type: New Feature Components: api, nodemanager Reporter: Gera Shegalov Assignee: Gera Shegalov Attachments: YARN-1515.v01.patch, YARN-1515.v02.patch, YARN-1515.v03.patch, YARN-1515.v04.patch, YARN-1515.v05.patch, YARN-1515.v06.patch, YARN-1515.v07.patch This is needed to implement MAPREDUCE-5044 to enable thread diagnostics for timed-out task attempts. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2011) Typo in TestLeafQueue
[ https://issues.apache.org/jira/browse/YARN-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994382#comment-13994382 ] Junping Du commented on YARN-2011: -- Patch looks good to me. +1 pending on Jenkins result. Typo in TestLeafQueue - Key: YARN-2011 URL: https://issues.apache.org/jira/browse/YARN-2011 Project: Hadoop YARN Issue Type: Test Affects Versions: 2.4.0 Reporter: Chen He Assignee: Chen He Priority: Trivial Attachments: YARN-2011-v2.patch, YARN-2011.patch
a.assignContainers(clusterResource, node_0);
assertEquals(2*GB, a.getUsedResources().getMemory());
assertEquals(2*GB, app_0.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_1.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // User limit = 2G
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // User limit = 2G
// Again one to user_0 since he hasn't exceeded user limit yet
a.assignContainers(clusterResource, node_0);
assertEquals(3*GB, a.getUsedResources().getMemory());
assertEquals(2*GB, app_0.getCurrentConsumption().getMemory());
assertEquals(1*GB, app_1.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // 3G - 2G
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // 3G - 2G
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (YARN-1861) Both RM stuck in standby mode when automatic failover is enabled
[ https://issues.apache.org/jira/browse/YARN-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli reassigned YARN-1861: - Assignee: Karthik Kambatla (was: Xuan Gong) Looking at this as it is blocking 2.4.1. Assigning to Karthik given he has done the core code change. Will credit Xuan too. I tried to just apply the test-case and run it without the core change, and was expecting the active RM to go to standby and the standby RM to go to active once the originally active RM is fenced. Instead I get an NPE somewhere. Can the test be fixed to do so? Also, we need to make sure that when automatic failover is enabled, all external interventions, like the fence in this bug (and forced manual failover from the CLI?), do a similar reset of the leader election. There may not be cases like this today though. Both RM stuck in standby mode when automatic failover is enabled Key: YARN-1861 URL: https://issues.apache.org/jira/browse/YARN-1861 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Karthik Kambatla Priority: Blocker Attachments: YARN-1861.2.patch, YARN-1861.3.patch, YARN-1861.4.patch, YARN-1861.5.patch, yarn-1861-1.patch, yarn-1861-6.patch In our HA tests we noticed that the tests got stuck because both RMs got into standby state and no one became active. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-766) TestNodeManagerShutdown in branch-2 should use Shell to form the output path and a format issue in trunk
[ https://issues.apache.org/jira/browse/YARN-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994386#comment-13994386 ] Hudson commented on YARN-766: - SUCCESS: Integrated in Hadoop-Hdfs-trunk #1752 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1752/]) YARN-766. TestNodeManagerShutdown in branch-2 should use Shell to form the output path and a format issue in trunk. (Contributed by Siddharth Seth) (junping_du: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1593660) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerShutdown.java TestNodeManagerShutdown in branch-2 should use Shell to form the output path and a format issue in trunk Key: YARN-766 URL: https://issues.apache.org/jira/browse/YARN-766 Project: Hadoop YARN Issue Type: Test Affects Versions: 2.1.0-beta Reporter: Siddharth Seth Assignee: Siddharth Seth Priority: Minor Attachments: YARN-766.branch-2.txt, YARN-766.trunk.txt, YARN-766.txt File scriptFile = new File(tmpDir, "scriptFile.sh"); should be replaced with File scriptFile = Shell.appendScriptExtension(tmpDir, "scriptFile"); to match trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2036) Document yarn.resourcemanager.hostname in ClusterSetup
[ https://issues.apache.org/jira/browse/YARN-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994376#comment-13994376 ] Hudson commented on YARN-2036: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1778 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1778/]) YARN-2036. Document yarn.resourcemanager.hostname in ClusterSetup (Ray Chiang via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1593631) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/ClusterSetup.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt Document yarn.resourcemanager.hostname in ClusterSetup -- Key: YARN-2036 URL: https://issues.apache.org/jira/browse/YARN-2036 Project: Hadoop YARN Issue Type: Bug Components: documentation Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Ray Chiang Priority: Minor Fix For: 2.5.0 Attachments: YARN2036-01.patch, YARN2036-02.patch ClusterSetup doesn't talk about yarn.resourcemanager.hostname - most people should just be able to use that directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1864) Fair Scheduler Dynamic Hierarchical User Queues
[ https://issues.apache.org/jira/browse/YARN-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994347#comment-13994347 ] Hudson commented on YARN-1864: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #560 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/560/]) YARN-1864. Add missing file FSQueueType.java (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1593191) * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueType.java YARN-1864. Fair Scheduler Dynamic Hierarchical User Queues (Ashwin Shankar via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1593190) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementPolicy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestQueueManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestQueuePlacementPolicy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm Fair Scheduler Dynamic Hierarchical User Queues --- Key: YARN-1864 URL: https://issues.apache.org/jira/browse/YARN-1864 Project: Hadoop YARN Issue Type: New Feature Components: scheduler Reporter: Ashwin Shankar Assignee: Ashwin Shankar Labels: scheduler Fix For: 2.5.0 Attachments: YARN-1864-v1.txt, YARN-1864-v2.txt, YARN-1864-v3.txt, YARN-1864-v4.txt, YARN-1864-v5.txt, YARN-1864-v6.txt, YARN-1864-v6.txt In Fair Scheduler, we want to be able to create user queues under any parent queue in the hierarchy. For example, say user1 submits a job to a parent queue called root.allUserQueues; we want to be able to create a new queue called root.allUserQueues.user1 and run user1's job in it. Any further jobs submitted by this user to root.allUserQueues will be run in this newly created root.allUserQueues.user1. This is very similar to the 'user-as-default' feature in Fair Scheduler, which creates user queues under the root queue. But we want the ability to create user queues under ANY parent queue. Why do we want this? 1. Preemption: these dynamically created user queues can preempt each other if their fair share is not met, so there is fairness among users. User queues can also preempt other non-user leaf queues if below fair share. 2. Allocation to user queues: we want all the user queries (ad hoc) to consume only a fraction of resources in the shared cluster. With this feature, we could do that by giving a fair share to the parent user queue, which is then redistributed to all the dynamically created user queues. -- This message was sent by Atlassian JIRA (v6.2#6252)
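The placement rule being described can be pictured as mapping (parent queue, user) to a dynamically created child leaf queue. A minimal sketch (hypothetical names, not the Fair Scheduler's actual QueuePlacementRule API):

```java
// Dynamic hierarchical user queues: each user submitting to a designated
// parent queue gets a child leaf queue named <parent>.<user>, so fair-share
// enforcement and preemption happen per user under that parent.
class UserQueuePlacement {
    static String assignQueue(String parentQueue, String user) {
        return parentQueue + "." + user;
    }
}
```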
[jira] [Updated] (YARN-1702) Expose kill app functionality as part of RM web services
[ https://issues.apache.org/jira/browse/YARN-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-1702: Attachment: apache-yarn-1702.10.patch Patch with release audit warning fixed. Expose kill app functionality as part of RM web services Key: YARN-1702 URL: https://issues.apache.org/jira/browse/YARN-1702 Project: Hadoop YARN Issue Type: Sub-task Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-1702.10.patch, apache-yarn-1702.2.patch, apache-yarn-1702.3.patch, apache-yarn-1702.4.patch, apache-yarn-1702.5.patch, apache-yarn-1702.7.patch, apache-yarn-1702.8.patch, apache-yarn-1702.9.patch Expose functionality to kill an app via the ResourceManager web services API. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1976) Tracking url missing http protocol for FAILED application
[ https://issues.apache.org/jira/browse/YARN-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-1976: - Target Version/s: 2.5.0, 2.4.1 (was: 2.4.1) Tracking url missing http protocol for FAILED application - Key: YARN-1976 URL: https://issues.apache.org/jira/browse/YARN-1976 Project: Hadoop YARN Issue Type: Bug Reporter: Yesha Vora Assignee: Junping Du Attachments: YARN-1976-v2.patch, YARN-1976.patch Run yarn application -list -appStates FAILED; it does not print the http protocol the way FINISHED apps do.
{noformat}
-bash-4.1$ yarn application -list -appStates FINISHED,FAILED,KILLED
14/04/15 23:55:07 INFO client.RMProxy: Connecting to ResourceManager at host
Total number of applications (application-types: [] and states: [FINISHED, FAILED, KILLED]):4
Application-Id                  Application-Name  Application-Type  User    Queue    State     Final-State  Progress  Tracking-URL
application_1397598467870_0004  Sleep job         MAPREDUCE         hrt_qa  default  FINISHED  SUCCEEDED    100%      http://host:19888/jobhistory/job/job_1397598467870_0004
application_1397598467870_0003  Sleep job         MAPREDUCE         hrt_qa  default  FINISHED  SUCCEEDED    100%      http://host:19888/jobhistory/job/job_1397598467870_0003
application_1397598467870_0002  Sleep job         MAPREDUCE         hrt_qa  default  FAILED    FAILED       100%      host:8088/cluster/app/application_1397598467870_0002
application_1397598467870_0001  word count        MAPREDUCE         hrt_qa  default  FINISHED  SUCCEEDED    100%      http://host:19888/jobhistory/job/job_1397598467870_0001
{noformat}
It only prints 'host:8088/cluster/app/application_1397598467870_0002' instead of 'http://host:8088/cluster/app/application_1397598467870_0002'. -- This message was sent by Atlassian JIRA (v6.2#6252)
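A likely shape of the fix is to prepend a scheme whenever the tracking URL lacks one. A hedged sketch (hypothetical helper, not the actual patch):

```java
// Ensure a tracking URL carries an explicit scheme so clients print a
// clickable link; URLs that already have http:// or https:// are untouched.
class TrackingUrl {
    static String withScheme(String url) {
        if (url == null || url.isEmpty()
                || url.startsWith("http://") || url.startsWith("https://")) {
            return url;
        }
        return "http://" + url;
    }
}
```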
[jira] [Updated] (YARN-1864) Fair Scheduler Dynamic Hierarchical User Queues
[ https://issues.apache.org/jira/browse/YARN-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1864: - Assignee: Ashwin Shankar Fair Scheduler Dynamic Hierarchical User Queues --- Key: YARN-1864 URL: https://issues.apache.org/jira/browse/YARN-1864 Project: Hadoop YARN Issue Type: New Feature Components: scheduler Reporter: Ashwin Shankar Assignee: Ashwin Shankar Labels: scheduler Fix For: 2.5.0 Attachments: YARN-1864-v1.txt, YARN-1864-v2.txt, YARN-1864-v3.txt, YARN-1864-v4.txt, YARN-1864-v5.txt, YARN-1864-v6.txt, YARN-1864-v6.txt In Fair Scheduler, we want to be able to create user queues under any parent queue in the hierarchy. For eg. Say user1 submits a job to a parent queue called root.allUserQueues, we want be able to create a new queue called root.allUserQueues.user1 and run user1's job in it.Any further jobs submitted by this user to root.allUserQueues will be run in this newly created root.allUserQueues.user1. This is very similar to the 'user-as-default' feature in Fair Scheduler which creates user queues under root queue. But we want the ability to create user queues under ANY parent queue. Why do we want this ? 1. Preemption : these dynamically created user queues can preempt each other if its fair share is not met. So there is fairness among users. User queues can also preempt other non-user leaf queue as well if below fair share. 2. Allocation to user queues : we want all the user queries(adhoc) to consume only a fraction of resources in the shared cluster. By creating this feature,we could do that by giving a fair share to the parent user queue which is then redistributed to all the dynamically created user queues. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2010) RM can't transition to active if it can't recover an app attempt
[ https://issues.apache.org/jira/browse/YARN-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994143#comment-13994143 ] Jian He commented on YARN-2010: --- Hi folks, thanks for working on this. I agree that we should not fail the RM if an app fails to recover. YARN-2019 seems to be taking care of this. But in this particular case, IIUC, the problem is that the RM was running in non-secure mode and so clientTokenMasterKey is null. After the restart, the RM runs in secure mode, expects clientTokenMasterKey to be non-null, and then fails. In a non-work-preserving restart, since the old attempt is essentially killed on RM restart, a new attempt will be started automatically and it will have a new clientTokenMasterKey generated. So we may not need to fail this app. In a work-preserving restart, because the old AM running before the RM restart (non-secure) was not given the clientToAMMasterKey, even though the RM is now running in secure mode, a client without the clientToken should also be able to talk with the AM? [~vinodkv] is this the case? RM can't transition to active if it can't recover an app attempt Key: YARN-2010 URL: https://issues.apache.org/jira/browse/YARN-2010 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: bc Wong Assignee: Rohith Priority: Critical Attachments: YARN-2010.1.patch, YARN-2010.patch, yarn-2010-2.patch If the RM fails to recover an app attempt, it won't come up. We should make it more resilient. Specifically, the underlying error is that the app was submitted before Kerberos security got turned on. Makes sense for the app to fail in this case. But YARN should still start.
{noformat}
2014-04-11 11:56:37,216 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
	at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:118)
	at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:804)
	at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
	at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:274)
	at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:116)
	... 4 more
Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.hadoop.yarn.exceptions.YarnException: java.lang.IllegalArgumentException: Missing argument
	at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:204)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:811)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:842)
	at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:265)
	... 5 more
Caused by: org.apache.hadoop.yarn.exceptions.YarnException: java.lang.IllegalArgumentException: Missing argument
	at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:372)
	at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:273)
	at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:406)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1000)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:462)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
	... 8 more
Caused by: java.lang.IllegalArgumentException: Missing argument
	at javax.crypto.spec.SecretKeySpec.<init>(SecretKeySpec.java:93)
	at org.apache.hadoop.security.token.SecretManager.createSecretKey(SecretManager.java:188)
	at org.apache.hadoop.yarn.server.resourcemanager.security.ClientToAMTokenSecretManagerInRM.registerMasterKey(ClientToAMTokenSecretManagerInRM.java:49)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.recoverAppAttemptCredentials(RMAppAttemptImpl.java:711)
	at
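The NPE path above comes from passing a null master key into SecretKeySpec during recovery. One defensive option, sketched here with hypothetical names (not the actual RMAppAttemptImpl fix), is to skip key registration when the recovered key is absent:

```java
import java.util.HashMap;
import java.util.Map;

// Recovery-time guard: an attempt persisted while the RM ran in non-secure
// mode has no clientToAMToken master key, so registering it would fail the
// way SecretKeySpec fails on a null/empty key. Skipping registration keeps
// the RM's transition to active from being aborted by one bad attempt.
class RecoveryGuard {
    private final Map<String, byte[]> masterKeys = new HashMap<>();

    boolean registerMasterKeyIfPresent(String attemptId, byte[] keyBytes) {
        if (keyBytes == null || keyBytes.length == 0) {
            return false; // nothing to register; don't fail the whole transition
        }
        masterKeys.put(attemptId, keyBytes);
        return true;
    }
}
```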
[jira] [Commented] (YARN-2011) Typo in TestLeafQueue
[ https://issues.apache.org/jira/browse/YARN-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994395#comment-13994395 ] Junping Du commented on YARN-2011: -- Looks like the Jenkins test did not start automatically; kicking off Jenkins manually. Hi [~vinodkv], I have seen many times recently that Jenkins didn't start automatically when a patch was updated (i.e. YARN-1813). Can you take a look at what's wrong there? Thx! Typo in TestLeafQueue - Key: YARN-2011 URL: https://issues.apache.org/jira/browse/YARN-2011 Project: Hadoop YARN Issue Type: Test Affects Versions: 2.4.0 Reporter: Chen He Assignee: Chen He Priority: Trivial Attachments: YARN-2011-v2.patch, YARN-2011.patch
a.assignContainers(clusterResource, node_0);
assertEquals(2*GB, a.getUsedResources().getMemory());
assertEquals(2*GB, app_0.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_1.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // User limit = 2G
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // User limit = 2G
// Again one to user_0 since he hasn't exceeded user limit yet
a.assignContainers(clusterResource, node_0);
assertEquals(3*GB, a.getUsedResources().getMemory());
assertEquals(2*GB, app_0.getCurrentConsumption().getMemory());
assertEquals(1*GB, app_1.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // 3G - 2G
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // 3G - 2G
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2011) Typo in TestLeafQueue
[ https://issues.apache.org/jira/browse/YARN-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994403#comment-13994403 ] Hadoop QA commented on YARN-2011: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12644309/YARN-2011-v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3731//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3731//console This message is automatically generated. 
Typo in TestLeafQueue - Key: YARN-2011 URL: https://issues.apache.org/jira/browse/YARN-2011 Project: Hadoop YARN Issue Type: Test Affects Versions: 2.4.0 Reporter: Chen He Assignee: Chen He Priority: Trivial Attachments: YARN-2011-v2.patch, YARN-2011.patch
a.assignContainers(clusterResource, node_0);
assertEquals(2*GB, a.getUsedResources().getMemory());
assertEquals(2*GB, app_0.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_1.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // User limit = 2G
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // User limit = 2G
// Again one to user_0 since he hasn't exceeded user limit yet
a.assignContainers(clusterResource, node_0);
assertEquals(3*GB, a.getUsedResources().getMemory());
assertEquals(2*GB, app_0.getCurrentConsumption().getMemory());
assertEquals(1*GB, app_1.getCurrentConsumption().getMemory());
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // 3G - 2G
assertEquals(0*GB, app_0.getHeadroom().getMemory()); // 3G - 2G
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1813) Better error message for yarn logs when permission denied
[ https://issues.apache.org/jira/browse/YARN-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994432#comment-13994432 ] Tsuyoshi OZAWA commented on YARN-1813: -- Thank you, Junping. Better error message for yarn logs when permission denied --- Key: YARN-1813 URL: https://issues.apache.org/jira/browse/YARN-1813 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.3.0 Reporter: Andrew Wang Assignee: Tsuyoshi OZAWA Priority: Minor Attachments: YARN-1813.1.patch, YARN-1813.2.patch, YARN-1813.2.patch, YARN-1813.3.patch I ran some MR jobs as the hdfs user, and then forgot to sudo -u when grabbing the logs. yarn logs prints an error message like the following: {noformat} [andrew.wang@a2402 ~]$ yarn logs -applicationId application_1394482121761_0010 14/03/10 16:05:10 INFO client.RMProxy: Connecting to ResourceManager at a2402.halxg.cloudera.com/10.20.212.10:8032 Logs not available at /tmp/logs/andrew.wang/logs/application_1394482121761_0010 Log aggregation has not completed or is not enabled. {noformat} It'd be nicer if it said Permission denied or AccessControlException or something like that instead, since that's the real issue. -- This message was sent by Atlassian JIRA (v6.2#6252)
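The improvement requested here is essentially better error classification in the CLI. A minimal sketch of the idea (hypothetical helper; the real patch works against the log-aggregation file-system calls, whose exact API is not shown here):

```java
// Translate the underlying failure into a message that names the real cause
// instead of the generic "log aggregation has not completed" text.
class LogsErrorMessage {
    static String describe(boolean pathExists, boolean readable, String path) {
        if (pathExists && !readable) {
            return "Permission denied accessing " + path
                    + " (are you running as the user who owns the logs?)";
        }
        if (!pathExists) {
            return "Logs not available at " + path
                    + ". Log aggregation has not completed or is not enabled.";
        }
        return "Logs available at " + path;
    }
}
```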