[jira] [Commented] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922006#comment-13922006 ]

shenhong commented on YARN-1786:
--------------------------------

Thanks, Jian He. The code I used is from http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.3.0/, but trunk also has this bug.
[jira] [Commented] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920624#comment-13920624 ]

shenhong commented on YARN-1786:
--------------------------------

{code}
  @Test
  public void testAppAcceptedKill() throws IOException, InterruptedException {
    LOG.info("--- START: testAppAcceptedKill ---");
    RMApp application = testCreateAppAccepted(null);
    // ACCEPTED => KILLED event RMAppEventType.KILL
    RMAppEvent event =
        new RMAppEvent(application.getApplicationId(), RMAppEventType.KILL);
    application.handle(event);
    rmDispatcher.await();
    assertAppAndAttemptKilled(application);
  }
{code}

assertAppAndAttemptKilled() then calls sendAttemptUpdateSavedEvent():

{code}
  private void assertAppAndAttemptKilled(RMApp application)
      throws InterruptedException {
    sendAttemptUpdateSavedEvent(application);
    sendAppUpdateSavedEvent(application);
    assertKilled(application);
    Assert.assertEquals(RMAppAttemptState.KILLED,
        application.getCurrentAppAttempt().getAppAttemptState());
    assertAppFinalStateSaved(application);
  }
{code}
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Attachment: YARN-1786.patch
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Attachment: (was: YARN-1786.patch)
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Description:
TestRMAppTransitions often fail with "application finish time is not greater then 0", following is log:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}

  was:
TestRMAppTransitions often fail with "application finish time is not greater then 0", following is log:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Attachment: YARN-1786.patch

Added a patch to fix the bug.
[jira] [Assigned] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong reassigned YARN-1786:
------------------------------

    Assignee: shenhong
[jira] [Commented] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920410#comment-13920410 ]

shenhong commented on YARN-1786:
--------------------------------

Here is the code:

{code}
  private void sendAppUpdateSavedEvent(RMApp application) {
    RMAppEvent event =
        new RMAppUpdateSavedEvent(application.getApplicationId(), null);
    application.handle(event);
    rmDispatcher.await();
  }

  private void sendAttemptUpdateSavedEvent(RMApp application) {
    application.getCurrentAppAttempt().handle(
        new RMAppAttemptUpdateSavedEvent(application.getCurrentAppAttempt()
            .getAppAttemptId(), null));
  }
{code}

In sendAttemptUpdateSavedEvent() there is no rmDispatcher.await() after the event is handled. It should be changed to:

{code}
  private void sendAttemptUpdateSavedEvent(RMApp application) {
    application.getCurrentAppAttempt().handle(
        new RMAppAttemptUpdateSavedEvent(application.getCurrentAppAttempt()
            .getAppAttemptId(), null));
    rmDispatcher.await();
  }
{code}
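As background for why the missing await matters, here is a minimal, self-contained sketch in plain java.util.concurrent (not the YARN DrainDispatcher; every name below is illustrative): asserting immediately after handing work to an asynchronous dispatcher races with the transition, while draining the queue first, which is the role rmDispatcher.await() plays above, makes the assertion deterministic.

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncHandleRace {
    // Stand-in for the finish time that the killed-state transition sets.
    static volatile long finishTime = 0;

    public static void main(String[] args) throws InterruptedException {
        ExecutorService dispatcher = Executors.newSingleThreadExecutor();

        // Analogue of handle(event): the transition runs later, on the dispatcher thread.
        dispatcher.submit(() -> { finishTime = System.currentTimeMillis(); });

        // Asserting here is a coin flip: the transition may not have run yet.
        System.out.println("before draining: finishTime > 0 is " + (finishTime > 0));

        // Analogue of rmDispatcher.await(): drain the queued work before asserting.
        dispatcher.shutdown();
        dispatcher.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("after draining:  finishTime > 0 is " + (finishTime > 0)); // reliably true
    }
}
{code}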
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Description:
TestRMAppTransitions often fail with "application finish time is not greater then 0", following is log:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}

  was:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}
[jira] [Created] (YARN-1786) TestRMAppTransitions occasionally fail
shenhong created YARN-1786:
------------------------------

             Summary: TestRMAppTransitions occasionally fail
                 Key: YARN-1786
                 URL: https://issues.apache.org/jira/browse/YARN-1786
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
    Affects Versions: 2.3.0
            Reporter: shenhong
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Description:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}

  was:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Description:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}
[jira] [Updated] (YARN-647) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-647:
--------------------------
    Attachment: yarn-647-2.patch

Added a new patch.

> historyServer can't show container's log when aggregation is not enabled
> -------------------------------------------------------------------------
>
>                 Key: YARN-647
>                 URL: https://issues.apache.org/jira/browse/YARN-647
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 0.23.7, 2.0.4-alpha, 2.2.0
>         Environment: yarn.log-aggregation-enable=false; the HistoryServer shows:
>                      Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669
>            Reporter: shenhong
>            Assignee: shenhong
>         Attachments: yarn-647-2.patch, yarn-647.patch
>
> When yarn.log-aggregation-enable is set to false, we can't view a container's logs from the
> HistoryServer after an MR app completes; it shows a message like:
> Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669
> We don't want to aggregate container logs, because that puts pressure on the namenode, but
> sometimes we still want to look at a container's logs. Should the HistoryServer show the
> container's logs even when yarn.log-aggregation-enable is set to false?
[jira] [Commented] (YARN-647) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857414#comment-13857414 ]

shenhong commented on YARN-647:
-------------------------------

Thanks, Zhijie! Like caolong, we also set yarn.nodemanager.log.retain-seconds=259200, so the NodeManager's local logs are not deleted after a container stops. I think that when yarn.log-aggregation-enable=false and yarn.nodemanager.log.retain-seconds > 0, we can change the logsLink.
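A rough sketch of the check this comment seems to be proposing: a hypothetical helper, not code from the HistoryServer or from the attached patch. Only the two property names are standard YARN configuration keys; the class name, method name and the default values used here are illustrative.

{code}
import org.apache.hadoop.conf.Configuration;

public class NmLogLinkPolicy {

    /**
     * True when it would be reasonable for the history server to point the
     * logs link at the NodeManager's local log page instead of reporting
     * "Aggregation is not enabled".
     */
    public static boolean canLinkToNodeManagerLogs(Configuration conf) {
        // Aggregation is off, so there is nothing to fetch from the aggregated location...
        boolean aggregationEnabled =
                conf.getBoolean("yarn.log-aggregation-enable", false);
        // ...but the NM still keeps local logs around for this many seconds.
        long retainSeconds =
                conf.getLong("yarn.nodemanager.log.retain-seconds", 10800L);
        return !aggregationEnabled && retainSeconds > 0;
    }
}
{code}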
[jira] [Created] (YARN-1537) TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
shenhong created YARN-1537:
------------------------------

             Summary: TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
                 Key: YARN-1537
                 URL: https://issues.apache.org/jira/browse/YARN-1537
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
    Affects Versions: 2.2.0
            Reporter: shenhong
[jira] [Updated] (YARN-1537) TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
[ https://issues.apache.org/jira/browse/YARN-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1537:
---------------------------
    Description:
Here is the error log:
{code}
Results :

Failed tests:
  TestLocalResourcesTrackerImpl.testLocalResourceCache:351
Wanted but not invoked:
eventHandler.handle(
    isA(org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerResourceLocalizedEvent)
);
-> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl.testLocalResourceCache(TestLocalResourcesTrackerImpl.java:351)

However, there were other interactions with this mock:
-> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
-> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
{code}
[jira] [Commented] (YARN-1472) TestCase failed at username with “.”
[ https://issues.apache.org/jira/browse/YARN-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840024#comment-13840024 ]

shenhong commented on YARN-1472:
--------------------------------

This causes the test case to fail for usernames containing '.'. It should be changed to:

{code}
String regex = CONF_HADOOP_PROXYUSER_RE+"*\\"+CONF_GROUPS;
{code}
[jira] [Commented] (YARN-1472) TestCase failed at username with “.”
[ https://issues.apache.org/jira/browse/YARN-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838669#comment-13838669 ]

shenhong commented on YARN-1472:
--------------------------------

The reason is in org.apache.hadoop.security.authorize.ProxyUsers:

{code}
    String regex = CONF_HADOOP_PROXYUSER_RE+"[^.]*\\"+CONF_GROUPS;
    Map<String, String> allMatchKeys = conf.getValByRegex(regex);
    for(Entry<String, String> entry : allMatchKeys.entrySet()) {
      proxyGroups.put(entry.getKey(),
          StringUtils.getStringCollection(entry.getValue()));
    }
{code}

The regex only matches usernames without a "." in them. Why?
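To make the effect concrete, here is a small standalone demonstration (not the actual ProxyUsers code) of how the [^.]* username portion fails to match the configuration key generated for a dotted username; the relaxed pattern is included only for comparison and is not the exact fix proposed above.

{code}
import java.util.regex.Pattern;

public class ProxyUserRegexDemo {
    public static void main(String[] args) {
        // The key generated for hadoop.proxyuser.<user>.groups when the
        // user name itself contains a dot.
        String key = "hadoop.proxyuser.yuling.sh.groups";

        // Expansion of CONF_HADOOP_PROXYUSER_RE + "[^.]*\\" + CONF_GROUPS:
        // the username portion may not contain '.'.
        Pattern current = Pattern.compile("hadoop\\.proxyuser\\.[^.]*\\.groups");

        // A looser pattern that also accepts dotted usernames (comparison only).
        Pattern relaxed = Pattern.compile("hadoop\\.proxyuser\\..*\\.groups");

        System.out.println(current.matcher(key).matches()); // false: the key is never picked up
        System.out.println(relaxed.matcher(key).matches()); // true
    }
}
{code}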
[jira] [Updated] (YARN-1472) TestCase failed at username with “.”
[ https://issues.apache.org/jira/browse/YARN-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1472:
---------------------------
    Description:
When I run a testcase by user yuling.sh :
{code}
mvn test -Dtest=TestFileSystemAccessService#createFileSystem
{code}
it will failed with exception :
{code}
org.apache.hadoop.ipc.RemoteException: User: yuling.sh is not allowed to impersonate u
  at org.apache.hadoop.ipc.Client.call(Client.java:1412)
  at org.apache.hadoop.ipc.Client.call(Client.java:1365)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
  at $Proxy17.mkdirs(Unknown Source)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  ...
{code}

  was:
When I run a testcase by user yuling.sh :
mvn test -Dtest=TestFileSystemAccessService#createFileSystem
it will failed with exception :
org.apache.hadoop.ipc.RemoteException: User: yuling.sh is not allowed to impersonate u
  at org.apache.hadoop.ipc.Client.call(Client.java:1412)
  at org.apache.hadoop.ipc.Client.call(Client.java:1365)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
  at $Proxy17.mkdirs(Unknown Source)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  ...
[jira] [Updated] (YARN-1472) TestCase failed at username with “.”
[ https://issues.apache.org/jira/browse/YARN-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1472:
---------------------------
    Description:
When I run a testcase by user yuling.sh :
mvn test -Dtest=TestFileSystemAccessService#createFileSystem
it will failed with exception :
org.apache.hadoop.ipc.RemoteException: User: yuling.sh is not allowed to impersonate u
  at org.apache.hadoop.ipc.Client.call(Client.java:1412)
  at org.apache.hadoop.ipc.Client.call(Client.java:1365)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
  at $Proxy17.mkdirs(Unknown Source)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  ...

  was:
When I run a testcase by user yuling.sh :
mvn test -Dtest=TestFileSystemAccessService#createFileSystem
it will failed with exception :
Running org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 5.236 sec <<< FAILURE! - in org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService
createFileSystem(org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService)  Time elapsed: 5.14 sec  <<< ERROR!
org.apache.hadoop.ipc.RemoteException: User: yuling.sh is not allowed to impersonate u
  at org.apache.hadoop.ipc.Client.call(Client.java:1412)
[jira] [Created] (YARN-1472) TestCase failed at username with “.”
shenhong created YARN-1472:
------------------------------

             Summary: TestCase failed at username with “.”
                 Key: YARN-1472
                 URL: https://issues.apache.org/jira/browse/YARN-1472
             Project: Hadoop YARN
          Issue Type: Bug
    Affects Versions: 2.2.0
            Reporter: shenhong

When I run a testcase by user yuling.sh :
mvn test -Dtest=TestFileSystemAccessService#createFileSystem
it will failed with exception :
Running org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 5.236 sec <<< FAILURE! - in org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService
createFileSystem(org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService)  Time elapsed: 5.14 sec  <<< ERROR!
org.apache.hadoop.ipc.RemoteException: User: yuling.sh is not allowed to impersonate u
  at org.apache.hadoop.ipc.Client.call(Client.java:1412)
[jira] [Updated] (YARN-1313) Userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1313:
---------------------------
    Affects Version/s: 2.2.0
[jira] [Updated] (YARN-1313) Userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1313:
---------------------------
    Description:
At present, userlogs are not deleted at NodeManager start; I think we should delete userlogs before the NM starts.

  was:
At present, userlog hadn't delete at NodeManager start, should we delete then?
[jira] [Updated] (YARN-1313) Userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1313:
---------------------------
    Component/s: nodemanager
[jira] [Updated] (YARN-1313) Userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-1313: --- Description: At present, userlog are hadn't delete at NodeManager start, should we delete then? (was: At present, userlog should be delete at NodeManager start, ) > Userlog hadn't delete at NodeManager start > -- > > Key: YARN-1313 > URL: https://issues.apache.org/jira/browse/YARN-1313 > Project: Hadoop YARN > Issue Type: Bug >Reporter: shenhong > > At present, userlog are hadn't delete at NodeManager start, should we delete > then? -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1313) Userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-1313: --- Description: At present, userlog hadn't delete at NodeManager start, should we delete then? (was: At present, userlog are hadn't delete at NodeManager start, should we delete then?) > Userlog hadn't delete at NodeManager start > -- > > Key: YARN-1313 > URL: https://issues.apache.org/jira/browse/YARN-1313 > Project: Hadoop YARN > Issue Type: Bug >Reporter: shenhong > > At present, userlog hadn't delete at NodeManager start, should we delete > then? -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1313) userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-1313: --- Description: At present, userlog should be delete at NodeManager start, > userlog hadn't delete at NodeManager start > -- > > Key: YARN-1313 > URL: https://issues.apache.org/jira/browse/YARN-1313 > Project: Hadoop YARN > Issue Type: Bug >Reporter: shenhong > > At present, userlog should be delete at NodeManager start, -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1313) Userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-1313: --- Summary: Userlog hadn't delete at NodeManager start (was: userlog hadn't delete at NodeManager start) > Userlog hadn't delete at NodeManager start > -- > > Key: YARN-1313 > URL: https://issues.apache.org/jira/browse/YARN-1313 > Project: Hadoop YARN > Issue Type: Bug >Reporter: shenhong > > At present, userlog should be delete at NodeManager start, -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1313) userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-1313: --- Summary: userlog hadn't delete at NodeManager start (was: userlog not delete at NodeManager start) > userlog hadn't delete at NodeManager start > -- > > Key: YARN-1313 > URL: https://issues.apache.org/jira/browse/YARN-1313 > Project: Hadoop YARN > Issue Type: Bug >Reporter: shenhong > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1313) userlog not delete at NodeManager start
shenhong created YARN-1313: -- Summary: userlog not delete at NodeManager start Key: YARN-1313 URL: https://issues.apache.org/jira/browse/YARN-1313 Project: Hadoop YARN Issue Type: Bug Reporter: shenhong -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-149: -- Assignee: (was: shenhong) > ResourceManager (RM) High-Availability (HA) > --- > > Key: YARN-149 > URL: https://issues.apache.org/jira/browse/YARN-149 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Harsh J > Attachments: rm-ha-phase1-approach-draft1.pdf, > rm-ha-phase1-draft2.pdf, YARN ResourceManager Automatic > Failover-rev-07-21-13.pdf, YARN ResourceManager Automatic > Failover-rev-08-04-13.pdf > > > This jira tracks work needed to be done to support one RM instance failing > over to another RM instance so that we can have RM HA. Work includes leader > election, transfer of control to leader and client re-direction to new leader. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong reassigned YARN-149: - Assignee: shenhong (was: Bikas Saha) > ResourceManager (RM) High-Availability (HA) > --- > > Key: YARN-149 > URL: https://issues.apache.org/jira/browse/YARN-149 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Harsh J >Assignee: shenhong > Attachments: rm-ha-phase1-approach-draft1.pdf, > rm-ha-phase1-draft2.pdf, YARN ResourceManager Automatic > Failover-rev-07-21-13.pdf, YARN ResourceManager Automatic > Failover-rev-08-04-13.pdf > > > This jira tracks work needed to be done to support one RM instance failing > over to another RM instance so that we can have RM HA. Work includes leader > election, transfer of control to leader and client re-direction to new leader. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-435) Make it easier to access cluster topology information in an AM
[ https://issues.apache.org/jira/browse/YARN-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738647#comment-13738647 ] shenhong commented on YARN-435: --- First, if the AM fetches all nodes in the cluster, including their rack information, by calling the RM, it will put pressure on the RM's network; consider, for example, a cluster with more than 5000 datanodes. Second, if the YARN cluster only has 100 nodemanagers but the HDFS cluster it accesses has more than 5000 datanodes, the RM cannot report all of those nodes and their rack information. However, the AM needs the rack of every datanode referenced in its job.splitmetainfo file in order to init TaskAttempts, so in this case it cannot get all the nodes it needs by calling the RM. > Make it easier to access cluster topology information in an AM > -- > > Key: YARN-435 > URL: https://issues.apache.org/jira/browse/YARN-435 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Hitesh Shah >Assignee: Omkar Vinit Joshi > > ClientRMProtocol exposes a getClusterNodes api that provides a report on all > nodes in the cluster including their rack information. > However, this requires the AM to open and establish a separate connection to > the RM in addition to one for the AMRMProtocol. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
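The alternative argued for here is what the AM already does for hosts taken from the split metainfo: resolve racks locally through RackResolver (the same class that shows up in the YARN-1062 logs further down) rather than asking the RM for a full node report. A minimal sketch, assuming the AM's machine has the cluster topology mapping configured; the class name and the use of command-line arguments as hostnames are only for illustration.

{code}
// Sketch of local rack resolution, as an AM process can do it without an RM call.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.Node;
import org.apache.hadoop.yarn.util.RackResolver;

public class LocalRackLookupSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    RackResolver.init(conf);          // reads the net.topology.* settings
    for (String host : args) {        // e.g. hosts listed in job.splitmetainfo
      Node node = RackResolver.resolve(host);
      System.out.println(host + " -> " + node.getNetworkLocation());
    }
  }
}
{code}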
[jira] [Resolved] (YARN-1062) MRAppMaster take a long time to init taskAttempt
[ https://issues.apache.org/jira/browse/YARN-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong resolved YARN-1062. Resolution: Duplicate > MRAppMaster take a long time to init taskAttempt > > > Key: YARN-1062 > URL: https://issues.apache.org/jira/browse/YARN-1062 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Affects Versions: 0.23.6 >Reporter: shenhong > > In our cluster, MRAppMaster take a long time to init taskAttempt, the > following log last one minute, > 2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to > /r01f11 > 2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to > /r01f11 > 2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to > /r03b05 > 2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1373523419753_4543_m_00_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > 2013-07-17 11:28:06,415 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r03b02006.yh.aliyun.com to > /r03b02 > 2013-07-17 11:28:06,436 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02045.yh.aliyun.com to > /r02f02 > 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02034.yh.aliyun.com to > /r02f02 > 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1373523419753_4543_m_01_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > The reason is: resolved one host to rack almost take 25ms (We resolve the > host to rack by a python script). Our hdfs cluster is more than 4000 > datanodes, then a large input job will take a long time to init TaskAttempt. > Is there any good idea to solve this problem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1062) MRAppMaster take a long time to init taskAttempt
[ https://issues.apache.org/jira/browse/YARN-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738596#comment-13738596 ] shenhong commented on YARN-1062: Thanks Vinod Kumar Vavilapalli, I think YARN-435 is okay to me. > MRAppMaster take a long time to init taskAttempt > > > Key: YARN-1062 > URL: https://issues.apache.org/jira/browse/YARN-1062 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Affects Versions: 0.23.6 >Reporter: shenhong > > In our cluster, MRAppMaster take a long time to init taskAttempt, the > following log last one minute, > 2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to > /r01f11 > 2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to > /r01f11 > 2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to > /r03b05 > 2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1373523419753_4543_m_00_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > 2013-07-17 11:28:06,415 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r03b02006.yh.aliyun.com to > /r03b02 > 2013-07-17 11:28:06,436 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02045.yh.aliyun.com to > /r02f02 > 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02034.yh.aliyun.com to > /r02f02 > 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1373523419753_4543_m_01_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > The reason is: resolved one host to rack almost take 25ms (We resolve the > host to rack by a python script). Our hdfs cluster is more than 4000 > datanodes, then a large input job will take a long time to init TaskAttempt. > Is there any good idea to solve this problem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1062) MRAppMaster take a long time to init taskAttempt
[ https://issues.apache.org/jira/browse/YARN-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-1062: --- Description: In our cluster, MRAppMaster take a long time to init taskAttempt, the following log last one minute, 2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to /r01f11 2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to /r01f11 2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to /r03b05 2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1373523419753_4543_m_00_0 TaskAttempt Transitioned from NEW to UNASSIGNED 2013-07-17 11:28:06,415 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r03b02006.yh.aliyun.com to /r03b02 2013-07-17 11:28:06,436 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02045.yh.aliyun.com to /r02f02 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02034.yh.aliyun.com to /r02f02 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1373523419753_4543_m_01_0 TaskAttempt Transitioned from NEW to UNASSIGNED The reason is: resolved one host to rack almost take 25ms (We resolve the host to rack by a python script). Our hdfs cluster is more than 4000 datanodes, then a large input job will take a long time to init TaskAttempt. Is there any good idea to solve this problem. was: In our cluster, MRAppMaster take a long time to init taskAttempt, the following log last one minute, 2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to /r01f11 2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to /r01f11 2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to /r03b05 2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1373523419753_4543_m_00_0 TaskAttempt Transitioned from NEW to UNASSIGNED The reason is: resolved one host to rack almost take 25ms, our hdfs cluster is more than 4000 datanodes, then a large input job will take a long time to init TaskAttempt. Is there any good idea to solve this problem. 
> MRAppMaster take a long time to init taskAttempt > > > Key: YARN-1062 > URL: https://issues.apache.org/jira/browse/YARN-1062 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Affects Versions: 0.23.6 >Reporter: shenhong > > In our cluster, MRAppMaster take a long time to init taskAttempt, the > following log last one minute, > 2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to > /r01f11 > 2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to > /r01f11 > 2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to > /r03b05 > 2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1373523419753_4543_m_00_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > 2013-07-17 11:28:06,415 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r03b02006.yh.aliyun.com to > /r03b02 > 2013-07-17 11:28:06,436 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02045.yh.aliyun.com to > /r02f02 > 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02034.yh.aliyun.com to > /r02f02 > 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1373523419753_4543_m_01_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > The reason is: resolved one host to rack almost take 25ms (We resolve the > host to rack by a python script). Our hdfs cluster is more than 4000 > datanodes, then a large input job will take a long time to init TaskAttempt. > Is there any good idea to solve this
[jira] [Created] (YARN-1062) MRAppMaster take a long time to init taskAttempt
shenhong created YARN-1062: -- Summary: MRAppMaster take a long time to init taskAttempt Key: YARN-1062 URL: https://issues.apache.org/jira/browse/YARN-1062 Project: Hadoop YARN Issue Type: Bug Components: applications Affects Versions: 0.23.6 Reporter: shenhong In our cluster, MRAppMaster take a long time to init taskAttempt, the following log last one minute, 2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to /r01f11 2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to /r01f11 2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to /r03b05 2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1373523419753_4543_m_00_0 TaskAttempt Transitioned from NEW to UNASSIGNED The reason is: resolved one host to rack almost take 25ms, our hdfs cluster is more than 4000 datanodes, then a large input job will take a long time to init TaskAttempt. Is there any good idea to solve this problem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
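The reported cost is one external script invocation (about 25 ms) per distinct host. One possible mitigation, offered purely as a configuration sketch and not as the resolution of this JIRA, is to switch the topology mapping from the script to the built-in file-based TableMapping, which loads the whole host-to-rack table once; the table file path below is a placeholder, and the file holds one "<hostname> <rack>" pair per line.

{code}
// Sketch: replace the per-host topology script with TableMapping so rack
// lookups become an in-memory table hit instead of a fork per host.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.DNSToSwitchMapping;
import org.apache.hadoop.net.TableMapping;

public class TopologyTableMappingSketch {
  public static Configuration withTableMapping() {
    Configuration conf = new Configuration();
    conf.setClass("net.topology.node.switch.mapping.impl",
        TableMapping.class, DNSToSwitchMapping.class);
    conf.set("net.topology.table.file.name",
        "/etc/hadoop/conf/topology.table");   // placeholder path
    return conf;
  }
}
{code}

Script-based mappings are cached as well, so the one-minute delay described above is dominated by the first resolution of each of the 4000+ distinct hosts rather than by repeated lookups of the same host.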
[jira] [Updated] (YARN-647) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Description: When yarn.log-aggregation-enable is seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable is seted to false. was: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. > historyServer can't show container's log when aggregation is not enabled > > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: yarn.log-aggregation-enable=false , HistoryServer will > show like this: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > Attachments: yarn-647.patch > > > When yarn.log-aggregation-enable is seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable is seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
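The request can be pictured as a simple fallback: when yarn.log-aggregation-enable is false, point the user at the NodeManager's own container-logs page instead of only printing "Aggregation is not enabled". The sketch below is not the attached yarn-647.patch; the method, the class name and the exact NM web path are assumptions used for illustration.

{code}
// Rough sketch of the fallback idea, not the attached patch.
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ContainerLogLinkSketch {
  /** Returns a direct NodeManager link when aggregation is off, otherwise null. */
  public static String nodeManagerLogLink(YarnConfiguration conf,
      String nmHttpAddress, String containerId, String user) {
    boolean aggregationOn = conf.getBoolean(
        YarnConfiguration.LOG_AGGREGATION_ENABLED,
        YarnConfiguration.DEFAULT_LOG_AGGREGATION_ENABLED);
    if (aggregationOn) {
      return null; // the history server can serve the aggregated logs itself
    }
    // Assumed NM web UI path for per-container logs, e.g. hd13-vm1:8042.
    return "http://" + nmHttpAddress + "/node/containerlogs/"
        + containerId + "/" + user;
  }
}
{code}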
[jira] [Updated] (YARN-647) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Attachment: yarn-647.patch add a patch > historyServer can't show container's log when aggregation is not enabled > > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: yarn.log-aggregation-enable=false , HistoryServer will > show like this: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > Attachments: yarn-647.patch > > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Summary: historyServer can't show container's log when aggregation is not enabled (was: historyServer can show container's log when aggregation is not enabled) > historyServer can't show container's log when aggregation is not enabled > > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: yarn.log-aggregation-enable=false , HistoryServer will > show like this: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Environment: yarn.log-aggregation-enable=false Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 was: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > historyServer can show container's log when aggregation is not enabled > -- > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: yarn.log-aggregation-enable=false > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Environment: yarn.log-aggregation-enable=false , HistoryServer will show like this: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 was: yarn.log-aggregation-enable=false Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > historyServer can show container's log when aggregation is not enabled > -- > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: yarn.log-aggregation-enable=false , HistoryServer will > show like this: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Description: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. was: Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. > historyServer can show container's log when aggregation is not enabled > -- > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: When yarn.log-aggregation-enable was seted to false, > after a MR_App complete, we can't view the container's log from the > HistoryServer, it shows message like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Environment: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 was: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. > historyServer can show container's log when aggregation is not enabled > -- > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: When yarn.log-aggregation-enable was seted to false, > after a MR_App complete, we can't view the container's log from the > HistoryServer, it shows message like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Description: Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. was: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. > historyServer can show container's log when aggregation is not enabled > -- > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: When yarn.log-aggregation-enable was seted to false, > after a MR_App complete, we can't view the container's log from the > HistoryServer, it shows message like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Description: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. > historyServer can show container's log when aggregation is not enabled > -- > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: When yarn.log-aggregation-enable was seted to false, > after a MR_App complete, we can't view the container's log from the > HistoryServer, it shows message like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. >Reporter: shenhong > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-647) historyServer can show container's log when aggregation is not enabled
shenhong created YARN-647: - Summary: historyServer can show container's log when aggregation is not enabled Key: YARN-647 URL: https://issues.apache.org/jira/browse/YARN-647 Project: Hadoop YARN Issue Type: Improvement Components: documentation Affects Versions: 2.0.4-alpha, 0.23.7 Environment: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. Reporter: shenhong -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-622) Many deprecated warn messages in hadoop 2.0 when running sleepJob
[ https://issues.apache.org/jira/browse/YARN-622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643881#comment-13643881 ] shenhong commented on YARN-622: --- When set a property, if the property has deprecated name, it will also set the deprecated name. {{{ String[] altNames = getAlternateNames(name); if (altNames != null && altNames.length > 0) { String altSource = "because " + name + " is deprecated"; for(String altName : altNames) { if(!altName.equals(name)) { getOverlay().setProperty(altName, value); getProps().setProperty(altName, value); updatingResource.put(altName, new String[] {altSource}); } } } }}} > Many deprecated warn messages in hadoop 2.0 when running sleepJob > - > > Key: YARN-622 > URL: https://issues.apache.org/jira/browse/YARN-622 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.4-alpha > Environment: Run a sleep job in hadoop-2.0.4-alpha >Reporter: shenhong > > hadoop jar > share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.0.4-alpha.jar s > leep -m 1 > 13/04/28 10:16:46 INFO util.Shell: setsid exited with exit code 0 > 13/04/28 10:16:46 WARN conf.Configuration: session.id is deprecated. Instead, > use dfs.metrics.session-id > 13/04/28 10:16:46 INFO jvm.JvmMetrics: Initializing JVM Metrics with > processName=JobTracker, sessionId= > 13/04/28 10:16:46 INFO mapreduce.JobSubmitter: number of splits:1 > 13/04/28 10:16:46 WARN conf.Configuration: mapred.jar is deprecated. Instead, > use mapreduce.job.jar > 13/04/28 10:16:46 WARN conf.Configuration: > mapred.map.tasks.speculative.execution is deprecated. Instead, use > mapreduce.map.speculative > 13/04/28 10:16:46 WARN conf.Configuration: mapred.reduce.tasks is deprecated. > Instead, use mapreduce.job.reduces > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.partitioner.class is > deprecated. Instead, use mapreduce.job.partitioner.class > 13/04/28 10:16:46 WARN conf.Configuration: > mapred.reduce.tasks.speculative.execution is deprecated. Instead, use > mapreduce.reduce.speculative > 13/04/28 10:16:46 WARN conf.Configuration: mapred.mapoutput.value.class is > deprecated. Instead, use mapreduce.map.output.value.class > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.map.class is deprecated. > Instead, use mapreduce.job.map.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.job.name is deprecated. > Instead, use mapreduce.job.name > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.reduce.class is > deprecated. Instead, use mapreduce.job.reduce.class > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.inputformat.class is > deprecated. Instead, use mapreduce.job.inputformat.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.input.dir is deprecated. > Instead, use mapreduce.input.fileinputformat.inputdir > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.outputformat.class is > deprecated. Instead, use mapreduce.job.outputformat.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.map.tasks is deprecated. > Instead, use mapreduce.job.maps > 13/04/28 10:16:46 WARN conf.Configuration: mapred.mapoutput.key.class is > deprecated. Instead, use mapreduce.map.output.key.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.working.dir is deprecated. > Instead, use mapreduce.job.working.dir -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-622) Many deprecated warn messages in hadoop 2.0 when running sleepJob
[ https://issues.apache.org/jira/browse/YARN-622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643880#comment-13643880 ] shenhong commented on YARN-622: --- My mapred-site.xml is empty. I think the reason is this method in org.apache.hadoop.conf.Configuration:
{code}
public void set(String name, String value, String source) {
  if (deprecatedKeyMap.isEmpty()) {
    getProps();
  }
  getOverlay().setProperty(name, value);
  getProps().setProperty(name, value);
  if(source == null) {
    updatingResource.put(name, new String[] {"programatically"});
  } else {
    updatingResource.put(name, new String[] {source});
  }
  String[] altNames = getAlternateNames(name);
  if (altNames != null && altNames.length > 0) {
    String altSource = "because " + name + " is deprecated";
    for(String altName : altNames) {
      if(!altName.equals(name)) {
        getOverlay().setProperty(altName, value);
        getProps().setProperty(altName, value);
        updatingResource.put(altName, new String[] {altSource});
      }
    }
  }
  warnOnceIfDeprecated(name);
}
{code}
> Many deprecated warn messages in hadoop 2.0 when running sleepJob > - > > Key: YARN-622 > URL: https://issues.apache.org/jira/browse/YARN-622 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.4-alpha > Environment: Run a sleep job in hadoop-2.0.4-alpha >Reporter: shenhong > > hadoop jar > share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.0.4-alpha.jar s > leep -m 1 > 13/04/28 10:16:46 INFO util.Shell: setsid exited with exit code 0 > 13/04/28 10:16:46 WARN conf.Configuration: session.id is deprecated. Instead, > use dfs.metrics.session-id > 13/04/28 10:16:46 INFO jvm.JvmMetrics: Initializing JVM Metrics with > processName=JobTracker, sessionId= > 13/04/28 10:16:46 INFO mapreduce.JobSubmitter: number of splits:1 > 13/04/28 10:16:46 WARN conf.Configuration: mapred.jar is deprecated. Instead, > use mapreduce.job.jar > 13/04/28 10:16:46 WARN conf.Configuration: > mapred.map.tasks.speculative.execution is deprecated. Instead, use > mapreduce.map.speculative > 13/04/28 10:16:46 WARN conf.Configuration: mapred.reduce.tasks is deprecated. > Instead, use mapreduce.job.reduces > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.partitioner.class is > deprecated. Instead, use mapreduce.job.partitioner.class > 13/04/28 10:16:46 WARN conf.Configuration: > mapred.reduce.tasks.speculative.execution is deprecated. Instead, use > mapreduce.reduce.speculative > 13/04/28 10:16:46 WARN conf.Configuration: mapred.mapoutput.value.class is > deprecated. Instead, use mapreduce.map.output.value.class > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.map.class is deprecated. > Instead, use mapreduce.job.map.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.job.name is deprecated. > Instead, use mapreduce.job.name > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.reduce.class is > deprecated. Instead, use mapreduce.job.reduce.class > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.inputformat.class is > deprecated. Instead, use mapreduce.job.inputformat.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.input.dir is deprecated. > Instead, use mapreduce.input.fileinputformat.inputdir > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.outputformat.class is > deprecated. Instead, use mapreduce.job.outputformat.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.map.tasks is deprecated. > Instead, use mapreduce.job.maps > 13/04/28 10:16:46 WARN conf.Configuration: mapred.mapoutput.key.class is > deprecated. Instead, use mapreduce.map.output.key.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.working.dir is deprecated. > Instead, use mapreduce.job.working.dir -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
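The behaviour the pasted set() method produces can be reproduced in a few lines (a standalone illustration, not taken from the sleep job): writing a value under either member of a deprecated/new key pair stores it under both names, and touching the deprecated name is what emits the WARN line, even with an empty mapred-site.xml.

{code}
// Standalone illustration of the deprecation handling described above.
import org.apache.hadoop.conf.Configuration;

public class DeprecationWarnSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Logs: "mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces"
    conf.set("mapred.reduce.tasks", "4");
    // Both spellings now resolve to the same value.
    System.out.println(conf.get("mapreduce.job.reduces")); // 4
    System.out.println(conf.get("mapred.reduce.tasks"));   // 4
  }
}
{code}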
[jira] [Created] (YARN-622) Many deprecated warn messages in hadoop 2.0 when running sleepJob
shenhong created YARN-622: - Summary: Many deprecated warn messages in hadoop 2.0 when running sleepJob Key: YARN-622 URL: https://issues.apache.org/jira/browse/YARN-622 Project: Hadoop YARN Issue Type: Bug Components: documentation Affects Versions: 2.0.4-alpha Environment: Run a sleep job in hadoop-2.0.4-alpha Reporter: shenhong hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.0.4-alpha.jar s leep -m 1 13/04/28 10:16:46 INFO util.Shell: setsid exited with exit code 0 13/04/28 10:16:46 WARN conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id 13/04/28 10:16:46 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 13/04/28 10:16:46 INFO mapreduce.JobSubmitter: number of splits:1 13/04/28 10:16:46 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar 13/04/28 10:16:46 WARN conf.Configuration: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative 13/04/28 10:16:46 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.partitioner.class is deprecated. Instead, use mapreduce.job.partitioner.class 13/04/28 10:16:46 WARN conf.Configuration: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative 13/04/28 10:16:46 WARN conf.Configuration: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class 13/04/28 10:16:46 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class 13/04/28 10:16:46 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class 13/04/28 10:16:46 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 13/04/28 10:16:46 WARN conf.Configuration: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class 13/04/28 10:16:46 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561390#comment-13561390 ] shenhong commented on YARN-326: --- Hi, Sandy, is there any plans, or when you plan to solve this problem. > Add multi-resource scheduling to the fair scheduler > --- > > Key: YARN-326 > URL: https://issues.apache.org/jira/browse/YARN-326 > Project: Hadoop YARN > Issue Type: New Feature > Components: scheduler >Affects Versions: 2.0.2-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > > With YARN-2 in, the capacity scheduler has the ability to schedule based on > multiple resources, using dominant resource fairness. The fair scheduler > should be able to do multiple resource scheduling as well, also using > dominant resource fairness. > More details to come on how the corner cases with fair scheduler configs such > as min and max resources will be handled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560312#comment-13560312 ] shenhong commented on YARN-319: --- > Would it be possible to use a synchronous event handler in the tests so that > we don't have to poll? I don't know how to do that. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong >Assignee: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319-1.patch, YARN-319-2.patch, YARN-319-3.patch, > YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-319: -- Attachment: YARN-319-3.patch fix indentation of annotations, rename app_id to appId , rename att_id to attId. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong >Assignee: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319-1.patch, YARN-319-2.patch, YARN-319-3.patch, > YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-319: -- Attachment: YARN-319-2.patch fix > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong >Assignee: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319-1.patch, YARN-319-2.patch, YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong reassigned YARN-319: - Assignee: shenhong > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong >Assignee: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319-1.patch, YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557932#comment-13557932 ] shenhong commented on YARN-319: --- Thanks for you help, Sandy Ryza. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319-1.patch, YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-319: -- Attachment: YARN-319-1.patch add a testcast > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319-1.patch, YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557251#comment-13557251 ] shenhong commented on YARN-319: --- I don't know the commond to run a test like TestFairscheduler, can anybody tell me how. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556929#comment-13556929 ] shenhong commented on YARN-319: --- Sorry, will add a test case later today. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552144#comment-13552144 ] shenhong commented on YARN-319: --- Here is the log of ResourceManager: 2013-01-13 13:18:26,922 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: User yuling.sh cannot submit applications to queue root.cug-dev-tbdp 2013-01-13 13:18:26,924 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1357617565562_0696_01 State change from SUBMITTED to FAILED 2013-01-13 13:18:26,924 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1357617565562_0696 State change from SUBMITTED to FAILED 2013-01-13 13:18:26,924 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yuling.sh OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=User yuling.sh cannot submit applications to queue root.cug-dev-tbdp APPID=application_1357617565562_0696 2013-01-13 13:18:26,924 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1357617565562_0696,name=Sleep job,user=yuling.sh,queue=cug-dev-tbdp,state=FAILED,trackingUrl=hdpdevrm:50030/proxy/application_1357617565562_0696/,appMasterHost=N/A,startTime=1358054306921,finishTime=1358054306924 > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552143#comment-13552143 ] shenhong commented on YARN-319: --- Here is the log of yarn client: 13/01/13 13:18:26 ERROR security.UserGroupInformation: PriviledgedActionException as:yuling.sh cause:java.io.IOException: Failed to run job : User yuling.sh cannot submit applications to queue root.cug-dev-tbdp java.io.IOException: Failed to run job : User yuling.sh cannot submit applications to queue root.cug-dev-tbdp at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:301) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:391) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1266) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1236) at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:262) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69) at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72) at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144) at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:112) at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:120) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:208) > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552142#comment-13552142 ] shenhong commented on YARN-319: --- Of course, our version already includes this patch. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-319: -- Attachment: YARN-319.patch > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
shenhong created YARN-319: - Summary: Submit a job to a queue that not allowed in fairScheduler, client will hold forever. Key: YARN-319 URL: https://issues.apache.org/jira/browse/YARN-319 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.0.2-alpha Reporter: shenhong Fix For: 2.0.3-alpha RM use fairScheduler, when client submit a job to a queue, but the queue do not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13545289#comment-13545289 ] shenhong commented on YARN-319: --- The reason is in FairScheduler#addApplication: if the user cannot submit a job to the queue, it returns directly; we should instead create an RMAppAttemptRejectedEvent and handle it.
Original:
{code}
// Enforce ACLs
UserGroupInformation userUgi = UserGroupInformation.createRemoteUser(user);
if (!queue.hasAccess(QueueACL.SUBMIT_APPLICATIONS, userUgi)) {
  LOG.info("User " + userUgi.getUserName()
      + " cannot submit applications to queue " + queue.getName());
  return;
}
{code}
After modification:
{code}
// Enforce ACLs
UserGroupInformation userUgi = UserGroupInformation.createRemoteUser(user);
if (!queue.hasAccess(QueueACL.SUBMIT_APPLICATIONS, userUgi)) {
  String msg = "User " + userUgi.getUserName()
      + " cannot submit applications to queue " + queue.getName();
  LOG.info(msg);
  rmContext.getDispatcher().getEventHandler().handle(
      new RMAppAttemptRejectedEvent(applicationAttemptId, msg));
  return;
}
{code}
I will create a patch to fix it. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
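To make the hang concrete, here is a minimal, self-contained Java sketch of the pattern (plain JDK classes only; QueueAclRejectionSketch, AppState, and the boolean ACL flag are invented for illustration and are not the actual YARN types): the submitting client blocks waiting for a terminal application state, and unless the scheduler publishes a rejection event, no such state ever arrives.
{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class QueueAclRejectionSketch {

  enum AppState { SUBMITTED, FAILED }

  // Stand-in for the RM dispatcher: the "client" below blocks until a state
  // event for its application shows up here.
  static final BlockingQueue<AppState> events = new LinkedBlockingQueue<>();

  // Stand-in for FairScheduler#addApplication; the boolean replaces the real
  // queue.hasAccess(QueueACL.SUBMIT_APPLICATIONS, userUgi) check.
  static void addApplication(String user, boolean userHasSubmitAcl) {
    if (!userHasSubmitAcl) {
      String msg = "User " + user + " cannot submit applications to this queue";
      System.out.println(msg);
      // Before the fix this method just returned, so no terminal state was
      // ever published and the client below would wait forever.
      events.add(AppState.FAILED);   // the fix: publish a rejection/failure
      return;
    }
    events.add(AppState.SUBMITTED);
  }

  public static void main(String[] args) throws InterruptedException {
    addApplication("yuling.sh", false);
    // Mimics the submitting client polling for a terminal application state.
    AppState state = events.poll(5, TimeUnit.SECONDS);
    System.out.println("Client observed state: " + state);
  }
}
{code}
With the rejection published, the waiting side sees FAILED immediately instead of timing out; without it, the poll (or, in the real client, an unbounded wait) never completes.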
[jira] [Updated] (YARN-301) Fair scheduler throws ConcurrentModificationException when iterating over app's priorities
[ https://issues.apache.org/jira/browse/YARN-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-301: -- Attachment: YARN-301.patch Good idea! Added a new patch. > Fair scheduler throws ConcurrentModificationException when iterating over > app's priorities > -- > > Key: YARN-301 > URL: https://issues.apache.org/jira/browse/YARN-301 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-301.patch, YARN-301.patch > > > In my test cluster, fairscheduler appear to concurrentModificationException > and RM crash, here is the message: > 2012-12-30 17:14:17,171 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in > handling event type NODE_UPDATE to the scheduler > java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.assignContainer(AppSchedulable.java:297) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:181) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:780) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:842) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:98) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:340) > at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-301) Fairscheduler appear to concurrentModificationException and RM crash
[ https://issues.apache.org/jira/browse/YARN-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-301: -- Attachment: YARN-301.patch Added a patch to solve this problem. > Fairscheduler appear to concurrentModificationException and RM crash > > > Key: YARN-301 > URL: https://issues.apache.org/jira/browse/YARN-301 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-301.patch > > > In my test cluster, fairscheduler appear to concurrentModificationException > and RM crash, here is the message: > 2012-12-30 17:14:17,171 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in > handling event type NODE_UPDATE to the scheduler > java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.assignContainer(AppSchedulable.java:297) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:181) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:780) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:842) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:98) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:340) > at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-301) Fairscheduler appear to concurrentModificationException and RM crash
[ https://issues.apache.org/jira/browse/YARN-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541592#comment-13541592 ] shenhong commented on YARN-301: --- The reason is that when the SchedulerEventDispatcher thread runs assignContainer, it gets the priorities from AppSchedulingInfo. In AppSchedulingInfo the code is:
{code}
synchronized public Collection getPriorities() {
  return priorities;
}
{code}
but this only hands out the reference to priorities, and AppSchedulable#assignContainer then traverses it:
{code}
// (not scheduled) in order to promote better locality.
for (Priority priority : app.getPriorities()) {
  app.addSchedulingOpportunity(priority);
  ...
{code}
On the other hand, when the RM processes a request from the AM and updates the priorities in AppSchedulingInfo#updateResourceRequests:
{code}
if (asks == null) {
  asks = new HashMap();
  this.requests.put(priority, asks);
  this.priorities.add(priority);
} else if (updatePendingResources) {
{code}
this results in a ConcurrentModificationException. > Fairscheduler appear to concurrentModificationException and RM crash > > > Key: YARN-301 > URL: https://issues.apache.org/jira/browse/YARN-301 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Fix For: 2.0.3-alpha > > > In my test cluster, fairscheduler appear to concurrentModificationException > and RM crash, here is the message: > 2012-12-30 17:14:17,171 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in > handling event type NODE_UPDATE to the scheduler > java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.assignContainer(AppSchedulable.java:297) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:181) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:780) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:842) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:98) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:340) > at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
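The race is easy to reproduce outside YARN. The following self-contained sketch (plain java.util collections; the class and method names are invented for illustration) has one thread iterate the set returned by a synchronized getter while another thread keeps adding to it; because the iteration itself runs outside the lock, it typically fails with ConcurrentModificationException. The commented-out line shows one possible mitigation, returning a snapshot copy; this is not necessarily what the attached patch does.
{code}
import java.util.Set;
import java.util.TreeSet;

public class PrioritiesRaceSketch {
  private static final Set<Integer> priorities = new TreeSet<>();

  // Mirrors the synchronized getter quoted above: the lock is released as soon
  // as the live reference is handed out, so iteration happens unprotected.
  static synchronized Set<Integer> getPriorities() {
    return priorities;
    // return new TreeSet<>(priorities);  // possible mitigation: snapshot copy
  }

  static synchronized void addPriority(int p) {
    priorities.add(p);
  }

  public static void main(String[] args) throws InterruptedException {
    for (int i = 0; i < 100; i++) {
      addPriority(i);
    }
    // Stands in for the RM thread running updateResourceRequests().
    Thread updater = new Thread(() -> {
      for (int i = 100; i < 1_000_000; i++) {
        addPriority(i);
      }
    });
    updater.start();
    try {
      int seen = 0;
      // Stands in for AppSchedulable#assignContainer iterating the priorities.
      for (int p : getPriorities()) {
        seen += p;
        Thread.sleep(1);
      }
      System.out.println("No exception this run (the race is timing-dependent), sum=" + seen);
    } catch (java.util.ConcurrentModificationException e) {
      System.out.println("Reproduced the failure: " + e);
    }
    updater.join();
  }
}
{code}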
[jira] [Created] (YARN-301) Fairscheduler appear to concurrentModificationException and RM crash
shenhong created YARN-301: - Summary: Fairscheduler appear to concurrentModificationException and RM crash Key: YARN-301 URL: https://issues.apache.org/jira/browse/YARN-301 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Reporter: shenhong Fix For: 2.0.3-alpha In my test cluster, fairscheduler appear to concurrentModificationException and RM crash, here is the message: 2012-12-30 17:14:17,171 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler java.util.ConcurrentModificationException at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100) at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.assignContainer(AppSchedulable.java:297) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:181) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:780) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:842) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:98) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:340) at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=-1, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-300: -- Description: After yarn-271, when yarn.scheduler.fair.max.assign<=0, when a node was been reserved, fairScheduler will infinite loop and not schedule any application. (was: After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been reserved, fairScheduler will infinite loop and not schedule any application.) > After yarn-271, when yarn.scheduler.fair.max.assign=-1, fairscheduler will > infinite loop and not schedule any application. > --- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Attachments: YARN-300.patch > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign<=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign<=0, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-300: -- Summary: After yarn-271, when yarn.scheduler.fair.max.assign<=0, fairscheduler will infinite loop and not schedule any application. (was: After yarn-271, when yarn.scheduler.fair.max.assign=-1, fairscheduler will infinite loop and not schedule any application.) > After yarn-271, when yarn.scheduler.fair.max.assign<=0, fairscheduler will > infinite loop and not schedule any application. > --- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Attachments: YARN-300.patch > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign<=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541590#comment-13541590 ] shenhong commented on YARN-300: --- At FairScheduler#nodeUpdate() the code is: if ((assignedContainers >= maxAssign) && (maxAssign > 0)) { break; } so when maxAssign <= 0 the loop never terminates, and the default maxAssign is -1. > After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will > infinite loop and not schedule any application. > -- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Attachments: YARN-300.patch > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=-1, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-300: -- Summary: After yarn-271, when yarn.scheduler.fair.max.assign=-1, fairscheduler will infinite loop and not schedule any application. (was: After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.) > After yarn-271, when yarn.scheduler.fair.max.assign=-1, fairscheduler will > infinite loop and not schedule any application. > --- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Attachments: YARN-300.patch > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541352#comment-13541352 ] shenhong commented on YARN-300: --- The way to solve it is: when a node has been reserved, break out of the loop. > After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will > infinite loop and not schedule any application. > -- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Attachments: YARN-300.patch > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
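For illustration, here is a simplified, self-contained sketch of the loop-control logic (NodeUpdateLoopSketch and its Node interface are stand-ins, not the real FairScheduler#nodeUpdate() code): with maxAssign <= 0 the original guard can never fire, and breaking once the node is reserved, or once an iteration assigns nothing, lets the loop terminate.
{code}
public class NodeUpdateLoopSketch {

  // Stand-in for the scheduler's view of a node; not a real YARN interface.
  interface Node {
    int assignContainers();   // how many containers were placed this round
    boolean isReserved();     // true once the node has been reserved
  }

  static int nodeUpdate(Node node, int maxAssign) {
    int assignedContainers = 0;
    while (true) {
      int assigned = node.assignContainers();
      assignedContainers += assigned;

      // Suggested fix from this issue: once the node is reserved, no further
      // assignment can happen on it, so stop. (The assigned == 0 check is an
      // extra guard added only in this sketch.)
      if (assigned == 0 || node.isReserved()) {
        break;
      }
      // Original guard quoted in the comments above: with maxAssign <= 0
      // (default -1, meaning "unlimited"), this condition never fires.
      if ((assignedContainers >= maxAssign) && (maxAssign > 0)) {
        break;
      }
    }
    return assignedContainers;
  }

  public static void main(String[] args) {
    // A fake node that reserves itself after handing out three containers.
    Node node = new Node() {
      private int handedOut = 0;
      public int assignContainers() { return handedOut++ < 3 ? 1 : 0; }
      public boolean isReserved() { return handedOut >= 3; }
    };
    // With the early break, maxAssign = -1 (the default) no longer spins forever.
    System.out.println("assigned = " + nodeUpdate(node, -1));
  }
}
{code}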
[jira] [Updated] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-300: -- Attachment: YARN-300.patch > After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will > infinite loop and not schedule any application. > -- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Attachments: YARN-300.patch > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-300: -- Attachment: (was: YARN-300) > After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will > infinite loop and not schedule any application. > -- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-300: -- Attachment: YARN-300 > After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will > infinite loop and not schedule any application. > -- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.
shenhong created YARN-300: - Summary: After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application. Key: YARN-300 URL: https://issues.apache.org/jira/browse/YARN-300 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Reporter: shenhong Fix For: 2.0.3-alpha After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-271) Fair scheduler hits IllegalStateException trying to reserve different apps on same node
[ https://issues.apache.org/jira/browse/YARN-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541336#comment-13541336 ] shenhong commented on YARN-271: --- I found this patch will result in an infinite loop when maxAssign=0. I will create a JIRA to solve this bug. > Fair scheduler hits IllegalStateException trying to reserve different apps on > same node > --- > > Key: YARN-271 > URL: https://issues.apache.org/jira/browse/YARN-271 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Fix For: 2.0.3-alpha > > Attachments: YARN-271-1.patch, YARN-271.patch > > > After the fair scheduler reserves a container on a node, it doesn't check for > reservations it just made when trying to make more reservations during the > same heartbeat. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira