[jira] [Commented] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922006#comment-13922006 ]

shenhong commented on YARN-1786:
--------------------------------

Thanks, Jian He. The code I used is from http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.3.0/, but trunk also has this bug.
[jira] [Commented] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920624#comment-13920624 ]

shenhong commented on YARN-1786:
--------------------------------

{code}
  @Test
  public void testAppAcceptedKill() throws IOException, InterruptedException {
    LOG.info("--- START: testAppAcceptedKill ---");
    RMApp application = testCreateAppAccepted(null);
    // ACCEPTED => KILLED event RMAppEventType.KILL
    RMAppEvent event =
        new RMAppEvent(application.getApplicationId(), RMAppEventType.KILL);
    application.handle(event);
    rmDispatcher.await();
    assertAppAndAttemptKilled(application);
  }
{code}

assertAppAndAttemptKilled() then calls sendAttemptUpdateSavedEvent():

{code}
  private void assertAppAndAttemptKilled(RMApp application)
      throws InterruptedException {
    sendAttemptUpdateSavedEvent(application);
    sendAppUpdateSavedEvent(application);
    assertKilled(application);
    Assert.assertEquals(RMAppAttemptState.KILLED,
        application.getCurrentAppAttempt().getAppAttemptState());
    assertAppFinalStateSaved(application);
  }
{code}
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Attachment: YARN-1786.patch
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Attachment: (was: YARN-1786.patch)
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Description:
TestRMAppTransitions often fail with "application finish time is not greater then 0", following is log:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}

  was:
TestRMAppTransitions often fail with "application finish time is not greater then 0", following is log:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Attachment: YARN-1786.patch

Added a patch to fix the bug.
[jira] [Assigned] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong reassigned YARN-1786:
------------------------------

    Assignee: shenhong
[jira] [Commented] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920410#comment-13920410 ]

shenhong commented on YARN-1786:
--------------------------------

Here is the code:

{code}
  private void sendAppUpdateSavedEvent(RMApp application) {
    RMAppEvent event =
        new RMAppUpdateSavedEvent(application.getApplicationId(), null);
    application.handle(event);
    rmDispatcher.await();
  }

  private void sendAttemptUpdateSavedEvent(RMApp application) {
    application.getCurrentAppAttempt().handle(
        new RMAppAttemptUpdateSavedEvent(application.getCurrentAppAttempt()
            .getAppAttemptId(), null));
  }
{code}

In sendAttemptUpdateSavedEvent() there is no rmDispatcher.await() after the event is handled. It should be changed to:

{code}
  private void sendAttemptUpdateSavedEvent(RMApp application) {
    application.getCurrentAppAttempt().handle(
        new RMAppAttemptUpdateSavedEvent(application.getCurrentAppAttempt()
            .getAppAttemptId(), null));
    rmDispatcher.await();
  }
{code}
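As background for why the missing await matters, here is a minimal, self-contained sketch in plain java.util.concurrent (not the YARN DrainDispatcher; every name below is illustrative): asserting immediately after handing work to an asynchronous dispatcher races with the transition, while draining the queue first, which is the role rmDispatcher.await() plays above, makes the assertion deterministic.

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncHandleRace {
    // Stand-in for the finish time that the killed-state transition sets.
    static volatile long finishTime = 0;

    public static void main(String[] args) throws InterruptedException {
        ExecutorService dispatcher = Executors.newSingleThreadExecutor();

        // Analogue of handle(event): the transition runs later, on the dispatcher thread.
        dispatcher.submit(() -> { finishTime = System.currentTimeMillis(); });

        // Asserting here is a coin flip: the transition may not have run yet.
        System.out.println("before draining: finishTime > 0 is " + (finishTime > 0));

        // Analogue of rmDispatcher.await(): drain the queued work before asserting.
        dispatcher.shutdown();
        dispatcher.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("after draining:  finishTime > 0 is " + (finishTime > 0)); // reliably true
    }
}
{code}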
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Description:
TestRMAppTransitions often fail with "application finish time is not greater then 0", following is log:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}

  was:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}
[jira] [Created] (YARN-1786) TestRMAppTransitions occasionally fail
shenhong created YARN-1786:
------------------------------

             Summary: TestRMAppTransitions occasionally fail
                 Key: YARN-1786
                 URL: https://issues.apache.org/jira/browse/YARN-1786
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
    Affects Versions: 2.3.0
            Reporter: shenhong
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Description:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}

  was:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}
[jira] [Updated] (YARN-1786) TestRMAppTransitions occasionally fail
[ https://issues.apache.org/jira/browse/YARN-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1786:
---------------------------
    Description:
{code}
testAppAcceptedKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.04 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertAppAndAttemptKilled(TestRMAppTransitions.java:310)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppAcceptedKill(TestRMAppTransitions.java:624)

testAppRunningKill[0](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.033 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)

testAppRunningKill[1](org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions)  Time elapsed: 0.036 sec  <<< FAILURE!
junit.framework.AssertionFailedError: application finish time is not greater then 0
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertTimesAtFinish(TestRMAppTransitions.java:283)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.assertKilled(TestRMAppTransitions.java:298)
  at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppRunningKill(TestRMAppTransitions.java:646)
{code}
[jira] [Updated] (YARN-647) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-647:
--------------------------
    Attachment: yarn-647-2.patch

Added a new patch.

> historyServer can't show container's log when aggregation is not enabled
> -------------------------------------------------------------------------
>
>                 Key: YARN-647
>                 URL: https://issues.apache.org/jira/browse/YARN-647
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 0.23.7, 2.0.4-alpha, 2.2.0
>         Environment: yarn.log-aggregation-enable=false; the HistoryServer shows:
>                      Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669
>            Reporter: shenhong
>            Assignee: shenhong
>         Attachments: yarn-647-2.patch, yarn-647.patch
>
> When yarn.log-aggregation-enable is set to false, we can't view a container's logs from the
> HistoryServer after an MR app completes; it shows a message like:
> Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669
> We don't want to aggregate container logs, because that puts pressure on the namenode, but
> sometimes we still want to look at a container's logs. Should the HistoryServer show the
> container's logs even when yarn.log-aggregation-enable is set to false?
[jira] [Commented] (YARN-647) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857414#comment-13857414 ]

shenhong commented on YARN-647:
-------------------------------

Thanks, Zhijie! Like caolong, we also set yarn.nodemanager.log.retain-seconds=259200, so the NodeManager's local logs are not deleted after a container stops. I think that when yarn.log-aggregation-enable=false and yarn.nodemanager.log.retain-seconds > 0, we can change the logsLink.
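A rough sketch of the check this comment seems to be proposing: a hypothetical helper, not code from the HistoryServer or from the attached patch. Only the two property names are standard YARN configuration keys; the class name, method name and the default values used here are illustrative.

{code}
import org.apache.hadoop.conf.Configuration;

public class NmLogLinkPolicy {

    /**
     * True when it would be reasonable for the history server to point the
     * logs link at the NodeManager's local log page instead of reporting
     * "Aggregation is not enabled".
     */
    public static boolean canLinkToNodeManagerLogs(Configuration conf) {
        // Aggregation is off, so there is nothing to fetch from the aggregated location...
        boolean aggregationEnabled =
                conf.getBoolean("yarn.log-aggregation-enable", false);
        // ...but the NM still keeps local logs around for this many seconds.
        long retainSeconds =
                conf.getLong("yarn.nodemanager.log.retain-seconds", 10800L);
        return !aggregationEnabled && retainSeconds > 0;
    }
}
{code}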
[jira] [Created] (YARN-1537) TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
shenhong created YARN-1537:
------------------------------

             Summary: TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
                 Key: YARN-1537
                 URL: https://issues.apache.org/jira/browse/YARN-1537
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
    Affects Versions: 2.2.0
            Reporter: shenhong
[jira] [Updated] (YARN-1537) TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
[ https://issues.apache.org/jira/browse/YARN-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1537:
---------------------------
    Description:
Here is the error log:
{code}
Results :

Failed tests:
  TestLocalResourcesTrackerImpl.testLocalResourceCache:351
Wanted but not invoked:
eventHandler.handle(
    isA(org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerResourceLocalizedEvent)
);
-> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl.testLocalResourceCache(TestLocalResourcesTrackerImpl.java:351)

However, there were other interactions with this mock:
-> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
-> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
{code}
[jira] [Commented] (YARN-1472) TestCase failed at username with “.”
[ https://issues.apache.org/jira/browse/YARN-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840024#comment-13840024 ]

shenhong commented on YARN-1472:
--------------------------------

This causes the test case to fail for usernames containing '.'. It should be changed to:

{code}
String regex = CONF_HADOOP_PROXYUSER_RE+"*\\"+CONF_GROUPS;
{code}
[jira] [Commented] (YARN-1472) TestCase failed at username with “.”
[ https://issues.apache.org/jira/browse/YARN-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838669#comment-13838669 ]

shenhong commented on YARN-1472:
--------------------------------

The reason is in org.apache.hadoop.security.authorize.ProxyUsers:

{code}
    String regex = CONF_HADOOP_PROXYUSER_RE+"[^.]*\\"+CONF_GROUPS;
    Map<String, String> allMatchKeys = conf.getValByRegex(regex);
    for(Entry<String, String> entry : allMatchKeys.entrySet()) {
      proxyGroups.put(entry.getKey(),
          StringUtils.getStringCollection(entry.getValue()));
    }
{code}

The regex only matches usernames without a "." in them. Why?
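To make the effect concrete, here is a small standalone demonstration (not the actual ProxyUsers code) of how the [^.]* username portion fails to match the configuration key generated for a dotted username; the relaxed pattern is included only for comparison and is not the exact fix proposed above.

{code}
import java.util.regex.Pattern;

public class ProxyUserRegexDemo {
    public static void main(String[] args) {
        // The key generated for hadoop.proxyuser.<user>.groups when the
        // user name itself contains a dot.
        String key = "hadoop.proxyuser.yuling.sh.groups";

        // Expansion of CONF_HADOOP_PROXYUSER_RE + "[^.]*\\" + CONF_GROUPS:
        // the username portion may not contain '.'.
        Pattern current = Pattern.compile("hadoop\\.proxyuser\\.[^.]*\\.groups");

        // A looser pattern that also accepts dotted usernames (comparison only).
        Pattern relaxed = Pattern.compile("hadoop\\.proxyuser\\..*\\.groups");

        System.out.println(current.matcher(key).matches()); // false: the key is never picked up
        System.out.println(relaxed.matcher(key).matches()); // true
    }
}
{code}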
[jira] [Updated] (YARN-1472) TestCase failed at username with “.”
[ https://issues.apache.org/jira/browse/YARN-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1472:
---------------------------
    Description:
When I run a testcase by user yuling.sh :
{code}
mvn test -Dtest=TestFileSystemAccessService#createFileSystem
{code}
it will failed with exception :
{code}
org.apache.hadoop.ipc.RemoteException: User: yuling.sh is not allowed to impersonate u
  at org.apache.hadoop.ipc.Client.call(Client.java:1412)
  at org.apache.hadoop.ipc.Client.call(Client.java:1365)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
  at $Proxy17.mkdirs(Unknown Source)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  ...
{code}

  was:
When I run a testcase by user yuling.sh :
mvn test -Dtest=TestFileSystemAccessService#createFileSystem
it will failed with exception :
org.apache.hadoop.ipc.RemoteException: User: yuling.sh is not allowed to impersonate u
  at org.apache.hadoop.ipc.Client.call(Client.java:1412)
  at org.apache.hadoop.ipc.Client.call(Client.java:1365)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
  at $Proxy17.mkdirs(Unknown Source)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  ...
[jira] [Updated] (YARN-1472) TestCase failed at username with “.”
[ https://issues.apache.org/jira/browse/YARN-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1472:
---------------------------
    Description:
When I run a testcase by user yuling.sh :
mvn test -Dtest=TestFileSystemAccessService#createFileSystem
it will failed with exception :
org.apache.hadoop.ipc.RemoteException: User: yuling.sh is not allowed to impersonate u
  at org.apache.hadoop.ipc.Client.call(Client.java:1412)
  at org.apache.hadoop.ipc.Client.call(Client.java:1365)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
  at $Proxy17.mkdirs(Unknown Source)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  ...

  was:
When I run a testcase by user yuling.sh :
mvn test -Dtest=TestFileSystemAccessService#createFileSystem
it will failed with exception :
Running org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 5.236 sec <<< FAILURE! - in org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService
createFileSystem(org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService)  Time elapsed: 5.14 sec  <<< ERROR!
org.apache.hadoop.ipc.RemoteException: User: yuling.sh is not allowed to impersonate u
  at org.apache.hadoop.ipc.Client.call(Client.java:1412)
[jira] [Created] (YARN-1472) TestCase failed at username with “.”
shenhong created YARN-1472:
------------------------------

             Summary: TestCase failed at username with “.”
                 Key: YARN-1472
                 URL: https://issues.apache.org/jira/browse/YARN-1472
             Project: Hadoop YARN
          Issue Type: Bug
    Affects Versions: 2.2.0
            Reporter: shenhong

When I run a testcase by user yuling.sh :
mvn test -Dtest=TestFileSystemAccessService#createFileSystem
it will failed with exception :
Running org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 5.236 sec <<< FAILURE! - in org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService
createFileSystem(org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService)  Time elapsed: 5.14 sec  <<< ERROR!
org.apache.hadoop.ipc.RemoteException: User: yuling.sh is not allowed to impersonate u
  at org.apache.hadoop.ipc.Client.call(Client.java:1412)
[jira] [Updated] (YARN-1313) Userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1313:
---------------------------
    Affects Version/s: 2.2.0
[jira] [Updated] (YARN-1313) Userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1313:
---------------------------
    Description:
At present, userlogs are not deleted at NodeManager start; I think we should delete userlogs before the NM starts.

  was:
At present, userlog hadn't delete at NodeManager start, should we delete then?
[jira] [Updated] (YARN-1313) Userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-1313:
---------------------------
    Component/s: nodemanager
[jira] [Updated] (YARN-1313) Userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-1313: --- Description: At present, userlog are hadn't delete at NodeManager start, should we delete then? (was: At present, userlog should be delete at NodeManager start, ) > Userlog hadn't delete at NodeManager start > -- > > Key: YARN-1313 > URL: https://issues.apache.org/jira/browse/YARN-1313 > Project: Hadoop YARN > Issue Type: Bug >Reporter: shenhong > > At present, userlog are hadn't delete at NodeManager start, should we delete > then? -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1313) Userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-1313: --- Description: At present, userlog hadn't delete at NodeManager start, should we delete then? (was: At present, userlog are hadn't delete at NodeManager start, should we delete then?) > Userlog hadn't delete at NodeManager start > -- > > Key: YARN-1313 > URL: https://issues.apache.org/jira/browse/YARN-1313 > Project: Hadoop YARN > Issue Type: Bug >Reporter: shenhong > > At present, userlog hadn't delete at NodeManager start, should we delete > then? -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1313) userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-1313: --- Description: At present, userlog should be delete at NodeManager start, > userlog hadn't delete at NodeManager start > -- > > Key: YARN-1313 > URL: https://issues.apache.org/jira/browse/YARN-1313 > Project: Hadoop YARN > Issue Type: Bug >Reporter: shenhong > > At present, userlog should be delete at NodeManager start, -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1313) Userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-1313: --- Summary: Userlog hadn't delete at NodeManager start (was: userlog hadn't delete at NodeManager start) > Userlog hadn't delete at NodeManager start > -- > > Key: YARN-1313 > URL: https://issues.apache.org/jira/browse/YARN-1313 > Project: Hadoop YARN > Issue Type: Bug >Reporter: shenhong > > At present, userlog should be delete at NodeManager start, -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1313) userlog hadn't delete at NodeManager start
[ https://issues.apache.org/jira/browse/YARN-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-1313: --- Summary: userlog hadn't delete at NodeManager start (was: userlog not delete at NodeManager start) > userlog hadn't delete at NodeManager start > -- > > Key: YARN-1313 > URL: https://issues.apache.org/jira/browse/YARN-1313 > Project: Hadoop YARN > Issue Type: Bug >Reporter: shenhong > -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1313) userlog not delete at NodeManager start
shenhong created YARN-1313: -- Summary: userlog not delete at NodeManager start Key: YARN-1313 URL: https://issues.apache.org/jira/browse/YARN-1313 Project: Hadoop YARN Issue Type: Bug Reporter: shenhong -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-149: -- Assignee: (was: shenhong) > ResourceManager (RM) High-Availability (HA) > --- > > Key: YARN-149 > URL: https://issues.apache.org/jira/browse/YARN-149 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Harsh J > Attachments: rm-ha-phase1-approach-draft1.pdf, > rm-ha-phase1-draft2.pdf, YARN ResourceManager Automatic > Failover-rev-07-21-13.pdf, YARN ResourceManager Automatic > Failover-rev-08-04-13.pdf > > > This jira tracks work needed to be done to support one RM instance failing > over to another RM instance so that we can have RM HA. Work includes leader > election, transfer of control to leader and client re-direction to new leader. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong reassigned YARN-149: - Assignee: shenhong (was: Bikas Saha) > ResourceManager (RM) High-Availability (HA) > --- > > Key: YARN-149 > URL: https://issues.apache.org/jira/browse/YARN-149 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Harsh J >Assignee: shenhong > Attachments: rm-ha-phase1-approach-draft1.pdf, > rm-ha-phase1-draft2.pdf, YARN ResourceManager Automatic > Failover-rev-07-21-13.pdf, YARN ResourceManager Automatic > Failover-rev-08-04-13.pdf > > > This jira tracks work needed to be done to support one RM instance failing > over to another RM instance so that we can have RM HA. Work includes leader > election, transfer of control to leader and client re-direction to new leader. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-435) Make it easier to access cluster topology information in an AM
[ https://issues.apache.org/jira/browse/YARN-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738647#comment-13738647 ] shenhong commented on YARN-435: --- First, if the AM fetches all nodes in the cluster, including their rack information, by calling the RM, it will put pressure on the RM's network; consider, for example, a cluster with more than 5000 datanodes. Second, if the YARN cluster only has 100 nodemanagers but the HDFS cluster it accesses has more than 5000 datanodes, the RM cannot report all of those nodes and their rack information. However, the AM needs the rack of every datanode referenced in its job.splitmetainfo file in order to init TaskAttempts, so in this case it cannot get all the nodes it needs by calling the RM. > Make it easier to access cluster topology information in an AM > -- > > Key: YARN-435 > URL: https://issues.apache.org/jira/browse/YARN-435 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Hitesh Shah >Assignee: Omkar Vinit Joshi > > ClientRMProtocol exposes a getClusterNodes api that provides a report on all > nodes in the cluster including their rack information. > However, this requires the AM to open and establish a separate connection to > the RM in addition to one for the AMRMProtocol. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
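The alternative argued for here is what the AM already does for hosts taken from the split metainfo: resolve racks locally through RackResolver (the same class that shows up in the YARN-1062 logs further down) rather than asking the RM for a full node report. A minimal sketch, assuming the AM's machine has the cluster topology mapping configured; the class name and the use of command-line arguments as hostnames are only for illustration.

{code}
// Sketch of local rack resolution, as an AM process can do it without an RM call.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.Node;
import org.apache.hadoop.yarn.util.RackResolver;

public class LocalRackLookupSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    RackResolver.init(conf);          // reads the net.topology.* settings
    for (String host : args) {        // e.g. hosts listed in job.splitmetainfo
      Node node = RackResolver.resolve(host);
      System.out.println(host + " -> " + node.getNetworkLocation());
    }
  }
}
{code}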
[jira] [Resolved] (YARN-1062) MRAppMaster take a long time to init taskAttempt
[ https://issues.apache.org/jira/browse/YARN-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong resolved YARN-1062. Resolution: Duplicate > MRAppMaster take a long time to init taskAttempt > > > Key: YARN-1062 > URL: https://issues.apache.org/jira/browse/YARN-1062 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Affects Versions: 0.23.6 >Reporter: shenhong > > In our cluster, MRAppMaster take a long time to init taskAttempt, the > following log last one minute, > 2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to > /r01f11 > 2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to > /r01f11 > 2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to > /r03b05 > 2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1373523419753_4543_m_00_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > 2013-07-17 11:28:06,415 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r03b02006.yh.aliyun.com to > /r03b02 > 2013-07-17 11:28:06,436 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02045.yh.aliyun.com to > /r02f02 > 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02034.yh.aliyun.com to > /r02f02 > 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1373523419753_4543_m_01_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > The reason is: resolved one host to rack almost take 25ms (We resolve the > host to rack by a python script). Our hdfs cluster is more than 4000 > datanodes, then a large input job will take a long time to init TaskAttempt. > Is there any good idea to solve this problem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1062) MRAppMaster take a long time to init taskAttempt
[ https://issues.apache.org/jira/browse/YARN-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738596#comment-13738596 ] shenhong commented on YARN-1062: Thanks Vinod Kumar Vavilapalli, I think YARN-435 is okay to me. > MRAppMaster take a long time to init taskAttempt > > > Key: YARN-1062 > URL: https://issues.apache.org/jira/browse/YARN-1062 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Affects Versions: 0.23.6 >Reporter: shenhong > > In our cluster, MRAppMaster take a long time to init taskAttempt, the > following log last one minute, > 2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to > /r01f11 > 2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to > /r01f11 > 2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to > /r03b05 > 2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1373523419753_4543_m_00_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > 2013-07-17 11:28:06,415 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r03b02006.yh.aliyun.com to > /r03b02 > 2013-07-17 11:28:06,436 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02045.yh.aliyun.com to > /r02f02 > 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02034.yh.aliyun.com to > /r02f02 > 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1373523419753_4543_m_01_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > The reason is: resolved one host to rack almost take 25ms (We resolve the > host to rack by a python script). Our hdfs cluster is more than 4000 > datanodes, then a large input job will take a long time to init TaskAttempt. > Is there any good idea to solve this problem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1062) MRAppMaster take a long time to init taskAttempt
[ https://issues.apache.org/jira/browse/YARN-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-1062: --- Description: In our cluster, MRAppMaster take a long time to init taskAttempt, the following log last one minute, 2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to /r01f11 2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to /r01f11 2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to /r03b05 2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1373523419753_4543_m_00_0 TaskAttempt Transitioned from NEW to UNASSIGNED 2013-07-17 11:28:06,415 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r03b02006.yh.aliyun.com to /r03b02 2013-07-17 11:28:06,436 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02045.yh.aliyun.com to /r02f02 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02034.yh.aliyun.com to /r02f02 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1373523419753_4543_m_01_0 TaskAttempt Transitioned from NEW to UNASSIGNED The reason is: resolved one host to rack almost take 25ms (We resolve the host to rack by a python script). Our hdfs cluster is more than 4000 datanodes, then a large input job will take a long time to init TaskAttempt. Is there any good idea to solve this problem. was: In our cluster, MRAppMaster take a long time to init taskAttempt, the following log last one minute, 2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to /r01f11 2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to /r01f11 2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to /r03b05 2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1373523419753_4543_m_00_0 TaskAttempt Transitioned from NEW to UNASSIGNED The reason is: resolved one host to rack almost take 25ms, our hdfs cluster is more than 4000 datanodes, then a large input job will take a long time to init TaskAttempt. Is there any good idea to solve this problem. 
> MRAppMaster take a long time to init taskAttempt > > > Key: YARN-1062 > URL: https://issues.apache.org/jira/browse/YARN-1062 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Affects Versions: 0.23.6 >Reporter: shenhong > > In our cluster, MRAppMaster take a long time to init taskAttempt, the > following log last one minute, > 2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to > /r01f11 > 2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to > /r01f11 > 2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to > /r03b05 > 2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1373523419753_4543_m_00_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > 2013-07-17 11:28:06,415 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r03b02006.yh.aliyun.com to > /r03b02 > 2013-07-17 11:28:06,436 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02045.yh.aliyun.com to > /r02f02 > 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved r02f02034.yh.aliyun.com to > /r02f02 > 2013-07-17 11:28:06,457 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: > attempt_1373523419753_4543_m_01_0 TaskAttempt Transitioned from NEW to > UNASSIGNED > The reason is: resolved one host to rack almost take 25ms (We resolve the > host to rack by a python script). Our hdfs cluster is more than 4000 > datanodes, then a large input job will take a long time to init TaskAttempt. > Is there any good idea to solve this
[jira] [Created] (YARN-1062) MRAppMaster take a long time to init taskAttempt
shenhong created YARN-1062: -- Summary: MRAppMaster take a long time to init taskAttempt Key: YARN-1062 URL: https://issues.apache.org/jira/browse/YARN-1062 Project: Hadoop YARN Issue Type: Bug Components: applications Affects Versions: 0.23.6 Reporter: shenhong In our cluster, MRAppMaster take a long time to init taskAttempt, the following log last one minute, 2013-07-17 11:28:06,328 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11012.yh.aliyun.com to /r01f11 2013-07-17 11:28:06,357 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r01f11004.yh.aliyun.com to /r01f11 2013-07-17 11:28:06,383 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved r03b05042.yh.aliyun.com to /r03b05 2013-07-17 11:28:06,384 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1373523419753_4543_m_00_0 TaskAttempt Transitioned from NEW to UNASSIGNED The reason is: resolved one host to rack almost take 25ms, our hdfs cluster is more than 4000 datanodes, then a large input job will take a long time to init TaskAttempt. Is there any good idea to solve this problem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
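The reported cost is one external script invocation (about 25 ms) per distinct host. One possible mitigation, offered purely as a configuration sketch and not as the resolution of this JIRA, is to switch the topology mapping from the script to the built-in file-based TableMapping, which loads the whole host-to-rack table once; the table file path below is a placeholder, and the file holds one "<hostname> <rack>" pair per line.

{code}
// Sketch: replace the per-host topology script with TableMapping so rack
// lookups become an in-memory table hit instead of a fork per host.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.DNSToSwitchMapping;
import org.apache.hadoop.net.TableMapping;

public class TopologyTableMappingSketch {
  public static Configuration withTableMapping() {
    Configuration conf = new Configuration();
    conf.setClass("net.topology.node.switch.mapping.impl",
        TableMapping.class, DNSToSwitchMapping.class);
    conf.set("net.topology.table.file.name",
        "/etc/hadoop/conf/topology.table");   // placeholder path
    return conf;
  }
}
{code}

Script-based mappings are cached as well, so the one-minute delay described above is dominated by the first resolution of each of the 4000+ distinct hosts rather than by repeated lookups of the same host.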
[jira] [Updated] (YARN-647) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Description: When yarn.log-aggregation-enable is seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable is seted to false. was: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. > historyServer can't show container's log when aggregation is not enabled > > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: yarn.log-aggregation-enable=false , HistoryServer will > show like this: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > Attachments: yarn-647.patch > > > When yarn.log-aggregation-enable is seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable is seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
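The request can be pictured as a simple fallback: when yarn.log-aggregation-enable is false, point the user at the NodeManager's own container-logs page instead of only printing "Aggregation is not enabled". The sketch below is not the attached yarn-647.patch; the method, the class name and the exact NM web path are assumptions used for illustration.

{code}
// Rough sketch of the fallback idea, not the attached patch.
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ContainerLogLinkSketch {
  /** Returns a direct NodeManager link when aggregation is off, otherwise null. */
  public static String nodeManagerLogLink(YarnConfiguration conf,
      String nmHttpAddress, String containerId, String user) {
    boolean aggregationOn = conf.getBoolean(
        YarnConfiguration.LOG_AGGREGATION_ENABLED,
        YarnConfiguration.DEFAULT_LOG_AGGREGATION_ENABLED);
    if (aggregationOn) {
      return null; // the history server can serve the aggregated logs itself
    }
    // Assumed NM web UI path for per-container logs, e.g. hd13-vm1:8042.
    return "http://" + nmHttpAddress + "/node/containerlogs/"
        + containerId + "/" + user;
  }
}
{code}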
[jira] [Updated] (YARN-647) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Attachment: yarn-647.patch add a patch > historyServer can't show container's log when aggregation is not enabled > > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: yarn.log-aggregation-enable=false , HistoryServer will > show like this: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > Attachments: yarn-647.patch > > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Summary: historyServer can't show container's log when aggregation is not enabled (was: historyServer can show container's log when aggregation is not enabled) > historyServer can't show container's log when aggregation is not enabled > > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: yarn.log-aggregation-enable=false , HistoryServer will > show like this: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Environment: yarn.log-aggregation-enable=false Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 was: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > historyServer can show container's log when aggregation is not enabled > -- > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: yarn.log-aggregation-enable=false > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Environment: yarn.log-aggregation-enable=false , HistoryServer will show like this: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 was: yarn.log-aggregation-enable=false Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > historyServer can show container's log when aggregation is not enabled > -- > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: yarn.log-aggregation-enable=false , HistoryServer will > show like this: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Description: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. was: Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. > historyServer can show container's log when aggregation is not enabled > -- > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: When yarn.log-aggregation-enable was seted to false, > after a MR_App complete, we can't view the container's log from the > HistoryServer, it shows message like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Environment: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 was: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. > historyServer can show container's log when aggregation is not enabled > -- > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: When yarn.log-aggregation-enable was seted to false, > after a MR_App complete, we can't view the container's log from the > HistoryServer, it shows message like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Description: Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. was: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. > historyServer can show container's log when aggregation is not enabled > -- > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: When yarn.log-aggregation-enable was seted to false, > after a MR_App complete, we can't view the container's log from the > HistoryServer, it shows message like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 >Reporter: shenhong > > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Description: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. > historyServer can show container's log when aggregation is not enabled > -- > > Key: YARN-647 > URL: https://issues.apache.org/jira/browse/YARN-647 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 0.23.7, 2.0.4-alpha > Environment: When yarn.log-aggregation-enable was seted to false, > after a MR_App complete, we can't view the container's log from the > HistoryServer, it shows message like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. >Reporter: shenhong > > When yarn.log-aggregation-enable was seted to false, after a MR_App complete, > we can't view the container's log from the HistoryServer, it shows message > like: > Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 > Since we don't want to aggregate the container's log, because it will be a > pressure to namenode. but sometimes we also want to take a look at > container's log. > Should we show the container's log across HistoryServer even if > yarn.log-aggregation-enable was seted to false. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-647) historyServer can show container's log when aggregation is not enabled
shenhong created YARN-647: - Summary: historyServer can show container's log when aggregation is not enabled Key: YARN-647 URL: https://issues.apache.org/jira/browse/YARN-647 Project: Hadoop YARN Issue Type: Improvement Components: documentation Affects Versions: 2.0.4-alpha, 0.23.7 Environment: When yarn.log-aggregation-enable was seted to false, after a MR_App complete, we can't view the container's log from the HistoryServer, it shows message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's log, because it will be a pressure to namenode. but sometimes we also want to take a look at container's log. Should we show the container's log across HistoryServer even if yarn.log-aggregation-enable was seted to false. Reporter: shenhong -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-622) Many deprecated warn messages in hadoop 2.0 when running sleepJob
[ https://issues.apache.org/jira/browse/YARN-622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643881#comment-13643881 ] shenhong commented on YARN-622: --- When set a property, if the property has deprecated name, it will also set the deprecated name. {{{ String[] altNames = getAlternateNames(name); if (altNames != null && altNames.length > 0) { String altSource = "because " + name + " is deprecated"; for(String altName : altNames) { if(!altName.equals(name)) { getOverlay().setProperty(altName, value); getProps().setProperty(altName, value); updatingResource.put(altName, new String[] {altSource}); } } } }}} > Many deprecated warn messages in hadoop 2.0 when running sleepJob > - > > Key: YARN-622 > URL: https://issues.apache.org/jira/browse/YARN-622 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.4-alpha > Environment: Run a sleep job in hadoop-2.0.4-alpha >Reporter: shenhong > > hadoop jar > share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.0.4-alpha.jar s > leep -m 1 > 13/04/28 10:16:46 INFO util.Shell: setsid exited with exit code 0 > 13/04/28 10:16:46 WARN conf.Configuration: session.id is deprecated. Instead, > use dfs.metrics.session-id > 13/04/28 10:16:46 INFO jvm.JvmMetrics: Initializing JVM Metrics with > processName=JobTracker, sessionId= > 13/04/28 10:16:46 INFO mapreduce.JobSubmitter: number of splits:1 > 13/04/28 10:16:46 WARN conf.Configuration: mapred.jar is deprecated. Instead, > use mapreduce.job.jar > 13/04/28 10:16:46 WARN conf.Configuration: > mapred.map.tasks.speculative.execution is deprecated. Instead, use > mapreduce.map.speculative > 13/04/28 10:16:46 WARN conf.Configuration: mapred.reduce.tasks is deprecated. > Instead, use mapreduce.job.reduces > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.partitioner.class is > deprecated. Instead, use mapreduce.job.partitioner.class > 13/04/28 10:16:46 WARN conf.Configuration: > mapred.reduce.tasks.speculative.execution is deprecated. Instead, use > mapreduce.reduce.speculative > 13/04/28 10:16:46 WARN conf.Configuration: mapred.mapoutput.value.class is > deprecated. Instead, use mapreduce.map.output.value.class > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.map.class is deprecated. > Instead, use mapreduce.job.map.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.job.name is deprecated. > Instead, use mapreduce.job.name > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.reduce.class is > deprecated. Instead, use mapreduce.job.reduce.class > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.inputformat.class is > deprecated. Instead, use mapreduce.job.inputformat.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.input.dir is deprecated. > Instead, use mapreduce.input.fileinputformat.inputdir > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.outputformat.class is > deprecated. Instead, use mapreduce.job.outputformat.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.map.tasks is deprecated. > Instead, use mapreduce.job.maps > 13/04/28 10:16:46 WARN conf.Configuration: mapred.mapoutput.key.class is > deprecated. Instead, use mapreduce.map.output.key.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.working.dir is deprecated. > Instead, use mapreduce.job.working.dir -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-622) Many deprecated warn messages in hadoop 2.0 when running sleepJob
[ https://issues.apache.org/jira/browse/YARN-622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643880#comment-13643880 ] shenhong commented on YARN-622: --- My mapred-site.xml is empty. I think the reason is this method in org.apache.hadoop.conf.Configuration:
{code}
public void set(String name, String value, String source) {
  if (deprecatedKeyMap.isEmpty()) {
    getProps();
  }
  getOverlay().setProperty(name, value);
  getProps().setProperty(name, value);
  if(source == null) {
    updatingResource.put(name, new String[] {"programatically"});
  } else {
    updatingResource.put(name, new String[] {source});
  }
  String[] altNames = getAlternateNames(name);
  if (altNames != null && altNames.length > 0) {
    String altSource = "because " + name + " is deprecated";
    for(String altName : altNames) {
      if(!altName.equals(name)) {
        getOverlay().setProperty(altName, value);
        getProps().setProperty(altName, value);
        updatingResource.put(altName, new String[] {altSource});
      }
    }
  }
  warnOnceIfDeprecated(name);
}
{code}
> Many deprecated warn messages in hadoop 2.0 when running sleepJob > - > > Key: YARN-622 > URL: https://issues.apache.org/jira/browse/YARN-622 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.4-alpha > Environment: Run a sleep job in hadoop-2.0.4-alpha >Reporter: shenhong > > hadoop jar > share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.0.4-alpha.jar s > leep -m 1 > 13/04/28 10:16:46 INFO util.Shell: setsid exited with exit code 0 > 13/04/28 10:16:46 WARN conf.Configuration: session.id is deprecated. Instead, > use dfs.metrics.session-id > 13/04/28 10:16:46 INFO jvm.JvmMetrics: Initializing JVM Metrics with > processName=JobTracker, sessionId= > 13/04/28 10:16:46 INFO mapreduce.JobSubmitter: number of splits:1 > 13/04/28 10:16:46 WARN conf.Configuration: mapred.jar is deprecated. Instead, > use mapreduce.job.jar > 13/04/28 10:16:46 WARN conf.Configuration: > mapred.map.tasks.speculative.execution is deprecated. Instead, use > mapreduce.map.speculative > 13/04/28 10:16:46 WARN conf.Configuration: mapred.reduce.tasks is deprecated. > Instead, use mapreduce.job.reduces > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.partitioner.class is > deprecated. Instead, use mapreduce.job.partitioner.class > 13/04/28 10:16:46 WARN conf.Configuration: > mapred.reduce.tasks.speculative.execution is deprecated. Instead, use > mapreduce.reduce.speculative > 13/04/28 10:16:46 WARN conf.Configuration: mapred.mapoutput.value.class is > deprecated. Instead, use mapreduce.map.output.value.class > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.map.class is deprecated. > Instead, use mapreduce.job.map.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.job.name is deprecated. > Instead, use mapreduce.job.name > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.reduce.class is > deprecated. Instead, use mapreduce.job.reduce.class > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.inputformat.class is > deprecated. Instead, use mapreduce.job.inputformat.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.input.dir is deprecated. > Instead, use mapreduce.input.fileinputformat.inputdir > 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.outputformat.class is > deprecated. Instead, use mapreduce.job.outputformat.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.map.tasks is deprecated. > Instead, use mapreduce.job.maps > 13/04/28 10:16:46 WARN conf.Configuration: mapred.mapoutput.key.class is > deprecated. Instead, use mapreduce.map.output.key.class > 13/04/28 10:16:46 WARN conf.Configuration: mapred.working.dir is deprecated. > Instead, use mapreduce.job.working.dir -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
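The behaviour the pasted set() method produces can be reproduced in a few lines (a standalone illustration, not taken from the sleep job): writing a value under either member of a deprecated/new key pair stores it under both names, and touching the deprecated name is what emits the WARN line, even with an empty mapred-site.xml.

{code}
// Standalone illustration of the deprecation handling described above.
import org.apache.hadoop.conf.Configuration;

public class DeprecationWarnSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Logs: "mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces"
    conf.set("mapred.reduce.tasks", "4");
    // Both spellings now resolve to the same value.
    System.out.println(conf.get("mapreduce.job.reduces")); // 4
    System.out.println(conf.get("mapred.reduce.tasks"));   // 4
  }
}
{code}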
[jira] [Created] (YARN-622) Many deprecated warn messages in hadoop 2.0 when running sleepJob
shenhong created YARN-622: - Summary: Many deprecated warn messages in hadoop 2.0 when running sleepJob Key: YARN-622 URL: https://issues.apache.org/jira/browse/YARN-622 Project: Hadoop YARN Issue Type: Bug Components: documentation Affects Versions: 2.0.4-alpha Environment: Run a sleep job in hadoop-2.0.4-alpha Reporter: shenhong hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.0.4-alpha.jar s leep -m 1 13/04/28 10:16:46 INFO util.Shell: setsid exited with exit code 0 13/04/28 10:16:46 WARN conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id 13/04/28 10:16:46 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 13/04/28 10:16:46 INFO mapreduce.JobSubmitter: number of splits:1 13/04/28 10:16:46 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar 13/04/28 10:16:46 WARN conf.Configuration: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative 13/04/28 10:16:46 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.partitioner.class is deprecated. Instead, use mapreduce.job.partitioner.class 13/04/28 10:16:46 WARN conf.Configuration: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative 13/04/28 10:16:46 WARN conf.Configuration: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class 13/04/28 10:16:46 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class 13/04/28 10:16:46 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 13/04/28 10:16:46 WARN conf.Configuration: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class 13/04/28 10:16:46 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 13/04/28 10:16:46 WARN conf.Configuration: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class 13/04/28 10:16:46 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561390#comment-13561390 ] shenhong commented on YARN-326: --- Hi, Sandy, is there any plans, or when you plan to solve this problem. > Add multi-resource scheduling to the fair scheduler > --- > > Key: YARN-326 > URL: https://issues.apache.org/jira/browse/YARN-326 > Project: Hadoop YARN > Issue Type: New Feature > Components: scheduler >Affects Versions: 2.0.2-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > > With YARN-2 in, the capacity scheduler has the ability to schedule based on > multiple resources, using dominant resource fairness. The fair scheduler > should be able to do multiple resource scheduling as well, also using > dominant resource fairness. > More details to come on how the corner cases with fair scheduler configs such > as min and max resources will be handled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560312#comment-13560312 ] shenhong commented on YARN-319: --- > Would it be possible to use a synchronous event handler in the tests so that > we don't have to poll? I don't know how to do that. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong >Assignee: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319-1.patch, YARN-319-2.patch, YARN-319-3.patch, > YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-319: -- Attachment: YARN-319-3.patch fix indentation of annotations, rename app_id to appId , rename att_id to attId. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong >Assignee: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319-1.patch, YARN-319-2.patch, YARN-319-3.patch, > YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-319: -- Attachment: YARN-319-2.patch fix > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong >Assignee: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319-1.patch, YARN-319-2.patch, YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong reassigned YARN-319: - Assignee: shenhong > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong >Assignee: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319-1.patch, YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557932#comment-13557932 ] shenhong commented on YARN-319: --- Thanks for you help, Sandy Ryza. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319-1.patch, YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-319: -- Attachment: YARN-319-1.patch add a testcast > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319-1.patch, YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557251#comment-13557251 ] shenhong commented on YARN-319: --- I don't know the commond to run a test like TestFairscheduler, can anybody tell me how. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556929#comment-13556929 ] shenhong commented on YARN-319: --- Sorry, will add a test case later today. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552144#comment-13552144 ] shenhong commented on YARN-319: --- Here is the log of ResourceManager: 2013-01-13 13:18:26,922 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: User yuling.sh cannot submit applications to queue root.cug-dev-tbdp 2013-01-13 13:18:26,924 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1357617565562_0696_01 State change from SUBMITTED to FAILED 2013-01-13 13:18:26,924 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1357617565562_0696 State change from SUBMITTED to FAILED 2013-01-13 13:18:26,924 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yuling.sh OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=User yuling.sh cannot submit applications to queue root.cug-dev-tbdp APPID=application_1357617565562_0696 2013-01-13 13:18:26,924 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1357617565562_0696,name=Sleep job,user=yuling.sh,queue=cug-dev-tbdp,state=FAILED,trackingUrl=hdpdevrm:50030/proxy/application_1357617565562_0696/,appMasterHost=N/A,startTime=1358054306921,finishTime=1358054306924 > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552143#comment-13552143 ] shenhong commented on YARN-319: --- Here is the log of yarn client: 13/01/13 13:18:26 ERROR security.UserGroupInformation: PriviledgedActionException as:yuling.sh cause:java.io.IOException: Failed to run job : User yuling.sh cannot submit applications to queue root.cug-dev-tbdp java.io.IOException: Failed to run job : User yuling.sh cannot submit applications to queue root.cug-dev-tbdp at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:301) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:391) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1266) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1236) at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:262) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:69) at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72) at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144) at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:112) at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:120) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:208) > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552142#comment-13552142 ] shenhong commented on YARN-319: --- Of course, our version already includes this patch. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-319: -- Attachment: YARN-319.patch > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-319.patch > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
shenhong created YARN-319: - Summary: Submit a job to a queue that not allowed in fairScheduler, client will hold forever. Key: YARN-319 URL: https://issues.apache.org/jira/browse/YARN-319 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.0.2-alpha Reporter: shenhong Fix For: 2.0.3-alpha RM use fairScheduler, when client submit a job to a queue, but the queue do not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-319) Submit a job to a queue that not allowed in fairScheduler, client will hold forever.
[ https://issues.apache.org/jira/browse/YARN-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13545289#comment-13545289 ] shenhong commented on YARN-319: --- The reason is in FairScheduler#addApplication: if the user cannot submit a job to the queue, it returns directly; we should instead create an RMAppAttemptRejectedEvent and handle it.
Original:
{code}
// Enforce ACLs
UserGroupInformation userUgi = UserGroupInformation.createRemoteUser(user);
if (!queue.hasAccess(QueueACL.SUBMIT_APPLICATIONS, userUgi)) {
  LOG.info("User " + userUgi.getUserName()
      + " cannot submit applications to queue " + queue.getName());
  return;
}
{code}
After modification:
{code}
// Enforce ACLs
UserGroupInformation userUgi = UserGroupInformation.createRemoteUser(user);
if (!queue.hasAccess(QueueACL.SUBMIT_APPLICATIONS, userUgi)) {
  String msg = "User " + userUgi.getUserName()
      + " cannot submit applications to queue " + queue.getName();
  LOG.info(msg);
  rmContext.getDispatcher().getEventHandler().handle(
      new RMAppAttemptRejectedEvent(applicationAttemptId, msg));
  return;
}
{code}
I will create a patch to fix it. > Submit a job to a queue that not allowed in fairScheduler, client will hold > forever. > > > Key: YARN-319 > URL: https://issues.apache.org/jira/browse/YARN-319 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > > RM use fairScheduler, when client submit a job to a queue, but the queue do > not allow the user to submit job it, in this case, client will hold forever. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
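To make the hang concrete, here is a minimal, self-contained Java sketch of the pattern (plain JDK classes only; QueueAclRejectionSketch, AppState, and the boolean ACL flag are invented for illustration and are not the actual YARN types): the submitting client blocks waiting for a terminal application state, and unless the scheduler publishes a rejection event, no such state ever arrives.
{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class QueueAclRejectionSketch {

  enum AppState { SUBMITTED, FAILED }

  // Stand-in for the RM dispatcher: the "client" below blocks until a state
  // event for its application shows up here.
  static final BlockingQueue<AppState> events = new LinkedBlockingQueue<>();

  // Stand-in for FairScheduler#addApplication; the boolean replaces the real
  // queue.hasAccess(QueueACL.SUBMIT_APPLICATIONS, userUgi) check.
  static void addApplication(String user, boolean userHasSubmitAcl) {
    if (!userHasSubmitAcl) {
      String msg = "User " + user + " cannot submit applications to this queue";
      System.out.println(msg);
      // Before the fix this method just returned, so no terminal state was
      // ever published and the client below would wait forever.
      events.add(AppState.FAILED);   // the fix: publish a rejection/failure
      return;
    }
    events.add(AppState.SUBMITTED);
  }

  public static void main(String[] args) throws InterruptedException {
    addApplication("yuling.sh", false);
    // Mimics the submitting client polling for a terminal application state.
    AppState state = events.poll(5, TimeUnit.SECONDS);
    System.out.println("Client observed state: " + state);
  }
}
{code}
With the rejection published, the waiting side sees FAILED immediately instead of timing out; without it, the poll (or, in the real client, an unbounded wait) never completes.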
[jira] [Updated] (YARN-301) Fair scheduler throws ConcurrentModificationException when iterating over app's priorities
[ https://issues.apache.org/jira/browse/YARN-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-301: -- Attachment: YARN-301.patch Good idea! Added a new patch. > Fair scheduler throws ConcurrentModificationException when iterating over > app's priorities > -- > > Key: YARN-301 > URL: https://issues.apache.org/jira/browse/YARN-301 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-301.patch, YARN-301.patch > > > In my test cluster, fairscheduler appear to concurrentModificationException > and RM crash, here is the message: > 2012-12-30 17:14:17,171 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in > handling event type NODE_UPDATE to the scheduler > java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.assignContainer(AppSchedulable.java:297) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:181) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:780) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:842) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:98) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:340) > at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-301) Fairscheduler appear to concurrentModificationException and RM crash
[ https://issues.apache.org/jira/browse/YARN-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-301: -- Attachment: YARN-301.patch Added a patch to solve this problem. > Fairscheduler appear to concurrentModificationException and RM crash > > > Key: YARN-301 > URL: https://issues.apache.org/jira/browse/YARN-301 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Fix For: 2.0.3-alpha > > Attachments: YARN-301.patch > > > In my test cluster, fairscheduler appear to concurrentModificationException > and RM crash, here is the message: > 2012-12-30 17:14:17,171 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in > handling event type NODE_UPDATE to the scheduler > java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.assignContainer(AppSchedulable.java:297) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:181) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:780) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:842) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:98) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:340) > at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-301) Fairscheduler appear to concurrentModificationException and RM crash
[ https://issues.apache.org/jira/browse/YARN-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541592#comment-13541592 ] shenhong commented on YARN-301: --- The reason is that when the SchedulerEventDispatcher thread runs assignContainer, it gets the priorities from AppSchedulingInfo. In AppSchedulingInfo the code is:
{code}
synchronized public Collection getPriorities() {
  return priorities;
}
{code}
but this only hands out the reference to priorities, and AppSchedulable#assignContainer then traverses it:
{code}
// (not scheduled) in order to promote better locality.
for (Priority priority : app.getPriorities()) {
  app.addSchedulingOpportunity(priority);
  ...
{code}
On the other hand, when the RM processes a request from the AM and updates the priorities in AppSchedulingInfo#updateResourceRequests:
{code}
if (asks == null) {
  asks = new HashMap();
  this.requests.put(priority, asks);
  this.priorities.add(priority);
} else if (updatePendingResources) {
{code}
this results in a ConcurrentModificationException. > Fairscheduler appear to concurrentModificationException and RM crash > > > Key: YARN-301 > URL: https://issues.apache.org/jira/browse/YARN-301 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Fix For: 2.0.3-alpha > > > In my test cluster, fairscheduler appear to concurrentModificationException > and RM crash, here is the message: > 2012-12-30 17:14:17,171 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in > handling event type NODE_UPDATE to the scheduler > java.util.ConcurrentModificationException > at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.assignContainer(AppSchedulable.java:297) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:181) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:780) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:842) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:98) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:340) > at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
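The race is easy to reproduce outside YARN. The following self-contained sketch (plain java.util collections; the class and method names are invented for illustration) has one thread iterate the set returned by a synchronized getter while another thread keeps adding to it; because the iteration itself runs outside the lock, it typically fails with ConcurrentModificationException. The commented-out line shows one possible mitigation, returning a snapshot copy; this is not necessarily what the attached patch does.
{code}
import java.util.Set;
import java.util.TreeSet;

public class PrioritiesRaceSketch {
  private static final Set<Integer> priorities = new TreeSet<>();

  // Mirrors the synchronized getter quoted above: the lock is released as soon
  // as the live reference is handed out, so iteration happens unprotected.
  static synchronized Set<Integer> getPriorities() {
    return priorities;
    // return new TreeSet<>(priorities);  // possible mitigation: snapshot copy
  }

  static synchronized void addPriority(int p) {
    priorities.add(p);
  }

  public static void main(String[] args) throws InterruptedException {
    for (int i = 0; i < 100; i++) {
      addPriority(i);
    }
    // Stands in for the RM thread running updateResourceRequests().
    Thread updater = new Thread(() -> {
      for (int i = 100; i < 1_000_000; i++) {
        addPriority(i);
      }
    });
    updater.start();
    try {
      int seen = 0;
      // Stands in for AppSchedulable#assignContainer iterating the priorities.
      for (int p : getPriorities()) {
        seen += p;
        Thread.sleep(1);
      }
      System.out.println("No exception this run (the race is timing-dependent), sum=" + seen);
    } catch (java.util.ConcurrentModificationException e) {
      System.out.println("Reproduced the failure: " + e);
    }
    updater.join();
  }
}
{code}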
[jira] [Created] (YARN-301) Fairscheduler appear to concurrentModificationException and RM crash
shenhong created YARN-301: - Summary: Fairscheduler appear to concurrentModificationException and RM crash Key: YARN-301 URL: https://issues.apache.org/jira/browse/YARN-301 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Reporter: shenhong Fix For: 2.0.3-alpha In my test cluster, fairscheduler appear to concurrentModificationException and RM crash, here is the message: 2012-12-30 17:14:17,171 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler java.util.ConcurrentModificationException at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100) at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.assignContainer(AppSchedulable.java:297) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:181) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:780) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:842) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:98) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:340) at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=-1, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-300: -- Description: After yarn-271, when yarn.scheduler.fair.max.assign<=0, when a node was been reserved, fairScheduler will infinite loop and not schedule any application. (was: After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been reserved, fairScheduler will infinite loop and not schedule any application.) > After yarn-271, when yarn.scheduler.fair.max.assign=-1, fairscheduler will > infinite loop and not schedule any application. > --- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Attachments: YARN-300.patch > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign<=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign<=0, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-300: -- Summary: After yarn-271, when yarn.scheduler.fair.max.assign<=0, fairscheduler will infinite loop and not schedule any application. (was: After yarn-271, when yarn.scheduler.fair.max.assign=-1, fairscheduler will infinite loop and not schedule any application.) > After yarn-271, when yarn.scheduler.fair.max.assign<=0, fairscheduler will > infinite loop and not schedule any application. > --- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Attachments: YARN-300.patch > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign<=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541590#comment-13541590 ] shenhong commented on YARN-300: --- At FairScheduler#nodeUpdate() the code is: if ((assignedContainers >= maxAssign) && (maxAssign > 0)) { break; } so when maxAssign <= 0 the loop never terminates, and the default maxAssign is -1. > After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will > infinite loop and not schedule any application. > -- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Attachments: YARN-300.patch > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=-1, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-300: -- Summary: After yarn-271, when yarn.scheduler.fair.max.assign=-1, fairscheduler will infinite loop and not schedule any application. (was: After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.) > After yarn-271, when yarn.scheduler.fair.max.assign=-1, fairscheduler will > infinite loop and not schedule any application. > --- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Attachments: YARN-300.patch > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541352#comment-13541352 ] shenhong commented on YARN-300: --- The way to solve it is: when a node has been reserved, break out of the loop. > After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will > infinite loop and not schedule any application. > -- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Attachments: YARN-300.patch > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
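For illustration, here is a simplified, self-contained sketch of the loop-control logic (NodeUpdateLoopSketch and its Node interface are stand-ins, not the real FairScheduler#nodeUpdate() code): with maxAssign <= 0 the original guard can never fire, and breaking once the node is reserved, or once an iteration assigns nothing, lets the loop terminate.
{code}
public class NodeUpdateLoopSketch {

  // Stand-in for the scheduler's view of a node; not a real YARN interface.
  interface Node {
    int assignContainers();   // how many containers were placed this round
    boolean isReserved();     // true once the node has been reserved
  }

  static int nodeUpdate(Node node, int maxAssign) {
    int assignedContainers = 0;
    while (true) {
      int assigned = node.assignContainers();
      assignedContainers += assigned;

      // Suggested fix from this issue: once the node is reserved, no further
      // assignment can happen on it, so stop. (The assigned == 0 check is an
      // extra guard added only in this sketch.)
      if (assigned == 0 || node.isReserved()) {
        break;
      }
      // Original guard quoted in the comments above: with maxAssign <= 0
      // (default -1, meaning "unlimited"), this condition never fires.
      if ((assignedContainers >= maxAssign) && (maxAssign > 0)) {
        break;
      }
    }
    return assignedContainers;
  }

  public static void main(String[] args) {
    // A fake node that reserves itself after handing out three containers.
    Node node = new Node() {
      private int handedOut = 0;
      public int assignContainers() { return handedOut++ < 3 ? 1 : 0; }
      public boolean isReserved() { return handedOut >= 3; }
    };
    // With the early break, maxAssign = -1 (the default) no longer spins forever.
    System.out.println("assigned = " + nodeUpdate(node, -1));
  }
}
{code}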
[jira] [Updated] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-300: -- Attachment: YARN-300.patch > After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will > infinite loop and not schedule any application. > -- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Attachments: YARN-300.patch > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-300: -- Attachment: (was: YARN-300) > After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will > infinite loop and not schedule any application. > -- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.
[ https://issues.apache.org/jira/browse/YARN-300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-300: -- Attachment: YARN-300 > After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will > infinite loop and not schedule any application. > -- > > Key: YARN-300 > URL: https://issues.apache.org/jira/browse/YARN-300 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: shenhong > Labels: None > Fix For: 2.0.3-alpha > > Original Estimate: 10h > Remaining Estimate: 10h > > After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been > reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-300) After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application.
shenhong created YARN-300: - Summary: After yarn-271, when yarn.scheduler.fair.max.assign=0, fairscheduler will infinite loop and not schedule any application. Key: YARN-300 URL: https://issues.apache.org/jira/browse/YARN-300 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Reporter: shenhong Fix For: 2.0.3-alpha After yarn-271, when yarn.scheduler.fair.max.assign=0, when a node was been reserved, fairScheduler will infinite loop and not schedule any application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-271) Fair scheduler hits IllegalStateException trying to reserve different apps on same node
[ https://issues.apache.org/jira/browse/YARN-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541336#comment-13541336 ] shenhong commented on YARN-271: --- I found this patch will result in an infinite loop when maxAssign=0. I will create a JIRA to solve this bug. > Fair scheduler hits IllegalStateException trying to reserve different apps on > same node > --- > > Key: YARN-271 > URL: https://issues.apache.org/jira/browse/YARN-271 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.0.2-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Fix For: 2.0.3-alpha > > Attachments: YARN-271-1.patch, YARN-271.patch > > > After the fair scheduler reserves a container on a node, it doesn't check for > reservations it just made when trying to make more reservations during the > same heartbeat. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira