[ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310976#comment-15310976 ]
Eric Badger commented on YARN-1468:
-----------------------------------

[~mitdesai], I saw this test failing in the same way that you described above. I took a look at the test, and either I don't understand the meaning of one of the lines or it's a bug. The following piece of code (minus the assertEquals) was added by [YARN-1493|https://issues.apache.org/jira/browse/YARN-1493] and doesn't make sense to me. Why are we checking the size against 2 when we check it against 4 immediately afterwards?

In my local tests, this loop always runs until it times out (timeoutSecs >= 40), since rmApp.getAppAttempts().size() is equal to 4 the whole time. This leads me to believe that the assert failure occurs when the loop is entered while the size is still equal to 2. In that case it breaks out of the loop early, and the size only gets up to 3 (or stays at 2) before the assertEquals against 4 is executed.

{noformat}
    // wait for the attempt to be created.
    int timeoutSecs = 0;
    while (rmApp.getAppAttempts().size() != 2 && timeoutSecs++ < 40) {
      Thread.sleep(200);
    }
    Assert.assertEquals(4, rmApp.getAppAttempts().size());
{noformat}

I think changing ".size() != 2" to ".size() != 4" will fix this race in the test. Thoughts?

cc [~djp]

> TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
> ----------------------------------------------------------------
>
>                 Key: YARN-1468
>                 URL: https://issues.apache.org/jira/browse/YARN-1468
>             Project: Hadoop YARN
>          Issue Type: Test
>          Components: resourcemanager
>            Reporter: Junping Du
>            Assignee: Junping Du
>            Priority: Critical
>
> Log is as following:
> {code}
> Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 149.968 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> testRMRestartWaitForPreviousAMToFinish(org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart)  Time elapsed: 44.197 sec  <<< FAILURE!
> junit.framework.AssertionFailedError: AppAttempt state is not correct (timedout) expected:<ALLOCATED> but was:<SCHEDULED>
>       at junit.framework.Assert.fail(Assert.java:50)
>       at junit.framework.Assert.failNotEquals(Assert.java:287)
>       at junit.framework.Assert.assertEquals(Assert.java:67)
>       at org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:82)
>       at org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:292)
>       at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.launchAM(TestRMRestart.java:826)
>       at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartWaitForPreviousAMToFinish(TestRMRestart.java:464)
> {code}
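
To illustrate the race described in the comment, here is a minimal, self-contained sketch of the wait loop with the expected size passed in as a parameter, so that the value being polled for is the same value that is asserted afterwards. The class name AttemptWait, the helper waitForSize, and the IntSupplier standing in for rmApp.getAppAttempts().size() are all hypothetical names made up for this sketch; none of them are part of TestRMRestart or any YARN API.

```java
import java.util.function.IntSupplier;

public class AttemptWait {

    /**
     * Polls sizeSupplier until it reports expectedSize or maxPolls polls have
     * been made. Returns true if the expected size was observed, false on
     * timeout or if the waiting thread was interrupted.
     */
    static boolean waitForSize(IntSupplier sizeSupplier, int expectedSize,
                               int maxPolls, long sleepMillis) {
        for (int polls = 0; polls < maxPolls; polls++) {
            if (sizeSupplier.getAsInt() == expectedSize) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        // One final check after the last sleep.
        return sizeSupplier.getAsInt() == expectedSize;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for rmApp.getAppAttempts().size():
        // reports 2, then 3, then 4 on successive calls.
        int[] calls = {0};
        IntSupplier growing = () -> Math.min(4, 2 + calls[0]++);

        // Waiting for 2 succeeds on the very first poll, while the count is
        // still growing; this is the early exit suspected in the comment.
        System.out.println("waited for 2: " + waitForSize(growing, 2, 40, 1L));

        // Waiting for 4 keeps polling until the count has settled.
        System.out.println("waited for 4: " + waitForSize(growing, 4, 40, 1L));
    }
}
```

With a helper of this shape, the test would wait for 4 and then assert 4, so an early exit at size 2 could no longer precede the assertion. Whether 4 is the right number to wait for in the real test is the question raised in the comment, not something this sketch settles.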