[ https://issues.apache.org/jira/browse/HDFS-10197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209237#comment-15209237 ]
Andrew Wang commented on HDFS-10197:
------------------------------------

Thanks for filing this JIRA [~linyiqun]!

I think we can just bump the 60000 to 120000 directly, if we want to pursue this path. However, do you have any ideas about why the tests take so long to run? A better solution is to optimize the tests to run faster.

> TestFsDatasetCache failing intermittently due to timeout
> --------------------------------------------------------
>
>                 Key: HDFS-10197
>                 URL: https://issues.apache.org/jira/browse/HDFS-10197
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>            Reporter: Lin Yiqun
>            Assignee: Lin Yiqun
>         Attachments: HDFS-10197.001.patch
>
>
> In {{TestFsDatasetCache}}, the unit tests fail intermittently. I collected some failure reasons from recent Jenkins reports. They are all timeout errors.
> {code}
> Tests in error:
>   TestFsDatasetCache.testFilesExceedMaxLockedMemory:378 ? Timeout Timed out wait...
>   TestFsDatasetCache.tearDown:149 ? Timeout Timed out waiting for condition. Thr...
> {code}
> {code}
> Tests in error:
>   TestFsDatasetCache.testPageRounder:474 ? test timed out after 60000 milliseco...
>   TestBalancer.testUnknownDatanodeSimple:1040->testUnknownDatanode:1098 ? test ...
> {code}
> But there is a slight difference between these failures.
> * The first occurs because the total blocked time exceeded {{waitForMillis}} (60s here), so the timeout exception is thrown and the thread diagnostic string is printed in method {{DFSTestUtil#verifyExpectedCacheUsage}}.
> {code}
>     long st = Time.now();
>     do {
>       boolean result = check.get();
>       if (result) {
>         return;
>       }
>
>       Thread.sleep(checkEveryMillis);
>     } while (Time.now() - st < waitForMillis);
>
>     throw new TimeoutException("Timed out waiting for condition. " +
>         "Thread diagnostics:\n" +
>         TimedOutTestsListener.buildThreadDiagnosticString());
> {code}
> * The second occurs because the test's elapsed time exceeded its configured timeout, as in {{TestFsDatasetCache#testPageRounder}}.
> We should adjust the timeout for these unit tests, which sometimes fail due to timeouts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
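
The two timeout modes described in the issue can be told apart with a small standalone sketch. It does not use Hadoop or JUnit: {{waitFor}} is a simplified imitation of the polling loop quoted above, and {{runWithTestTimeout}} is a hypothetical stand-in for a per-test timeout such as JUnit 4's {{@Test(timeout = ...)}}. The class name, method names, and the millisecond values are assumptions chosen for illustration only.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class TimeoutModesSketch {

  /** Polls check until it returns true or waitForMillis has elapsed (simplified imitation). */
  static void waitFor(Supplier<Boolean> check, long checkEveryMillis, long waitForMillis)
      throws TimeoutException, InterruptedException {
    long start = System.nanoTime();
    do {
      if (check.get()) {
        return;
      }
      Thread.sleep(checkEveryMillis);
    } while ((System.nanoTime() - start) / 1_000_000L < waitForMillis);
    throw new TimeoutException("Timed out waiting for condition.");
  }

  /** Hypothetical stand-in for a per-test timeout enforced by the test harness. */
  static String runWithTestTimeout(Callable<Void> body, long testTimeoutMillis)
      throws InterruptedException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      Future<Void> future = executor.submit(body);
      future.get(testTimeoutMillis, TimeUnit.MILLISECONDS);
      return "passed";
    } catch (TimeoutException e) {
      // Second mode: the whole test ran longer than its configured timeout.
      return "test timed out after " + testTimeoutMillis + " milliseconds";
    } catch (ExecutionException e) {
      // First mode: the polling helper gave up and threw inside the test body.
      return "failed: " + e.getCause().getMessage();
    } finally {
      executor.shutdownNow(); // interrupts a body that is still polling
    }
  }

  public static void main(String[] args) throws Exception {
    // First mode: the wait budget (50 ms) expires well before the test timeout (1000 ms).
    System.out.println(runWithTestTimeout(() -> {
      waitFor(() -> false, 10, 50);
      return null;
    }, 1000));

    // Second mode: the test timeout (50 ms) expires while the helper is still polling.
    System.out.println(runWithTestTimeout(() -> {
      waitFor(() -> false, 10, 1000);
      return null;
    }, 50));
  }
}
```

In this sketch, raising only the inner wait budget does not help a test whose outer timeout is the same or smaller, which is one reason the two failure modes may need to be adjusted separately.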