Lin Yiqun created HDFS-10197:
--------------------------------

             Summary: TestFsDatasetCache failing intermittently due to timeout
                 Key: HDFS-10197
                 URL: https://issues.apache.org/jira/browse/HDFS-10197
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: test
            Reporter: Lin Yiqun
            Assignee: Lin Yiqun


In {{TestFsDatasetCache}}, the unit tests fail intermittently. I collected some 
failure reasons from recent Jenkins reports. They are all timeout errors.
{code}
Tests in error: 
  TestFsDatasetCache.testFilesExceedMaxLockedMemory:378 » Timeout Timed out wait...
  TestFsDatasetCache.tearDown:149 » Timeout Timed out waiting for condition. Thr...
{code}
{code}
Tests in error: 
  TestFsDatasetCache.testPageRounder:474 »  test timed out after 60000 milliseco...
  TestBalancer.testUnknownDatanodeSimple:1040->testUnknownDatanode:1098 »  test ...
{code}
But there is a slight difference between these failures.

* The first occurs because the total blocking time exceeded {{waitForMillis}} 
(60s here), so the timeout exception is thrown and the thread diagnostic string 
is printed.
{code}
    long st = Time.now();
    do {
      boolean result = check.get();
      if (result) {
        return;
      }
      
      Thread.sleep(checkEveryMillis);
    } while (Time.now() - st < waitForMillis);
    
    throw new TimeoutException("Timed out waiting for condition. " +
        "Thread diagnostics:\n" +
        TimedOutTestsListener.buildThreadDiagnosticString());
{code}

* The second occurs because the test's elapsed time exceeded its configured 
timeout, as in {{TestFsDatasetCache#testPageRounder}}.
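
As a minimal sketch of the two failure modes above, the following self-contained Java emulates them with the standard library only. {{WaitForSketch}}, its {{waitFor}} copy, and {{whichTimeoutFires}} are hypothetical names for illustration, not the Hadoop or JUnit code. The outer {{Future.get}} timeout is an assumed stand-in for a per-test timeout setting, and the millisecond values are arbitrary.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical illustration only; not the Hadoop test utility. */
final class WaitForSketch {
  private WaitForSketch() {}

  /**
   * Polls check every checkEveryMillis and throws TimeoutException once
   * waitForMillis has elapsed (mirrors the loop quoted above).
   * System.currentTimeMillis() is used here as the clock.
   */
  static void waitFor(Supplier<Boolean> check, long checkEveryMillis,
      long waitForMillis) throws TimeoutException, InterruptedException {
    long st = System.currentTimeMillis();
    do {
      if (check.get()) {
        return;
      }
      Thread.sleep(checkEveryMillis);
    } while (System.currentTimeMillis() - st < waitForMillis);
    throw new TimeoutException("Timed out waiting for condition.");
  }

  /**
   * Runs a never-satisfied waitFor inside a body that is itself limited by
   * an outer timeout, and reports which of the two limits fired first.
   */
  static String whichTimeoutFires(long testTimeoutMillis, long waitForMillis) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Future<Void> body = pool.submit(() -> {
        waitFor(() -> false, 10, waitForMillis); // condition never holds
        return null;
      });
      try {
        body.get(testTimeoutMillis, TimeUnit.MILLISECONDS);
        return "passed";
      } catch (TimeoutException e) {
        // Outer limit hit first: analogous to "test timed out after ...".
        return "test timeout";
      } catch (ExecutionException e) {
        if (e.getCause() instanceof TimeoutException) {
          // Inner limit hit first: "Timed out waiting for condition."
          return "waitFor timeout";
        }
        throw new IllegalStateException(e.getCause());
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    } finally {
      pool.shutdownNow(); // interrupt the polling thread if still running
    }
  }
}
```

In this sketch, {{whichTimeoutFires(50, 2000)}} reports the test-level timeout (the second case), while {{whichTimeoutFires(2000, 50)}} reports the polling timeout (the first case). When the two limits are close to each other, either may fire first depending on scheduling.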

We should adjust the timeouts for these unit tests, which sometimes fail 
due to timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
