Wei-Chiu Chuang created HDFS-9631:
-------------------------------------

             Summary: Restarting namenode after deleting a directory with 
snapshot will fail
                 Key: HDFS-9631
                 URL: https://issues.apache.org/jira/browse/HDFS-9631
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Wei-Chiu Chuang
            Assignee: Wei-Chiu Chuang


Several TestOpenFilesWithSnapshot tests fail quite frequently. 
These tests (testParentDirWithUCFileDeleteWithSnapshot, 
testOpenFilesWithRename, testWithCheckpoint) are unable to reconnect to the 
namenode after it restarts. The reconnection appears to fail because of an 
EOFException between the datanode and the namenode.

It appears that all three tests call doWriteAndAbort(), which creates files 
and then aborts the writes; a snapshot is then taken of the parent directory, 
and the parent directory is deleted. 

Interestingly, if the parent directory does not have a snapshot, the tests will 
not fail.

The following test will fail intermittently:
{code:java}
public void testDeleteParentDirWithSnapShot() throws Exception {
    Path path = new Path("/test");
    fs.mkdirs(path);
    fs.allowSnapshot(path);
    Path file = new Path("/test/test/test2");
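    FSDataOutputStream out = fs.create(file);
    // Two passes of 1048576 bytes each, written four bytes at a time.
    for (int i = 0; i < 2; i++) {
      long count = 0;
      while (count < 1048576) {
        out.writeBytes("hell");
        count += 4;
      }
    }
    // Sync the data and update the file length on the namenode, then abort
    // the stream so that the file is never closed.
    ((DFSOutputStream) out.getWrappedStream()).hsync(EnumSet
        .of(SyncFlag.UPDATE_LENGTH));
    DFSTestUtil.abortStream((DFSOutputStream) out.getWrappedStream());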

    // Second file: same write, hsync, and abort sequence as the first.
    Path file2 = new Path("/test/test/test3");
    FSDataOutputStream out2 = fs.create(file2);
    for (int i = 0; i < 2; i++) {
      long count = 0;
      while (count < 1048576) {
        out2.writeBytes("hell");
        count += 4;
      }
    }
    ((DFSOutputStream) out2.getWrappedStream()).hsync(EnumSet
        .of(SyncFlag.UPDATE_LENGTH));
    DFSTestUtil.abortStream((DFSOutputStream) out2.getWrappedStream());

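    // Snapshot /test while both aborted files are still present under it.
    fs.createSnapshot(path, "s1");
    // Delete the parent directory of the aborted files; /test keeps snapshot s1.
    fs.delete(new Path("/test/test"), true);
    // The failure shows up after this restart.
    cluster.restartNameNode();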
  }
{code}
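
The two write loops in the test above are identical, so a reader may find 
it easier to see how much data each file receives if the loop is factored 
out. The following is only a sketch, not code from the Hadoop tree: the 
class name WriteLoopSketch and the helper writeRepeated are hypothetical. 
It relies on FSDataOutputStream extending java.io.DataOutputStream, and it 
uses an in-memory stream so that it runs without a cluster. The hsync and 
abortStream calls are deliberately left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class WriteLoopSketch {

    /**
     * Hypothetical helper mirroring the loop in the test: for each pass,
     * keep writing `chunk` until at least `bytesPerPass` bytes have gone out.
     * Returns the total number of bytes written across all passes.
     */
    static long writeRepeated(DataOutputStream out, String chunk,
                              long bytesPerPass, int passes) throws IOException {
        if (chunk.isEmpty()) {
            // An empty chunk would never advance the counter.
            throw new IllegalArgumentException("chunk must not be empty");
        }
        long total = 0;
        for (int i = 0; i < passes; i++) {
            long count = 0;
            while (count < bytesPerPass) {
                // writeBytes emits one byte per character of the string.
                out.writeBytes(chunk);
                count += chunk.length();
            }
            total += count;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream returned by fs.create(file) in the test.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long written = writeRepeated(new DataOutputStream(sink), "hell", 1048576L, 2);
        System.out.println("bytes written per file: " + written);
        System.out.println("bytes held by the sink: " + sink.size());
    }
}
```

With the values used in the test ("hell", 1048576, 2), each file receives 
2097152 bytes (2 MiB) before hsync is called.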

I am not sure if it's a test case issue, or something to do with snapshots.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
