[ https://issues.apache.org/jira/browse/HDFS-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745804#comment-13745804 ]

Vinay commented on HDFS-4504:
-----------------------------

Please check this test:
{code:java}
  @Test
  public void testPipelineFailureWithZombie() throws Exception {
    Configuration conf = new Configuration();
    conf.setInt(DFSConfigKeys.DFS_BLOCK_SIZE_KEY, BLOCK_SIZE);
    conf.setInt(DFSConfigKeys.DFS_CLIENT_CLOSE_TIMEOUT_MS, 5000);
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1)
        .build();
    DistributedFileSystem fs = cluster.getFileSystem();
    FSDataOutputStream fos = fs.create(new Path("/test"));
    boolean closed = false;
    DataNodeProperties dn = null;
    try {
      fos.writeBytes("Hello");
      fos.hflush();
      // Stop the only DataNode so the pipeline cannot recover.
      dn = cluster.stopDataNode(0);
      fos.writeBytes("Hello again");
      fos.close();
      closed = true;
    } catch (Exception e) {
      // Ignore for now; close is retried in the finally block.
    } finally {
      try {
        fos.close();
        closed = true;
      } catch (IOException e) {
        // Ignore, as close will not be able to complete.
      }
    }
    if (!closed) {
      // Just to observe the activity of the ZombieStreamManager.
      Thread.sleep(10000);
      cluster.restartDataNode(dn, true);
      Thread.sleep(Long.MAX_VALUE);
    }
  }
{code}

In this case, since the streamer is not set to null, complete() is still called with the lastBlock, which moves the block's state to COMMITTED and then requires minReplication replicas. With the only DataNode down, that requirement can never be satisfied, so the forced complete and recoverLease also fail, reporting that minReplication for COMMITTED blocks is not met.
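To make that failure mode concrete, here is a minimal, self-contained model of the commit-then-complete check described above. This is not the actual NameNode code; the class, method, and constant names are invented for illustration, with minReplication fixed at the default of 1:

{code:java}
// Toy model of the NameNode-side completion check: the first complete()
// call commits the last block, and the block only becomes COMPLETE once
// at least minReplication live replicas are reported. With the sole
// DataNode down, the replica count stays 0 and every retry fails.
public class CommittedBlockModel {
  enum State { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

  static final int MIN_REPLICATION = 1; // dfs.namenode.replication.min default

  State state = State.UNDER_CONSTRUCTION;

  /** Models the commit-or-complete decision for the last block. */
  boolean tryComplete(int liveReplicas) {
    if (state == State.UNDER_CONSTRUCTION) {
      state = State.COMMITTED; // first complete() call commits the block
    }
    if (state == State.COMMITTED && liveReplicas >= MIN_REPLICATION) {
      state = State.COMPLETE;
    }
    return state == State.COMPLETE;
  }

  public static void main(String[] args) {
    CommittedBlockModel block = new CommittedBlockModel();
    // The sole DataNode is down, so no replica is ever reported:
    for (int attempt = 0; attempt < 3; attempt++) {
      System.out.println("attempt " + attempt
          + " complete=" + block.tryComplete(0)); // prints complete=false
    }
    // Every attempt fails the same check, so the file stays open and the
    // lease is never released without some external recovery mechanism.
  }
}
{code}

Note that once the block is COMMITTED, retrying complete() changes nothing unless a replica actually comes back, which is why both the forced complete and recoverLease keep failing here.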
                
> DFSOutputStream#close doesn't always release resources (such as leases)
> -----------------------------------------------------------------------
>
>                 Key: HDFS-4504
>                 URL: https://issues.apache.org/jira/browse/HDFS-4504
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-4504.001.patch, HDFS-4504.002.patch, 
> HDFS-4504.007.patch, HDFS-4504.008.patch, HDFS-4504.009.patch, 
> HDFS-4504.010.patch, HDFS-4504.011.patch, HDFS-4504.014.patch, 
> HDFS-4504.015.patch, HDFS-4504.016.patch
>
>
> {{DFSOutputStream#close}} can throw an {{IOException}} in some cases.  One 
> example is if there is a pipeline error and then pipeline recovery fails.  
> Unfortunately, in this case, some of the resources used by the 
> {{DFSOutputStream}} are leaked.  One particularly important resource is file 
> leases.
> So it's possible for a long-lived HDFS client, such as Flume, to write many 
> blocks to a file, but then fail to close it.  Unfortunately, the 
> {{LeaseRenewerThread}} inside the client will continue to renew the lease for 
> the "undead" file.  Future attempts to close the file will just rethrow the 
> previous exception, and no progress can be made by the client.
