[ https://issues.apache.org/jira/browse/HDFS-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738134#comment-13738134 ]

Vinay commented on HDFS-4504:
-----------------------------

{quote}In some cases, DFSOutputStream#close and DFSOutputStream#lastException 
will be set by the DataStreamer, prior to DFSOutputStream#close being called. 
In those cases, we need to throw an exception from close prior to clearing the 
exception.{quote}
I assume these cases were never handled. Without handling the pipeline failure 
cases, this patch will be incomplete.
Pipeline failures while writing data are also among the most likely failures to occur.

In case of a pipeline failure, {{closed}} will be marked {{true}} by the DataStreamer 
thread itself (as already mentioned in [~cmccabe]'s comment). The first call to 
close() will throw the pipeline failure exception, but subsequent calls to close() 
just return. *So the stream will never be marked as a zombie, and its resources will 
never be released.*
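
The behavior above can be sketched with a minimal, self-contained model (this is *not* the real DFSOutputStream code; the class, fields, and {{leaseReleased}} flag are simplified stand-ins for illustration):

```java
import java.io.IOException;

// Simplified model (not the real DFSOutputStream) of the bug: the streamer
// thread sets closed=true on a pipeline failure, so the user's first close()
// rethrows the stored exception, but later calls see closed==true and return
// without ever releasing the lease.
class SketchOutputStream {
    boolean closed = false;          // set by the streamer on failure
    IOException lastException;       // stored pipeline error
    boolean leaseReleased = false;   // stand-in for "resources released"

    void streamerFails() {           // simulates DataStreamer hitting a pipeline error
        lastException = new IOException("pipeline failure");
        closed = true;
    }

    void close() throws IOException {
        if (closed) {
            if (lastException != null) {
                IOException e = lastException;
                lastException = null;
                throw e;             // first close() surfaces the failure...
            }
            return;                  // ...later calls silently return: lease leaks
        }
        closed = true;
        leaseReleased = true;        // normal path releases resources
    }
}

public class ZombieStreamDemo {
    public static void main(String[] args) {
        SketchOutputStream out = new SketchOutputStream();
        out.streamerFails();
        try { out.close(); } catch (IOException expected) { }
        try { out.close(); } catch (IOException unexpected) { }
        // A "zombie": close() completed, yet the lease was never released.
        System.out.println("leaseReleased=" + out.leaseReleased);
    }
}
```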

You can verify this by changing your test {{testCloseWithDatanodeDown}} from
{code}+      out.write(100);
+      cluster.stopDataNode(0);{code}

to 
{code}+      out.write(100);
+      out.hflush();
+      out.write(100);
+      cluster.stopDataNode(0);{code}

Please check. 

                
> DFSOutputStream#close doesn't always release resources (such as leases)
> -----------------------------------------------------------------------
>
>                 Key: HDFS-4504
>                 URL: https://issues.apache.org/jira/browse/HDFS-4504
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-4504.001.patch, HDFS-4504.002.patch, 
> HDFS-4504.007.patch, HDFS-4504.008.patch, HDFS-4504.009.patch, 
> HDFS-4504.010.patch
>
>
> {{DFSOutputStream#close}} can throw an {{IOException}} in some cases.  One 
> example is if there is a pipeline error and then pipeline recovery fails.  
> Unfortunately, in this case, some of the resources used by the 
> {{DFSOutputStream}} are leaked.  One particularly important resource is file 
> leases.
> So it's possible for a long-lived HDFS client, such as Flume, to write many 
> blocks to a file, but then fail to close it.  Unfortunately, the 
> {{LeaseRenewerThread}} inside the client will continue to renew the lease for 
> the "undead" file.  Future attempts to close the file will just rethrow the 
> previous exception, and no progress can be made by the client.
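> 
> A minimal sketch of the stuck-client scenario (again a simplified model, not 
> the real LeaseRenewer; the file path and {{filesBeingWritten}} set are 
> illustrative only):
> {code}
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Simplified model (not the real LeaseRenewer): the renewer keeps renewing
// every file still registered with it, and a failed close() never
// unregisters the "undead" file, so retries make no progress.
public class LeaseLeakDemo {
    static Set<String> filesBeingWritten = new HashSet<>();
    static IOException lastException;

    static void closeFile(String src) throws IOException {
        if (lastException != null) {
            throw lastException;       // close() rethrows; file stays registered
        }
        filesBeingWritten.remove(src); // only a successful close unregisters
    }

    public static void main(String[] args) {
        filesBeingWritten.add("/flume/events.log");  // hypothetical path
        lastException = new IOException("pipeline recovery failed");
        for (int attempt = 0; attempt < 3; attempt++) {
            try {
                closeFile("/flume/events.log");
            } catch (IOException sameOldError) {
                // every retry fails identically; no progress is possible
            }
        }
        // The renewer would still see the file and keep renewing its lease.
        System.out.println("still renewing: "
                + filesBeingWritten.contains("/flume/events.log"));
    }
}
{code}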

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
