[ 
https://issues.apache.org/jira/browse/HDFS-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739168#comment-13739168
 ] 

Todd Lipcon commented on HDFS-4504:
-----------------------------------

I don't think {{recoverLease}} is the right API here. Here's an example where 
it could cause problems:

- Process A is writing /file, and loses its network connection right before 
calling close(). Thus it gets registered as a zombie.
- Process B calls append() on the file after the soft lease has expired. This 
allows B to keep appending where A left off.
- Process A recovers its network. The recoverLease() call will then kick 
process B out from writing.

Given that these RPCs are also pathname-based, it could even kick a writer off 
of a new file that just happened to share the file path.
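
To make the scenario above concrete, here is a minimal sketch of a lease table keyed only by path. It is a toy model, not the NameNode's lease manager: the class {{PathKeyedLeases}} and its methods {{acquire}}, {{recoverByPath}} and {{holder}} are hypothetical names invented for illustration. It only shows that a recovery call which identifies the file by path, and does not look at who is calling, revokes whichever lease is currently on that path.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only (hypothetical names): a lease table keyed by path.
class PathKeyedLeases {
    // path -> name of the client currently holding the lease
    private final Map<String, String> holderByPath = new HashMap<>();

    // Models create()/append(): the caller becomes the lease holder,
    // replacing any previous (expired) holder.
    void acquire(String path, String client) {
        holderByPath.put(path, client);
    }

    // Models a path-based recovery call: it revokes whatever lease is
    // on the path, without checking who the caller is.
    void recoverByPath(String path) {
        holderByPath.remove(path);
    }

    // Returns the current holder, or null if the path has no lease.
    String holder(String path) {
        return holderByPath.get(path);
    }
}
```

If A acquires /file, B later acquires it, and A then calls {{recoverByPath("/file")}}, the table no longer has a holder for /file, so B has been kicked out even though A's own lease was already gone.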

It seems to me like it would be better to call completeFile() or perhaps some 
new abortFile() RPC, which would first verify that the client name trying to 
abort the lease is the same as the current lease holder.
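
A rough sketch of the holder check I have in mind, again as a toy in-memory model rather than real NameNode code. The {{abortFile}} method here is the hypothetical new RPC; {{HolderCheckedLeases}}, {{acquire}} and {{holder}} are made-up names for illustration. The only point is that the release is conditional on the caller's client name matching the current lease holder.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only (hypothetical names): abort succeeds only for the holder.
class HolderCheckedLeases {
    // path -> name of the client currently holding the lease
    private final Map<String, String> holderByPath = new HashMap<>();

    // Models create()/append(): the caller becomes the lease holder.
    void acquire(String path, String client) {
        holderByPath.put(path, client);
    }

    // Hypothetical abortFile(): releases the lease only if 'client' is
    // the current holder. Returns true if a lease was released.
    boolean abortFile(String path, String client) {
        // Map.remove(key, value) removes the entry only when the key
        // is currently mapped to exactly that value.
        return holderByPath.remove(path, client);
    }

    // Returns the current holder, or null if the path has no lease.
    String holder(String path) {
        return holderByPath.get(path);
    }
}
```

With this check, a zombie client A calling {{abortFile("/file", "A")}} after B has taken over the lease gets {{false}} back and B keeps writing. The same check also covers the case where the path now refers to a different file held by another client.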
                
> DFSOutputStream#close doesn't always release resources (such as leases)
> -----------------------------------------------------------------------
>
>                 Key: HDFS-4504
>                 URL: https://issues.apache.org/jira/browse/HDFS-4504
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-4504.001.patch, HDFS-4504.002.patch, 
> HDFS-4504.007.patch, HDFS-4504.008.patch, HDFS-4504.009.patch, 
> HDFS-4504.010.patch, HDFS-4504.011.patch
>
>
> {{DFSOutputStream#close}} can throw an {{IOException}} in some cases.  One 
> example is if there is a pipeline error and then pipeline recovery fails.  
> Unfortunately, in this case, some of the resources used by the 
> {{DFSOutputStream}} are leaked.  One particularly important resource is file 
> leases.
> So it's possible for a long-lived HDFS client, such as Flume, to write many 
> blocks to a file, but then fail to close it.  Unfortunately, the 
> {{LeaseRenewerThread}} inside the client will continue to renew the lease for 
> the "undead" file.  Future attempts to close the file will just rethrow the 
> previous exception, and no progress can be made by the client.
