[ http://issues.apache.org/jira/browse/HADOOP-656?page=comments#action_12447471 ]

dhruba borthakur commented on HADOOP-656:
-----------------------------------------
I can reliably reproduce CRC corruption when a lease times out. The scenario is as follows.

The client renews its lease every 30 seconds. The namenode declares a client 'dead' if it does not receive a lease-renewal message within 60 seconds. The namenode then reclaims the data blocks of that client's file, and these blocks may then be allocated to another file.

If a client is delayed by more than 60 seconds in renewing its lease (due to network congestion, slow responses from datanodes, etc.), the namenode expires the lease and reclaims the blocks of the file in question. The namenode may then allocate these blocks to a new file, and the writer of that new file may start writing to one of them. Meanwhile, the original writer may continue to flush its data to the same block because it has not yet received a lease-timeout exception. This may lead to data corruption.

Forcing lease expiration to occur immediately makes the CRC corruption show up.

> dfs locking doesn't notify the application when a lock is lost
> --------------------------------------------------------------
>
>                 Key: HADOOP-656
>                 URL: http://issues.apache.org/jira/browse/HADOOP-656
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.7.2
>            Reporter: Owen O'Malley
>         Assigned To: Sameer Paranjpye
>
> DFS locks may be lost for failing to renew the lease on time, but the
> application is not notified about the loss of the lock and may therefore
> perform operations assuming it has the lock, even though the lock has been
> given to another process. I propose that DFS operations check to see if that
> client has lost a lock since the last check and if so throw a
> LostLockException.

--
This message is automatically generated by JIRA.
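The race described in the comment above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Hadoop code: `ToyNameNode`, `ToyDataNode`, `checked_write`, `run_scenario`, and `LeaseExpiredError` are hypothetical names invented for the illustration (the last one stands in for the `LostLockException` proposed in the issue). Only the 60-second timeout comes from the comment, and time is passed in explicitly rather than read from a clock.

```python
class LeaseExpiredError(Exception):
    """Hypothetical stand-in for the LostLockException proposed in the issue."""


class ToyNameNode:
    LEASE_TIMEOUT = 60  # seconds without renewal before a client is declared dead

    def __init__(self):
        self.last_renewal = {}  # client -> time of last lease renewal
        self.block_owner = {}   # block id -> client currently holding it

    def expire_leases(self, now):
        # Drop every client whose lease is overdue and reclaim its blocks.
        for client, renewed_at in list(self.last_renewal.items()):
            if now - renewed_at > self.LEASE_TIMEOUT:
                del self.last_renewal[client]
                owned = [b for b, c in self.block_owner.items() if c == client]
                for block in owned:
                    del self.block_owner[block]

    def allocate(self, client, block, now):
        self.expire_leases(now)
        if block in self.block_owner:
            raise ValueError("block is still owned by a live client")
        self.block_owner[block] = client
        self.last_renewal[client] = now

    def renew(self, client, now):
        self.expire_leases(now)
        if client not in self.last_renewal:
            raise LeaseExpiredError(client)
        self.last_renewal[client] = now

    def holds_lease(self, client, block, now):
        self.expire_leases(now)
        return self.block_owner.get(block) == client


class ToyDataNode:
    def __init__(self):
        self.blocks = {}  # block id -> list of (client, data) writes

    def write(self, block, client, data):
        self.blocks.setdefault(block, []).append((client, data))


def checked_write(namenode, datanode, block, client, data, now):
    # The check proposed in the issue: verify the lease before each operation.
    if not namenode.holds_lease(client, block, now):
        raise LeaseExpiredError(client)
    datanode.write(block, client, data)


def run_scenario(checked):
    nn, dn = ToyNameNode(), ToyDataNode()
    nn.allocate("writer-A", "blk_1", now=0)
    dn.write("blk_1", "writer-A", b"a1")
    # writer-A's renewals are delayed; at t=61 its lease expires and the
    # namenode hands blk_1 to writer-B.
    nn.allocate("writer-B", "blk_1", now=61)
    dn.blocks.pop("blk_1", None)  # toy assumption: a reclaimed block starts empty
    dn.write("blk_1", "writer-B", b"b1")
    # writer-A, unaware of the expiration, flushes more data at t=62.
    try:
        if checked:
            checked_write(nn, dn, "blk_1", "writer-A", b"a2", now=62)
        else:
            dn.write("blk_1", "writer-A", b"a2")
    except LeaseExpiredError:
        pass  # the application learns that it lost the lease and stops writing
    return [client for client, _ in dn.blocks["blk_1"]]
```

Without the check, `run_scenario(checked=False)` leaves writes from both clients interleaved in the same block; with it, the stale write is rejected. Note that a check followed by a separate write is not atomic, so in a real system this would narrow the window rather than close it.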