[ https://issues.apache.org/jira/browse/HDFS-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901129#action_12901129 ]
sam rash commented on HDFS-1350:
--------------------------------

My understanding of how lease recovery works in 20-append is that on cluster restart, an open file will be recovered by the Namenode. Datanodes will send the longest valid length of the block (i.e., if there are 8 bytes of checksum and 1500 bytes of data, the valid length is 1024, assuming a 512-byte chunk size). The block is then truncated to that valid length.

20-append seems to have a bug: any block whose data and checksum lengths don't match isn't used in lease recovery. The work here might be to fix that?

> make datanodes do graceful shutdown
> -----------------------------------
>
>                 Key: HDFS-1350
>                 URL: https://issues.apache.org/jira/browse/HDFS-1350
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: sam rash
>            Assignee: sam rash
>
> we found that the Datanode doesn't do a graceful shutdown and a block can be
> corrupted (data + checksum amounts off)
> we can make the DN do a graceful shutdown in case there are open files. if
> this presents a problem to a timely shutdown, we can make it a parameter of
> how long to wait for the full graceful shutdown before just exiting
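
To make the valid-length arithmetic from the comment concrete (8 checksum bytes, 1500 data bytes, 512-byte chunks giving 1024), here is a rough sketch. It is not the 20-append code: the class and method names are hypothetical, it assumes one 4-byte checksum per chunk (which is what the numbers in the comment imply), and it ignores any header in the checksum file.

```java
// Hypothetical illustration only; not taken from the Hadoop source tree.
final class ValidLength {
    // Assumption: each chunk is covered by one 4-byte checksum.
    static final int CHECKSUM_SIZE = 4;

    /**
     * Longest prefix of the block data that is covered by checksums.
     * checksumBytes is assumed to exclude any checksum-file header.
     */
    static long validLength(long dataBytes, long checksumBytes, int bytesPerChunk) {
        // Only whole checksums count; a partially written checksum is ignored.
        long chunksWithChecksum = checksumBytes / CHECKSUM_SIZE;
        long coveredBytes = chunksWithChecksum * bytesPerChunk;
        // The last chunk may be partial, so never report more than the data on disk.
        return Math.min(dataBytes, coveredBytes);
    }

    public static void main(String[] args) {
        // The case from the comment: 1500 data bytes, 8 checksum bytes, 512-byte chunks.
        System.out.println(validLength(1500, 8, 512)); // prints 1024
    }
}
```

Under these assumptions a mismatched block would be truncated to the returned length during recovery rather than being skipped.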