[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435657#comment-13435657 ]
Colin Patrick McCabe commented on HDFS-3731:
--------------------------------------------

I ran another experiment where I turned down the hard lease recovery time by a little bit, and then upgraded a cluster with a non-empty blocksBeingWritten directory from 1.x to 2.x using this patch. I confirmed that LeaseManager#Monitor tracked the files from 1.x that I had left unrecovered. They were eventually recovered after a period of time, as you can see from these logs:

{code}
2012-08-15 17:25:37,029 INFO hdfs.StateChange (BlockInfoUnderConstruction.java:initializeBlockRecovery(248)) - BLOCK* blk_4712799930147027732_1021{blockUCState=UNDER_RECOVERY, primaryNodeIndex=0, replicas=[ReplicaUnderConstruction[127.0.0.1:48054|RWR]]} recovery started, primary=127.0.0.1:48054
2012-08-15 17:25:37,029 WARN hdfs.StateChange (FSNamesystem.java:internalReleaseLease(3061)) - DIR* NameSystem.internalReleaseLease: File /top-dir-1Mb-512/directory1/file-with-no-crc has not been closed. Lease recovery is in progress. RecoveryId = 1043 for block blk_4712799930147027732_1021{blockUCState=UNDER_RECOVERY, primaryNodeIndex=0, replicas=[ReplicaUnderConstruction[127.0.0.1:48054|RWR]]}
2012-08-15 17:25:37,030 INFO namenode.LeaseManager (LeaseManager.java:checkLeases(444)) - Started block recovery for file /top-dir-1Mb-512/directory1/file-with-no-crc lease [Lease.  Holder: DFSClient_8256078, pendingcreates: 0]
2012-08-15 17:25:37,625 INFO datanode.DataNode (DataNode.java:logRecoverBlock(2016)) - NameNode at keter/127.0.0.1:50472 calls recoverBlock(block=BP-2033981131-127.0.0.1-1345076136282:blk_7162739548153522810_1020, targets=[127.0.0.1:48054], newGenerationStamp=1031)
2012-08-15 17:25:37,626 INFO impl.FsDatasetImpl (FsDatasetImpl.java:initReplicaRecovery(1416)) - initReplicaRecovery: block=blk_7162739548153522810_1020, recoveryId=1031, replica=ReplicaWaitingToBeRecovered, blk_7162739548153522810_1020, RWR
  getNumBytes()     = 1024
  getBytesOnDisk()  = 1024
  getVisibleLength()= -1
  getVolume()       = /home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current
  getBlockFile()    = /home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current/BP-2033981131-127.0.0.1-1345076136282/current/rbw/blk_7162739548153522810
  unlinked=false
2012-08-15 17:25:37,626 INFO impl.FsDatasetImpl (FsDatasetImpl.java:initReplicaRecovery(1471)) - initReplicaRecovery: changing replica state for blk_7162739548153522810_1020 from RWR to RUR
2012-08-15 17:25:37,627 INFO impl.FsDatasetImpl (FsDatasetImpl.java:updateReplicaUnderRecovery(1486)) - updateReplica: block=BP-2033981131-127.0.0.1-1345076136282:blk_7162739548153522810_1020, recoveryId=1031, length=1024, replica=ReplicaUnderRecovery, blk_7162739548153522810_1020, RUR
  getNumBytes()     = 1024
  getBytesOnDisk()  = 1024
  getVisibleLength()= -1
  getVolume()       = /home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current
  getBlockFile()    = /home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current/BP-2033981131-127.0.0.1-1345076136282/current/rbw/blk_7162739548153522810
  recoveryId=1031
  original=ReplicaWaitingToBeRecovered, blk_7162739548153522810_1020, RWR
  getNumBytes()     = 1024
  getBytesOnDisk()  = 1024
  getVisibleLength()= -1
  getVolume()       = /home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current
  getBlockFile()    = /home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/current/BP-2033981131-127.0.0.1-1345076136282/current/rbw/blk_7162739548153522810
2012-08-15 17:25:37,629 INFO namenode.FSNamesystem (FSNamesystem.java:commitBlockSynchronization(3140)) - commitBlockSynchronization(lastblock=BP-2033981131-127.0.0.1-1345076136282:blk_7162739548153522810_1020, newgenerationstamp=1031, newlength=1024, newtargets=[127.0.0.1:48054], closeFile=true, deleteBlock=false)
2012-08-15 17:25:37,631 INFO namenode.FSNamesystem (FSNamesystem.java:commitBlockSynchronization(3217)) - commitBlockSynchronization(newblock=BP-2033981131-127.0.0.1-1345076136282:blk_7162739548153522810_1020, file=/1kb-multiple-checksum-blocks-64-16, newgenerationstamp=1031, newlength=1024, newtargets=[127.0.0.1:48054]) successful
{code}

So as far as I can see:
1. after the upgrade, the fsimage contains leases which are loaded into memory in the upgraded NN
2. the leases are recovered after a given amount of time by moving the block files out of {{rbw/}} and into {{finalized/}}

So basically, the recovery mechanism for bbw blocks in 1.x and rbw blocks in 2.x is compatible, and the LeaseManager code also seems fairly similar. However, I welcome additional commentary on this, since I am not greatly familiar with this area of the code. Is there anything else we should verify here?

I also feel like a nice enhancement might be to immediately recover all leases on an upgrade from branch-1. It doesn't make sense to keep leases around unless you think a client can "come back" -- however, branch-1 clients cannot interact with branch-2 NameNodes. (Correct me if I'm wrong here?) However, I'm not sure it's worth investing the time and adding the code complexity to do that. This is already something of a corner case.
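For clarity, the on-disk side of step 2 can be sketched as below. This is a hypothetical simplification, not the actual FsDatasetImpl code: class and directory names here are illustrative, and the real DataNode also updates its replica map, metadata files, and generation stamps when it finalizes a recovered replica.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: finalize a recovered replica by moving its block
// file out of rbw/ and into finalized/. Only the filesystem step of
// recovery is shown here.
public class FinalizeSketch {
    static Path finalizeReplica(Path bpDir, String blockName) throws IOException {
        Path rbwFile = bpDir.resolve("rbw").resolve(blockName);
        Path finalizedDir = bpDir.resolve("finalized");
        Files.createDirectories(finalizedDir);
        // An atomic rename within the same volume keeps the replica
        // visible in exactly one state directory at any moment.
        return Files.move(rbwFile, finalizedDir.resolve(blockName),
                StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path bpDir = Files.createTempDirectory("bp-test");
        Files.createDirectories(bpDir.resolve("rbw"));
        Files.write(bpDir.resolve("rbw").resolve("blk_42"), new byte[]{1, 2, 3});
        Path dest = finalizeReplica(bpDir, "blk_42");
        System.out.println(Files.exists(dest));                          // true
        System.out.println(Files.exists(bpDir.resolve("rbw/blk_42")));   // false
    }
}
```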
We may also want to have upgrades in the future where old clients continue to function.

> 2.0 release upgrade must handle blocks being written from 1.0
> -------------------------------------------------------------
>
>                 Key: HDFS-3731
>                 URL: https://issues.apache.org/jira/browse/HDFS-3731
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Suresh Srinivas
>            Assignee: Colin Patrick McCabe
>            Priority: Blocker
>         Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch
>
>
> Release 2.0 upgrades must handle blocks being written to (bbw) files from the 1.0 release. Problem reported by Brahma Reddy.
>
> The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools -- or equivalently, everything was in the same block pool.) During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory.
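The hardlink-on-upgrade step described in the issue can be sketched as follows. This is a minimal illustration under assumed directory names (the block pool directory and file names are made up), not the real DataStorage upgrade code, which also manages previous/ snapshots and rollback:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical sketch of the upgrade path: hardlink every file from the
// 1.x blocksBeingWritten directory into the single block pool's rbw/
// directory. Hardlinks make the upgrade cheap to roll back: the original
// blocksBeingWritten entries stay valid until -finalize, which simply
// deletes that directory (both names point at the same inode).
public class UpgradeBbwSketch {
    static void linkBbwIntoRbw(Path bbwDir, Path rbwDir) throws IOException {
        Files.createDirectories(rbwDir);
        try (Stream<Path> files = Files.list(bbwDir)) {
            for (Path src : (Iterable<Path>) files::iterator) {
                Files.createLink(rbwDir.resolve(src.getFileName()), src);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("dn-upgrade");
        Path bbw = Files.createDirectories(root.resolve("blocksBeingWritten"));
        Files.write(bbw.resolve("blk_7_1020"), new byte[1024]);
        // "BP-1" stands in for the real block pool ID chosen at upgrade time.
        Path rbw = root.resolve("BP-1/current/rbw");
        linkBbwIntoRbw(bbw, rbw);
        System.out.println(Files.exists(rbw.resolve("blk_7_1020")));  // true
    }
}
```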