[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827810#comment-13827810 ]
Kihwal Lee edited comment on HDFS-5526 at 11/20/13 4:20 PM:
------------------------------------------------------------

I think we need to clarify the role of the VERSION files. Layout version and cTime checks are also performed at the block pool slice level, which is represented by {{BlockPoolSliceStorage}}. So, I think not all fields in the VERSION file written by {{DataStorage}} (the volume-level storage) are useful.

VERSION properties in {{DataStorage}} : <volume>/current/VERSION
- layoutVersion
- storageType
- namespaceID
- clusterID
- cTime
- storageID

VERSION properties in {{BlockPoolSliceStorage}} : <volume>/current/<blockpool>/current/VERSION
- layoutVersion
- namespaceID
- blockpoolID
- cTime

For {{DataStorage}}, the critical information to maintain during post-federation upgrade/rollback is
- namespaceID (not used post federation)
- storageID
- storageType - automatically set
- clusterID

{{cTime}} at the {{DataStorage}} level doesn't seem to make sense. It will be compared against the one in {{nsInfo}} from the name node for the first block pool that is initialized. If the initialization order changes, {{DataStorage}} may fail to initialize. I don't know whether it's by design or not, but as you (Vinay) said, {{DataStorage.upgrade}} will always run during DN startup after the first upgrade. This prevents the initialization failure during normal start-up and upgrade. Rollback is different since it doesn't go through this code path. Since an upgrade at the {{DataStorage}} level does not involve actual data, a {{cTime}} change within the same layout version has no meaning and shouldn't cause any changes.

I propose removing the cTime check for post-federation upgrades. Validation should be performed only against {{clusterID}} and {{layoutVersion}}. The upgrade action (post-federation, not 1.x to 2.x upgrades) should be taken only when the layout version has changed. For post-federation rollback, it should only update {{layoutVersion}}. There is no need to save the old layout version.
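
To make the proposal concrete, here is a rough sketch of the volume-level decision. It is not the actual {{DataStorage}} code and not the patch: the class {{VolumeVersionCheck}}, the {{VersionInfo}} holder, the {{Action}} enum and the {{decide}} method are all hypothetical names. It also assumes the target layout version is the one reported in {{nsInfo}}, and that a numerically smaller layout version is the newer one (as with -48 on trunk vs. -47 in v2.2.0).

```java
/** Hypothetical sketch of the proposed volume-level check; not the real DataStorage code. */
final class VolumeVersionCheck {

    /** What the volume-level storage would do at datanode startup. */
    enum Action { NONE, UPGRADE, ROLLBACK }

    /** Subset of the <volume>/current/VERSION fields relevant to the proposal. */
    static final class VersionInfo {
        final int layoutVersion;
        final String clusterID;
        final long cTime; // carried along, but deliberately never compared

        VersionInfo(int layoutVersion, String clusterID, long cTime) {
            this.layoutVersion = layoutVersion;
            this.clusterID = clusterID;
            this.cTime = cTime;
        }
    }

    /**
     * Validates only clusterID and layoutVersion; cTime is ignored.
     * Assumption: nsInfo.layoutVersion is the target layout version.
     */
    static Action decide(VersionInfo onDisk, VersionInfo nsInfo, boolean rollbackRequested) {
        if (!onDisk.clusterID.equals(nsInfo.clusterID)) {
            throw new IllegalStateException("Incompatible clusterID: on disk "
                + onDisk.clusterID + ", name node " + nsInfo.clusterID);
        }
        if (onDisk.layoutVersion == nsInfo.layoutVersion) {
            // Same layout version: a cTime difference alone triggers nothing.
            return Action.NONE;
        }
        if (rollbackRequested) {
            // Rollback only rewrites layoutVersion; no old version is saved.
            return Action.ROLLBACK;
        }
        if (onDisk.layoutVersion > nsInfo.layoutVersion) {
            // Layout versions are negative; a larger value is the older layout.
            return Action.UPGRADE;
        }
        throw new IllegalStateException("On-disk layout version "
            + onDisk.layoutVersion + " is newer than " + nsInfo.layoutVersion);
    }

    public static void main(String[] args) {
        VersionInfo v47 = new VersionInfo(-47, "CID-example", 0L);
        VersionInfo v48 = new VersionInfo(-48, "CID-example", 0L);
        VersionInfo v48LaterCTime = new VersionInfo(-48, "CID-example", 12345L);

        System.out.println(decide(v47, v48, false));           // UPGRADE
        System.out.println(decide(v48, v48LaterCTime, false)); // NONE
        System.out.println(decide(v48, v47, true));            // ROLLBACK
    }
}
```

The point of the sketch is that {{cTime}} never appears in a comparison, so the order in which block pools are initialized cannot affect the outcome at this level.
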

If the layout version is invalid (e.g. the datanode talked to a NN running the wrong version), the rollback at {{BlockPoolSliceStorage}} will fail. If the error is corrected and the datanode is restarted with "-rollback", it should not be stuck in an invalid state. This means the rollback at the {{DataStorage}} level should accept whatever the first name node is saying. Again, this is safe since correctness is checked at the {{BlockPoolSliceStorage}} level.

I will submit an updated patch soon.

> Datanode cannot roll back to previous layout version
> ----------------------------------------------------
>
>                 Key: HDFS-5526
>                 URL: https://issues.apache.org/jira/browse/HDFS-5526
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Kihwal Lee
>            Priority: Blocker
>         Attachments: HDFS-5526.patch
>
>
> Current trunk layout version is -48.
> Hadoop v2.2.0 layout version is -47.
> If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes
> cannot start with -rollback. It will fail with IncorrectVersionException.