[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142992#comment-14142992 ] sam liu commented on HDFS-5526: --- I also encountered similar error on the secondary namenode and opened another jira HDFS-7114 > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo Nicholas Sze >Assignee: Kihwal Lee >Priority: Blocker > Fix For: 0.23.10, 2.3.0 > > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943409#comment-13943409 ] Tsz Wo Nicholas Sze commented on HDFS-5526: --- Hi [~kihwal], this did not completely solve the problem. If a user upgrade from an old version without this patch, say hadoop-2.0.5-alpha, to trunk, then the datanode cannot rollback. The old version will compare layout version and then throw IncorrectVersionException. Could you take a look HDFS-6137? > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo Nicholas Sze >Assignee: Kihwal Lee >Priority: Blocker > Fix For: 3.0.0, 0.23.10, 2.3.0 > > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832574#comment-13832574 ] Hudson commented on HDFS-5526: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1620 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1620/]) HDFS-5526. Datanode cannot roll back to previous layout version. Contributed by Kihwal Lee. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1545322) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSRollback.java > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Fix For: 3.0.0, 2.3.0, 0.23.10 > > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832563#comment-13832563 ] Hudson commented on HDFS-5526: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1594 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1594/]) HDFS-5526. Datanode cannot roll back to previous layout version. Contributed by Kihwal Lee. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1545322) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSRollback.java > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Fix For: 3.0.0, 2.3.0, 0.23.10 > > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832482#comment-13832482 ] Hudson commented on HDFS-5526: -- FAILURE: Integrated in Hadoop-Yarn-trunk #403 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/403/]) HDFS-5526. Datanode cannot roll back to previous layout version. Contributed by Kihwal Lee. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1545322) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSRollback.java > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Fix For: 3.0.0, 2.3.0, 0.23.10 > > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832224#comment-13832224 ] Tsz Wo (Nicholas), SZE commented on HDFS-5526: -- Kihwal, I tested this with HDFS-2832 after the commit. It worked well. Thanks! > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Fix For: 3.0.0, 2.3.0, 0.23.10 > > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831570#comment-13831570 ] Hudson commented on HDFS-5526: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4789 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4789/]) HDFS-5526. Datanode cannot roll back to previous layout version. Contributed by Kihwal Lee. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1545322) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSRollback.java > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Fix For: 3.0.0, 2.3.0, 0.23.10 > > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831565#comment-13831565 ] Kihwal Lee commented on HDFS-5526: -- I made a typo in the checkin commen, so commit messages are not showing up here. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Fix For: 3.0.0, 2.3.0, 0.23.10 > > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831562#comment-13831562 ] Kihwal Lee commented on HDFS-5526: -- [~szetszwo], I've committed this to branch-2, branch-0.23 and trunk. Please pull into other branches as see fit and resolve this jira. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Fix For: 3.0.0, 2.3.0, 0.23.10 > > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828373#comment-13828373 ] Vinay commented on HDFS-5526: - +1 patch looks pretty good. Thanks Kihwal > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828341#comment-13828341 ] Tsz Wo (Nicholas), SZE commented on HDFS-5526: -- +1 the patch works as expected. Thanks a lot, Kihwal! > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828331#comment-13828331 ] Tsz Wo (Nicholas), SZE commented on HDFS-5526: -- Okay, let me test it. Will post the results. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828262#comment-13828262 ] Hadoop QA commented on HDFS-5526: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614985/HDFS-5526.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5508//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5508//console This message is automatically generated. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828239#comment-13828239 ] Kihwal Lee commented on HDFS-5526: -- bq. If a new field is added to the ./current/VERSION file in a future release (e.g. HDFS-2832 adds DatanodeUuid), then rollback will preserve it in the VERSION file. - When rolling back from post-HDFS-2832 to pre-HDFS-2832, the new field will disappear. This is because the old datanode code won't load or write the extra field. - When rolling back from post-HDFS-2832 to another post-HDFS-2832 version, the added field will be preserved, assuming {{setPropertiesFromFields()}} and {{setFieldsFromProperties()}} are properly modified to support the new field. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828224#comment-13828224 ] Tsz Wo (Nicholas), SZE commented on HDFS-5526: -- If I understand the patch correctly, it regenerates the ./current/VERSION file by first reading it from the storage and then overwrite layoutVersion. If a new field is added to the ./current/VERSION file in a future release (e.g. HDFS-2832 adds DatanodeUuid), then rollback will preserve it in the VERSION file. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch, HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827872#comment-13827872 ] Vinay commented on HDFS-5526: - Thanks for the detailed explanation Kihwal. It make sense that ctime has nothing to do with DataStorage for Post-federation versions. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827810#comment-13827810 ] Kihwal Lee commented on HDFS-5526: -- I think we need to clarify the role of the VERSION files. I think layout version and ctime checks are also in the block pool slice level, which is represented by {{BlockPoolSliceStorage}}. So, I think not all fields in the VERSION file written by {{DataStorage}} - the volume-level storage - is useful. VERSION properties in {{DataStorage}} : /current/VERSION - layoutVersion - storageType - namespaceID - clusterID - cTime - storageID VERSION properties in {{BlockPoolSliceStorage}} : /current//current/VERSION - layoutVersion - namespaceID - blockpoolID - cTime For {{DataStorage}}, the critical information to maintain during post-federation upgrade/rollback are - namespaceID (not used post federation) - storageID - storageType - automatically set - clusterID {{cTime}} at {{DataStorage}} level doesn't seem to make sense. It will be compared against the one in {{nsInfo}} from the name node for the first block pool that is initialized. If the initialization order changes, {{DataStorage}} may fail to initialize. I don't know whether it's by design or not, but as you (Vinay) said, {{DataStorage.upgrade}} will always run during DN startup after the first upgrade. This prevents the initialization failure during normal start-up and upgrade. Rollback is different since it doesn't go through this code path. Since upgrade in {{DataStorage}} level does not involve actual data, {{cTime}} change within the same layout version has no meaning and it shouldn't make any changes. I propose removing cTime check for post-federation upgrades. Validation should be only performed against {{clusterID}} and {{layoutVersion}}. Upgrade action (post-federation, not 1.x to 2.x upgrades) be taken only when the layout version is changed. For post-federation rollback, it should only update {layoutVersion}. There is no need to save the old layout version. The layout version is invalid (i.e. talked to a NN running wrong version), the rollback at {{BlockPoolSliceStorage}} will fail. If the error is corrected and the datanode is restarted with "-rollback", it should not be stuck in an invalid state. It means, the rollback at {{DataStorage}} level should accept whatever the first name node is saying. Again, this is safe since correctness is checked in {{BlockPoolSliceStorage}} level. I will submit an update patch soon. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827256#comment-13827256 ] Vinay commented on HDFS-5526: - Yes you are right nicholas. We dont have any clue whether we have upgraded or not. One thing is Version file will be overwritten if the layoutVersion is latest or ctime is latest., But that namenode also should be part of the same cluster otherwise upgrade will not happen. One more thing, ./current/VERSION will be overwritten everytime DN restarted after upgrade because of the following check. {code}// do upgrade if (this.layoutVersion > HdfsConstants.LAYOUT_VERSION || this.cTime < nsInfo.getCTime()) { doUpgrade(sd, nsInfo); // upgrade return; }{code} because, ./current/VERSION file ctime is always same. and upgraded NN will have higher ctime. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827242#comment-13827242 ] Tsz Wo (Nicholas), SZE commented on HDFS-5526: -- > But during upgrade only clusterId and layoutVersion are overwritten, ctime is > never modified. clusterId and layoutVersion are never going to change > dynamically. right? You are right but we also need to consider some error cases such as connecting a DN to a wrong cluster, moving the storage to another DN, rolling back to a wrong version, upgrading again without rollback, etc. We need to make sure all the error cases will fail. I think it is the hard part. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827227#comment-13827227 ] Vinay commented on HDFS-5526: - bq. For example, the ctime or some ids may have been changed in some unexpected way without being noticed Overwriting the version file in datanode current directory is only during format and upgrade. But during upgrade only clusterId and layoutVersion are overwritten, ctime is never modified. clusterId and layoutVersion are never going to change dynamically. right? {noformat}if (LayoutVersion.supports(Feature.FEDERATION, layoutVersion)) { clusterID = nsInfo.getClusterID(); layoutVersion = nsInfo.getLayoutVersion(); writeProperties(sd); return; }{noformat} Hi [~kihwal], Patch looks really simple. +1 do you think we need to update this comment now..? {code} * Do nothing, if previous directory does not exist.{code} > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827156#comment-13827156 ] Tsz Wo (Nicholas), SZE commented on HDFS-5526: -- I like the simplicity of the patch -- it only changes the rollback code but not upgrade. However, the VERSION file is overwritten but not restored during rollback. I worry if it is possible that the new VERSION file is different from the original VERSION file. For example, the ctime or some ids may have been changed in some unexpected way without being noticed. How can we make sure the new and the original VERSION files are the same? > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827072#comment-13827072 ] Hadoop QA commented on HDFS-5526: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614686/HDFS-5526.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5491//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5491//console This message is automatically generated. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827055#comment-13827055 ] Kihwal Lee commented on HDFS-5526: -- bq. Would storageID and cTime be preserved? I think so. The slight difficulty is at loading current/VERSION without blowing up. After reading in, it needs to override a couple of fields and call writeProperties(). bq. BTW, do you know why cTime=0 in my test case above? DataStorage's cTime is set to 0 when the node is formatted, but that of BlockPoolSliceStorage is supposed to be set to the one from nsInfo. So my guess is, when NNStorage is formatted cTime is 0. NNStorage.newNamespaceInfo() is setting it to 0 and this must be used for formatting. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826977#comment-13826977 ] Tsz Wo (Nicholas), SZE commented on HDFS-5526: -- Would storageID and cTime be preserved? BTW, do you know why cTime=0 in my test case above? > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Kihwal Lee >Priority: Blocker > Attachments: HDFS-5526.patch > > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826825#comment-13826825 ] Tsz Wo (Nicholas), SZE commented on HDFS-5526: -- Kihwal, you are right that there is no backup copy of ./current/VERSION and so it cannot be rolled back. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Priority: Blocker > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826821#comment-13826821 ] Kihwal Lee commented on HDFS-5526: -- What about adding the following in the beginning of {{DataStorage.doRollback()}}? This is similar to what is in {{DataStorage.doUpgrade()}}. Since the VERSION file is not read yet in rollback (if it does it blows up as you reported), {{this.layoutVersion}} will be 0. So instead, it is checking the software layout version and see whether it is using the same layout version as the name node. {code} if (LayoutVersion.supports(Feature.FEDERATION, HdfsConstants.LAYOUT_VERSION) && HdfsConstants.LAYOUT_VERSION == nsInfo.getLayoutVersion()) { clusterID = nsInfo.getClusterID(); layoutVersion = nsInfo.getLayoutVersion(); writeProperties(sd); return; } {code} > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Priority: Blocker > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826661#comment-13826661 ] Kihwal Lee commented on HDFS-5526: -- Well, it is being called for {{DataStorage}} when {{recoverTransitionRead()}} is called for the first block pool that is being initialized. {{DataStorage.doUpgrade()}} will simply write the new version and return. So "previous" won't be created. In {{DataStorage.doRollback()}}, the comment says, "Do nothing, if previous directory does not exist" and that's what it does. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Priority: Blocker > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826631#comment-13826631 ] Kihwal Lee commented on HDFS-5526: -- It looks like {{recoverTransitionRead()}} is done for individual block pool storage from {{DataNode.initStorage()}}, but never done for {{DataStorage}} itself. That also explains why it's lacking "previous" after upgrade. > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Priority: Blocker > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5526) Datanode cannot roll back to previous layout version
[ https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826113#comment-13826113 ] Tsz Wo (Nicholas), SZE commented on HDFS-5526: -- The problem is that ./current/VERSION is not rolled back. Below is a test case: # format using v2.2.0 #* ./current/VERSION {noformat} #Mon Nov 18 18:26:32 PST 2013 storageID=DS-955224766-xx.xx.xx.xx-50010-1384827992929 clusterID=CID-7692ad34-b0d1-4e81-a08e-c714010a8355 cTime=0 storageType=DATA_NODE layoutVersion=-47 {noformat} #* ./previous {noformat} {noformat} #* ./current/BP-1970701609-xx.xx.xx.xx-1384827972355/current/VERSION {noformat} #Mon Nov 18 18:26:33 PST 2013 namespaceID=737643680 cTime=0 blockpoolID=BP-1970701609-xx.xx.xx.xx-1384827972355 layoutVersion=-47 {noformat} #* ./current/BP-1970701609-xx.xx.xx.xx-1384827972355/previous {noformat} {noformat} # upgrade to trunk with layout version -48 #* ./current/VERSION {noformat} #Mon Nov 18 18:34:30 PST 2013 storageID=DS-955224766-xx.xx.xx.xx-50010-1384827992929 clusterID=CID-7692ad34-b0d1-4e81-a08e-c714010a8355 cTime=0 storageType=DATA_NODE layoutVersion=-48 {noformat} #* ./previous {noformat} {noformat} #* ./current/BP-1970701609-xx.xx.xx.xx-1384827972355/current/VERSION {noformat} #Mon Nov 18 18:34:30 PST 2013 namespaceID=737643680 cTime=1384828466206 blockpoolID=BP-1970701609-xx.xx.xx.xx-1384827972355 layoutVersion=-48 {noformat} #* ./current/BP-1970701609-xx.xx.xx.xx-1384827972355/previous/VERSION {noformat} #Mon Nov 18 18:26:33 PST 2013 namespaceID=737643680 cTime=0 blockpoolID=BP-1970701609-xx.xx.xx.xx-1384827972355 layoutVersion=-47 {noformat} # rollback to -47 will fail with IncorrectVersionException {code} 2013-11-18 18:41:56,420 FATAL datanode.DataNode (BPServiceActor.java:run(706)) - Initialization failed for block pool Block pool BP-1970701609-xx.xx.xx.xx-1384827972355 (storage id ) service to localhost/xx.xx.xx.xx:8020 org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected version of storage directory /Users/szetszwo/hadoop/_local_cluster/data1. Reported: -48. Expecting = -47. {code} > Datanode cannot roll back to previous layout version > > > Key: HDFS-5526 > URL: https://issues.apache.org/jira/browse/HDFS-5526 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Tsz Wo (Nicholas), SZE >Priority: Blocker > > Current trunk layout version is -48. > Hadoop v2.2.0 layout version is -47. > If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes > cannot start with -rollback. It will fail with IncorrectVersionException. -- This message was sent by Atlassian JIRA (v6.1#6144)