[ https://issues.apache.org/jira/browse/HDFS-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703024#comment-13703024 ]
Fengdong Yu commented on HDFS-4967:
-----------------------------------

Hi Arpit,

I already have an HA cluster (LayoutVersion=-43), and I did not want to upgrade it, because upgrading under HA is complicated. AFAIK, upgrading an HA cluster involves:

1) stop-dfs
2) replace core-site.xml and hdfs-site.xml with non-HA versions
3) merge the edit logs from the standby NN into the active NN
4) start-dfs -upgrade
5) stop-dfs
6) revert core-site.xml and hdfs-site.xml to the HA versions
7) initializeSharedEdits
8) start-dfs

So I hacked the trunk code as follows:

{code}
-    OPTIMIZE_SNAPSHOT_INODES(-45, -43,
+    OPTIMIZE_SNAPSHOT_INODES(-43, -43,
         "Reduce snapshot inode memory footprint", false),
-    SEQUENTIAL_BLOCK_ID(-46, "Allocate block IDs sequentially and store " +
-        "block IDs in the edits log and image files");
+    SEQUENTIAL_BLOCK_ID(-43, -43, "Allocate block IDs sequentially and store " +
+        "block IDs in the edits log and image files", false);
{code}

After that change, I saw the errors reported in this issue.

I have now reverted LayoutVersion.java and upgraded the cluster using the eight steps listed above. It works, with no exceptions. So we can close this issue as "Not a Problem".

But I would like to know: are these eight steps the right way to upgrade from one HA version to another HA version, e.g. from LayoutVersion -43 to -46?

> Generate block ID sequentially cannot work with QJM HA
> ------------------------------------------------------
>
>                 Key: HDFS-4967
>                 URL: https://issues.apache.org/jira/browse/HDFS-4967
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, hdfs-client, namenode
>    Affects Versions: 3.0.0
>            Reporter: Fengdong Yu
>            Assignee: Arpit Agarwal
>
> There are two name nodes: one is active, the other acts as the standby name node.
> QJM HA is configured.
> After HDFS-4645 was committed to trunk, the following error showed up during name node start:
> {code}
> 2013-07-09 11:28:45,394 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
> java.lang.IllegalStateException: Cannot skip to less than the current value (=1073741824), where newValue=0
>         at org.apache.hadoop.util.SequentialNumber.skipTo(SequentialNumber.java:58)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setLastAllocatedBlockId(FSNamesystem.java:5124)
>         at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Loader.load(FSImageFormat.java:278)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:809)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:798)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:653)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:623)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:260)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:719)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:552)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:401)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:435)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:607)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:592)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1172)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1238)
> 2013-07-09 11:28:45,397 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
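
The IllegalStateException in the quoted stack trace comes from a counter that refuses to move backwards. The sketch below reproduces that behavior in isolation. It is a simplified stand-in written for illustration, not the Hadoop source: the names SkipToDemo, MonotonicCounter, and restore are hypothetical, and the only values taken from the issue are the current value 1073741824 and newValue=0 from the log message. Why the loader ended up passing 0 on the hacked build is not shown here; presumably the modified LayoutVersion made it read a block ID field that the older image does not contain, but that is an assumption.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; not part of Hadoop.
class SkipToDemo {

    /** Simplified stand-in for a counter that may only move forward. */
    static final class MonotonicCounter {
        private final AtomicLong current;

        MonotonicCounter(long initial) {
            this.current = new AtomicLong(initial);
        }

        long get() {
            return current.get();
        }

        /** Advance to newValue; reject any attempt to go backwards. */
        void skipTo(long newValue) {
            for (;;) {
                long c = current.get();
                if (newValue < c) {
                    throw new IllegalStateException(
                        "Cannot skip to less than the current value (=" + c
                        + "), where newValue=" + newValue);
                }
                // Retry if another thread changed the value in between.
                if (current.compareAndSet(c, newValue)) {
                    return;
                }
            }
        }
    }

    /** Try to restore a counter from a value read from an image (assumed input). */
    static String restore(long initial, long valueFromImage) {
        MonotonicCounter counter = new MonotonicCounter(initial);
        try {
            counter.skipTo(valueFromImage);
            return "ok: " + counter.get();
        } catch (IllegalStateException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        long initial = 1073741824L; // current value reported in the log
        System.out.println(restore(initial, 0L));            // backwards: rejected
        System.out.println(restore(initial, initial + 100)); // forwards: accepted
    }
}
```

Under these assumptions, restoring a value of 0 fails with the same message as in the log, while any value at or above the current one is accepted.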