[ https://issues.apache.org/jira/browse/HDFS-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826619#comment-15826619 ]
Arpit Agarwal commented on HDFS-11209:
--------------------------------------

+1 for the v04 patch, assuming the test failures are unrelated.

One minor point: the isRollingUpgrade RPC need not check for superuser privilege since it's harmless. This doesn't affect correctness, though, since the SNN would be running as the hdfs superuser.

> SNN can't checkpoint when rolling upgrade is not finalized
> ----------------------------------------------------------
>
>                 Key: HDFS-11209
>                 URL: https://issues.apache.org/jira/browse/HDFS-11209
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: rolling upgrades
>    Affects Versions: 2.8.0, 3.0.0-alpha1
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>            Priority: Critical
>         Attachments: HDFS-11209.00.patch, HDFS-11209.01.patch,
> HDFS-11209.02.patch, HDFS-11209.03.patch, HDFS-11209.04.patch
>
>
> A similar problem was fixed in HDFS-7185. A recent change in HDFS-8432
> brings it back.
> With HDFS-8432, the primary NN does not update the VERSION file to the new
> version after running with the "rollingUpgrade" option until the upgrade is
> finalized. This is to support more downgrade use cases.
> However, the checkpoint on the SNN incorrectly updates the VERSION file
> while the rolling upgrade is not yet finalized on the primary NN. As a result,
> the SNN checkpoints successfully but fails to push the checkpoint to the
> primary NN because its version is higher than the primary NN's, as shown below.
> {code}
> 2016-12-02 05:25:31,918 ERROR namenode.SecondaryNameNode (SecondaryNameNode.java:doWork(399)) - Exception in doCheckpoint
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException: Image uploading failed, status: 403, url: http://NN:50070/imagetransfer?txid=345404754&imageFile=IMAGE&File-Le..., message: This namenode has storage info -60:221856466:1444080250181:clusterX but the secondary expected -63:221856466:1444080250181:clusterX
> {code}
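To make the mismatch in the log concrete, here is a minimal Java sketch. It is not the v04 patch and uses no Hadoop classes: `StorageInfoSketch`, `parse`, `matches`, and `layoutVersionToPersist` are hypothetical names introduced for illustration. It assumes the colon-separated storage info has the form layoutVersion:namespaceID:cTime:clusterID, and that the SNN learns whether a rolling upgrade is still in progress (for example via the isRollingUpgrade RPC mentioned above) and, if so, keeps the primary NN's layout version rather than its own.

```java
// Illustrative sketch only; none of these names are Hadoop APIs.
final class StorageInfoSketch {
    final int layoutVersion;
    final long namespaceId;
    final long cTime;
    final String clusterId;

    StorageInfoSketch(int layoutVersion, long namespaceId, long cTime, String clusterId) {
        this.layoutVersion = layoutVersion;
        this.namespaceId = namespaceId;
        this.cTime = cTime;
        this.clusterId = clusterId;
    }

    // Parses a string shaped like the ones in the log message,
    // assumed to be layoutVersion:namespaceID:cTime:clusterID.
    static StorageInfoSketch parse(String s) {
        String[] p = s.split(":", 4);
        if (p.length != 4) {
            throw new IllegalArgumentException("expected 4 colon-separated fields: " + s);
        }
        return new StorageInfoSketch(
            Integer.parseInt(p[0]), Long.parseLong(p[1]), Long.parseLong(p[2]), p[3]);
    }

    // True when every field agrees; the upload in the log was rejected
    // because the layout versions (-60 vs -63) differ.
    boolean matches(StorageInfoSketch other) {
        return layoutVersion == other.layoutVersion
            && namespaceId == other.namespaceId
            && cTime == other.cTime
            && clusterId.equals(other.clusterId);
    }

    // Hypothetical guard: while the rolling upgrade is not finalized,
    // record the primary NN's layout version instead of the SNN software's.
    static int layoutVersionToPersist(
            int softwareLayoutVersion, int nnLayoutVersion, boolean rollingUpgradeInProgress) {
        return rollingUpgradeInProgress ? nnLayoutVersion : softwareLayoutVersion;
    }

    public static void main(String[] args) {
        StorageInfoSketch nn = parse("-60:221856466:1444080250181:clusterX");
        StorageInfoSketch snn = parse("-63:221856466:1444080250181:clusterX");
        System.out.println("matches without guard: " + nn.matches(snn));

        int guarded = layoutVersionToPersist(snn.layoutVersion, nn.layoutVersion, true);
        StorageInfoSketch snnGuarded =
            new StorageInfoSketch(guarded, snn.namespaceId, snn.cTime, snn.clusterId);
        System.out.println("matches with guard: " + nn.matches(snnGuarded));
    }
}
```

With the two strings from the log, only the first field differs, which is consistent with the description: the SNN wrote the newer layout version to its VERSION file while the primary NN still reported the older one.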