[ https://issues.apache.org/jira/browse/HDFS-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826922#comment-15826922 ]
Xiaoyu Yao commented on HDFS-11209:
-----------------------------------

Thanks [~arpitagarwal] for the review. It is a good point that we may allow isRollingUpgrade() to be called by non-superusers, even though it is currently used only by the SNN, which runs as the hdfs superuser. The same applies to the similar API NameNodeRpcServer#isUpgradeFinalized(). I will open a separate ticket to discuss whether we should remove the superuser privilege check from both.

> SNN can't checkpoint when rolling upgrade is not finalized
> ----------------------------------------------------------
>
>                 Key: HDFS-11209
>                 URL: https://issues.apache.org/jira/browse/HDFS-11209
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: rolling upgrades
>    Affects Versions: 2.8.0, 3.0.0-alpha1
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>            Priority: Critical
>         Attachments: HDFS-11209.00.patch, HDFS-11209.01.patch,
> HDFS-11209.02.patch, HDFS-11209.03.patch, HDFS-11209.04.patch
>
>
> A similar problem was fixed in HDFS-7185. A recent change in HDFS-8432
> brings it back.
> With HDFS-8432, the primary NN will not update the VERSION file to the new
> version after running with the "rollingUpgrade" option until the upgrade is
> finalized. This is to support more downgrade use cases.
> However, the checkpoint on the SNN incorrectly updates the VERSION file
> when the rolling upgrade is not yet finalized on the primary NN. As a result,
> the SNN checkpoints successfully but fails to push the image to the primary NN
> because its version is higher than the primary NN's, as shown below.
> {code}
> 2016-12-02 05:25:31,918 ERROR namenode.SecondaryNameNode (SecondaryNameNode.java:doWork(399)) - Exception in doCheckpoint
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException: Image uploading failed, status: 403, url: http://NN:50070/imagetransfer?txid=345404754&imageFile=IMAGE&File-Le..., message: This namenode has storage info -60:221856466:1444080250181:clusterX but the secondary expected -63:221856466:1444080250181:clusterX
> {code}
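
To make the mismatch in the error message easier to read, here is a minimal sketch that splits the two storage info strings from the log and reports which field differs. It is an illustration only, not code from the patch. It assumes the colon-separated string is ordered layoutVersion:namespaceID:cTime:clusterID; the names StorageInfo, parse_storage_info and mismatched_fields are hypothetical helpers introduced for this example.

```python
from typing import List, NamedTuple


class StorageInfo(NamedTuple):
    """Hypothetical holder for the four colon-separated fields (assumed order)."""
    layout_version: int
    namespace_id: int
    c_time: int
    cluster_id: str


def parse_storage_info(text: str) -> StorageInfo:
    """Parse a string such as '-60:221856466:1444080250181:clusterX'."""
    # Split at most three times so a cluster ID containing ':' stays intact.
    layout, namespace, c_time, cluster = text.split(":", 3)
    return StorageInfo(int(layout), int(namespace), int(c_time), cluster)


def mismatched_fields(nn: StorageInfo, snn: StorageInfo) -> List[str]:
    """Return the names of the fields on which the two storage infos disagree."""
    return [name for name in StorageInfo._fields
            if getattr(nn, name) != getattr(snn, name)]


# Values copied from the log message in the issue description.
primary = parse_storage_info("-60:221856466:1444080250181:clusterX")
secondary = parse_storage_info("-63:221856466:1444080250181:clusterX")

# Only the layout version differs. HDFS layout versions are negative and
# decrease as they get newer, so -63 on the SNN is ahead of -60 on the NN.
print(mismatched_fields(primary, secondary))
```

With the values from the log, only the layout version differs (-60 on the primary NN, -63 on the SNN), which matches the description: the SNN has moved its VERSION file to the new layout while the primary NN has not.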