[ 
https://issues.apache.org/jira/browse/HDFS-8127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8127:
----------------------------
    Attachment: HDFS-8127.000.patch

Uploaded a patch to fix the issue. Instead of adding an -upgrade option, the 
patch lets the SBN learn directly, through an RPC call, whether the ANN is in 
the upgrade state. If the ANN is in the upgrade state, the SBN tries to save 
its original state into the previous directory. If its original state is 
corrupted and cannot be recovered, we prompt the user to format the SBN first. 
Because we still use bootstrapStandby for HA rollback, it should be OK to have 
an old state generated by the new software.
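
For illustration only, the decision flow described above can be modelled 
roughly as follows. This is a simplified Python sketch, not code from the 
patch: the function name, its two boolean parameters, and the outcome names are 
all hypothetical, and the behaviour when the ANN is not in the upgrade state is 
an assumption (the description above only covers the upgrade case).

```python
from enum import Enum


class Outcome(Enum):
    # Hypothetical outcome labels, invented for this sketch.
    PLAIN_BOOTSTRAP = "bootstrap without touching the previous directory"
    SAVED_TO_PREVIOUS = "original state saved into the previous directory"
    FORMAT_REQUIRED = "user is prompted to format the SBN first"


def bootstrap_standby(ann_in_upgrade: bool, local_state_recoverable: bool) -> Outcome:
    """Model the SBN's choice during bootstrap (illustrative, not Hadoop code).

    ann_in_upgrade: the answer the SBN would get from its RPC call to the ANN.
    local_state_recoverable: whether the SBN's original state can be recovered.
    """
    if not ann_in_upgrade:
        # Assumption: with no upgrade in progress, nothing has to be preserved.
        return Outcome.PLAIN_BOOTSTRAP
    if local_state_recoverable:
        # Upgrade in progress: keep the original state so rollback stays possible.
        return Outcome.SAVED_TO_PREVIOUS
    # Upgrade in progress but the original state is corrupted beyond recovery.
    return Outcome.FORMAT_REQUIRED
```

The point of the sketch is that the upgrade state comes from the ANN rather 
than from a command-line option, so the SBN cannot be bootstrapped without 
noticing an ongoing upgrade.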

> NameNode Failover during HA upgrade can cause DataNode to finalize upgrade
> --------------------------------------------------------------------------
>
>                 Key: HDFS-8127
>                 URL: https://issues.apache.org/jira/browse/HDFS-8127
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.4.0
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>            Priority: Blocker
>         Attachments: HDFS-8127.000.patch
>
>
> Currently, for HA upgrade (enabled by HDFS-5138), we use {{-bootstrapStandby}} 
> to initialize the standby NameNode. The standby NameNode does not have the 
> {{previous}} directory, so it does not know that the cluster is in the 
> upgrade state. If an NN failover happens, the new ANN will, in response to 
> block reports, tell the DNs to finalize the upgrade, making it impossible to 
> roll back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
