[ https://issues.apache.org/jira/browse/HDFS-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16777427#comment-16777427 ]
Xiao Liang commented on HDFS-14201:
-----------------------------------

Thanks [~hexiaoqiao] for pointing out the manual transition; it's a valid point.

I think combining the logic in [^HDFS-14201.002.patch] and [^HDFS-14201.003.patch] could be an option, so that when the switch for this feature is on:
# in auto-failover mode, ZKFC chooses a ready-to-serve NameNode to become active, because NameNodes in safemode report UNHEALTHY;
# in manual mode, a NameNode in safemode cannot transition to active.

The same configuration item would turn both behaviors on or off.

What do you think, [~hexiaoqiao]? I will upload a new patch as proposed if you think it's a reasonable option.

> Ability to disallow safemode NN to become active
> ------------------------------------------------
>
>                 Key: HDFS-14201
>                 URL: https://issues.apache.org/jira/browse/HDFS-14201
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: auto-failover
>    Affects Versions: 3.1.1, 2.9.2
>            Reporter: Xiao Liang
>            Assignee: Xiao Liang
>            Priority: Major
>         Attachments: HDFS-14201.001.patch, HDFS-14201.002.patch,
> HDFS-14201.003.patch
>
>
> Currently with HA, a Namenode in safemode can be selected as active. For
> availability of both reads and writes, though, Namenodes not in safemode are
> better choices to become active.
> It can take tens of minutes for a cold-started Namenode to get out of
> safemode, especially when there are a large number of files and blocks in
> HDFS. That means if a Namenode in safemode becomes active, the cluster will
> not be fully functioning for quite a while, even when another Namenode that
> is not in safemode could have served.
> The proposal here is to add an option that allows a Namenode to report
> itself as UNHEALTHY to ZKFC if it is in safemode, so that only a fully
> functioning Namenode can become active, improving the general availability
> of the cluster.
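To make the combined behavior from the comment concrete, here is a minimal sketch in Java. It is not taken from any of the attached patches: the class name SafemodeActiveGate, its methods, and the boolean switch are all hypothetical and only illustrate one configuration item gating both the health report used in auto-failover mode and the manual transition check.

```java
// Hypothetical sketch: none of these names come from Hadoop or the patches.
final class SafemodeActiveGate {
    enum Health { HEALTHY, UNHEALTHY }

    // The single configuration switch that turns both behaviors on or off.
    private final boolean featureEnabled;

    SafemodeActiveGate(boolean featureEnabled) {
        this.featureEnabled = featureEnabled;
    }

    // Auto-failover mode: the health a NameNode would report to ZKFC.
    // With the switch on, a NameNode in safemode reports UNHEALTHY, so
    // ZKFC would pick a ready-to-serve NameNode instead.
    Health reportHealth(boolean inSafemode) {
        return (featureEnabled && inSafemode) ? Health.UNHEALTHY : Health.HEALTHY;
    }

    // Manual mode: reject a transition to active while in safemode,
    // but only when the switch is on.
    void checkTransitionToActive(boolean inSafemode) {
        if (featureEnabled && inSafemode) {
            throw new IllegalStateException(
                "NameNode is in safemode; transition to active is disallowed");
        }
    }
}
```

With the switch off, both methods keep the existing behavior: the NameNode reports HEALTHY regardless of safemode and the manual transition is not blocked.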