[ https://issues.apache.org/jira/browse/HDFS-15738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Janus Chow updated HDFS-15738:
------------------------------
    Attachment: HDFS-15738.001.patch
      Assignee: Janus Chow
        Status: Patch Available  (was: Open)

> Forbid the transition to Observer state when NameNode is in StartupSafeMode
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-15738
>                 URL: https://issues.apache.org/jira/browse/HDFS-15738
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Janus Chow
>            Assignee: Janus Chow
>            Priority: Major
>         Attachments: HDFS-15738.001.patch
>
>
> Currently, when a _getBlockLocation_ request reaches an Observer NameNode
> that is in safe mode, the NameNode checks whether the result is empty and,
> if so, replies to the client with a _RetriableException_, telling the client
> to retry the request later.
> If the Observer NameNode is in startup safe mode, the client has to wait
> until the Observer NameNode leaves safe mode. On a big cluster this can mean
> a long wait for the client. We hit this problem in our cluster, where the
> client had to wait about 30 minutes before service returned to normal.
> The cause is that the NameNode transitions to the Observer state while it is
> still in safe mode, receiving block reports from the DataNodes. There are
> two possible solutions for this issue:
> # Throw _ObserverRetryOnActiveException_ when the Observer NameNode is in
> startup safe mode, redirecting the user's requests to the active NN.
> # Reject the transition to Observer state when the cluster maintainer
> attempts it while the NameNode is in startup safe mode.
> We choose the second solution because the first one would encourage the bad
> practice of transitioning an NN to Observer before it is ready to serve
> requests.
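
The second solution can be illustrated with a minimal sketch. This is not the attached patch and does not use the real Hadoop classes: SketchNameNode, HaState, and every method name below are hypothetical stand-ins, chosen only to show the idea of rejecting the transition while startup safe mode is still on.

```java
import java.io.IOException;

// Hypothetical sketch only: none of these types are real Hadoop classes.
final class ObserverTransitionSketch {

    enum HaState { STANDBY, OBSERVER }

    /** Simplified stand-in for a NameNode that starts in Standby state. */
    static final class SketchNameNode {
        private HaState state = HaState.STANDBY;
        private boolean inStartupSafeMode;

        SketchNameNode(boolean inStartupSafeMode) {
            this.inStartupSafeMode = inStartupSafeMode;
        }

        /** Models the NameNode leaving startup safe mode once block reports are in. */
        void leaveStartupSafeMode() {
            inStartupSafeMode = false;
        }

        /** Rejects the transition while the node is still in startup safe mode. */
        void transitionToObserver() throws IOException {
            if (inStartupSafeMode) {
                throw new IOException(
                    "Cannot transition to Observer: NameNode is in startup safe mode");
            }
            state = HaState.OBSERVER;
        }

        HaState getState() {
            return state;
        }
    }

    public static void main(String[] args) throws IOException {
        SketchNameNode nn = new SketchNameNode(true);
        try {
            nn.transitionToObserver();
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
        nn.leaveStartupSafeMode();
        nn.transitionToObserver();
        System.out.println("State after leaving safe mode: " + nn.getState());
    }
}
```

With a guard of this kind the maintainer gets an immediate error instead of an Observer that answers clients with retriable errors until safe mode ends; the first solution would instead leave the transition allowed and redirect each request to the active NN.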