[ https://issues.apache.org/jira/browse/HDFS-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955301#comment-16955301 ]
Fengnan Li commented on HDFS-14914:
-----------------------------------

[~chaosun] [~xkrogen] [~inigoiri] Please take a look and let me know your thoughts.

> Observer should throw StandbyException in Safemode
> --------------------------------------------------
>
>                 Key: HDFS-14914
>                 URL: https://issues.apache.org/jira/browse/HDFS-14914
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Fengnan Li
>            Assignee: Fengnan Li
>            Priority: Minor
>         Attachments: HDFS-14914-001.patch
>
>
> When an Observer is in safemode, calling getBlockLocations makes it throw a
> RetriableException, as introduced in
> [HDFS-13898|https://issues.apache.org/jira/browse/HDFS-13898]. However,
> safemode can last a very long time during startup, so retrying against the
> same Observer does not help much.
> It gets worse when Routers talk to Observers. Because the Router
> distinguishes StandbyException from RetriableException, it retries the same
> Observer (3 times by default) and then returns a RetriableException to the
> client. The client then retries against the same Router, and therefore the
> same Observer, 10 times by default, resulting in 3 * 10 = 30 retries per call.
> The proposed change is to throw StandbyException instead, so that the Router
> fails over and can immediately try another Observer or the Active namenode
> (depending on the design). The current ObserverReadProxyProvider is not
> affected, since it fails over on both RetriableException and
> StandbyException.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
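
The retry arithmetic in the issue description can be sketched as a small counting model. This is only an illustration, not Router or client code: the class, enum, and method names are hypothetical, the two limits are the defaults quoted in the description (3 Router attempts, 10 client attempts), and it assumes that a failover on StandbyException reaches a namenode that can serve the call.

```java
// Hypothetical counting model; none of these names come from Hadoop.
class SafemodeRetrySketch {
    // Which exception the Observer throws while it is in safemode.
    enum Failure { RETRIABLE, STANDBY }

    // Defaults quoted in the issue description.
    static final int ROUTER_ATTEMPTS = 3;
    static final int CLIENT_ATTEMPTS = 10;

    // Counts how many times one client call hits the safemode Observer.
    static int observerCallsPerClientCall(Failure thrownInSafemode) {
        int observerCalls = 0;
        for (int c = 0; c < CLIENT_ATTEMPTS; c++) {
            for (int r = 0; r < ROUTER_ATTEMPTS; r++) {
                observerCalls++;
                if (thrownInSafemode == Failure.STANDBY) {
                    // Assumption: the Router fails over to another namenode,
                    // which serves the call, so the Observer is not hit again.
                    return observerCalls;
                }
                // RETRIABLE: the Router retries the same Observer.
            }
            // The Router gives up and returns RetriableException to the
            // client, which retries through the same Router.
        }
        return observerCalls;
    }

    public static void main(String[] args) {
        System.out.println(observerCallsPerClientCall(Failure.RETRIABLE)); // prints 30
        System.out.println(observerCallsPerClientCall(Failure.STANDBY));   // prints 1
    }
}
```

Under these assumptions, switching the exception type reduces the load on the safemode Observer from 30 calls to 1 per client call, which is the motivation given for the patch.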