[ https://issues.apache.org/jira/browse/HDFS-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fengnan Li updated HDFS-14914:
------------------------------
    Description: 
When an Observer is in safemode, calling getBlockLocations makes it throw 
RetriableException, as in HDFS-13898. However, during startup, safemode can 
last a long time, so retrying does not help much.

The change is to make the Observer throw StandbyException instead, so that the 
Router can fail over immediately and try another Observer or the Active 
NameNode (depending on the design). The current ObserverReadProxyProvider is 
not affected, since both RetriableException and StandbyException make it fail 
over.

  was:
When an Observer is in safemode, calling getBlockLocations makes it throw 
RetriableException, as in HDFS-13898. However, during startup, safemode can 
last a long time, so retrying does not help much.

What makes it worse is that when Routers talk to Observers, the Router 
distinguishes StandbyException from RetriableException, so it keeps retrying 
(3 times by default) and then returns a RetriableException to the client.

The change is to make the Observer throw StandbyException instead, so that the 
Router can fail over immediately and try another Observer or the Active 
NameNode (depending on the design). The current ObserverReadProxyProvider is 
not affected, since both RetriableException and StandbyException make it fail 
over.


> Observer should throw StandbyException in Safemode
> --------------------------------------------------
>
>                 Key: HDFS-14914
>                 URL: https://issues.apache.org/jira/browse/HDFS-14914
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Fengnan Li
>            Assignee: Fengnan Li
>            Priority: Minor
>         Attachments: HDFS-14914-001.patch
>
>
> When an Observer is in safemode, calling getBlockLocations makes it throw 
> RetriableException, as in HDFS-13898. However, during startup, safemode can 
> last a long time, so retrying does not help much.
> The change is to make the Observer throw StandbyException instead, so that 
> the Router can fail over immediately and try another Observer or the Active 
> NameNode (depending on the design). The current ObserverReadProxyProvider is 
> not affected, since both RetriableException and StandbyException make it 
> fail over.
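
To illustrate the difference between the two exceptions from a caller's point of view, here is a minimal toy sketch. It is not Hadoop or Router code: the class SafemodeFailoverSketch, the Node type, the read method, and the two exception classes are hypothetical stand-ins (unchecked for brevity, whereas the real Hadoop exceptions are IOExceptions). It assumes a caller that retries the same node on a retriable error, with maxRetries counted after the first attempt, and moves to the next node on a standby error.

```java
import java.util.List;

public class SafemodeFailoverSketch {
    // Hypothetical stand-ins for RetriableException and StandbyException.
    public static class RetriableSketchException extends RuntimeException {}
    public static class StandbySketchException extends RuntimeException {}

    /** Toy namenode: serves reads unless it is still in safemode. */
    public static class Node {
        final String name;
        final boolean inSafemode;
        final boolean throwsStandby; // true models the proposed behavior
        public int calls = 0;

        public Node(String name, boolean inSafemode, boolean throwsStandby) {
            this.name = name;
            this.inSafemode = inSafemode;
            this.throwsStandby = throwsStandby;
        }

        String getBlockLocations(String path) {
            calls++;
            if (inSafemode) {
                if (throwsStandby) {
                    throw new StandbySketchException();
                }
                throw new RetriableSketchException();
            }
            return name + ":" + path;
        }
    }

    /** Toy caller: retry the same node on retriable, fail over on standby. */
    public static String read(List<Node> nodes, String path, int maxRetries) {
        for (Node node : nodes) {
            int retries = 0;
            while (true) {
                try {
                    return node.getBlockLocations(path);
                } catch (StandbySketchException e) {
                    break; // fail over: try the next node immediately
                } catch (RetriableSketchException e) {
                    if (retries >= maxRetries) {
                        throw e; // give up and surface the error to the client
                    }
                    retries++; // retry the same node
                }
            }
        }
        throw new IllegalStateException("no namenode could serve " + path);
    }

    public static void main(String[] args) {
        Node observer = new Node("observer", true, true);
        Node active = new Node("active", false, false);
        System.out.println(read(List.of(observer, active), "/tmp/f", 3));
    }
}
```

In this toy model, an observer that throws the standby exception is contacted once before the caller moves on, while one that throws the retriable exception is contacted repeatedly and the read fails even though another node could have served it.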


