[jira] [Updated] (HDFS-14914) Observer should throw StandbyException in Safemode
[ https://issues.apache.org/jira/browse/HDFS-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fengnan Li updated HDFS-14914:
------------------------------
    Description:
When an Observer is in safemode, calling getBlockLocations makes it throw a RetriableException, as in HDFS-13898. However, during startup safemode can take a very long time, and retrying does not help much in that case.
The change is to make the exception one that triggers failover, so that the client side can fail over. Currently, ObserverReadProxyProvider always fails over regardless of whether it sees a RetriableException or a StandbyException, so it is not affected.

  was:
When an Observer is in safemode, calling getBlockLocations makes it throw a RetriableException, as in HDFS-13898. However, during startup safemode can take a very long time, and retrying does not help much in that case.
The change is to make it fail over, so that the Router can immediately try another Observer or the Active NameNode (depending on the design). The current ObserverReadProxyProvider is not affected, since both RetriableException and StandbyException make it fail over.

> Observer should throw StandbyException in Safemode
> --------------------------------------------------
>
>                 Key: HDFS-14914
>                 URL: https://issues.apache.org/jira/browse/HDFS-14914
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Fengnan Li
>            Assignee: Fengnan Li
>            Priority: Minor
>         Attachments: HDFS-14914-001.patch
>
>
> When an Observer is in safemode, calling getBlockLocations makes it throw
> a RetriableException, as in HDFS-13898. However, during startup safemode
> can take a very long time, and retrying does not help much in that case.
> The change is to make the exception one that triggers failover, so that
> the client side can fail over. Currently, ObserverReadProxyProvider always
> fails over regardless of whether it sees a RetriableException or a
> StandbyException, so it is not affected.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14914) Observer should throw StandbyException in Safemode
[ https://issues.apache.org/jira/browse/HDFS-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fengnan Li updated HDFS-14914:
------------------------------
    Description:
When an Observer is in safemode, calling getBlockLocations makes it throw a RetriableException, as in HDFS-13898. However, during startup safemode can take a very long time, and retrying does not help much in that case.
The change is to make it fail over, so that the Router can immediately try another Observer or the Active NameNode (depending on the design). The current ObserverReadProxyProvider is not affected, since both RetriableException and StandbyException make it fail over.

  was:
When an Observer is in safemode, calling getBlockLocations makes it throw a RetriableException, as in HDFS-13898. However, during startup safemode can take a very long time, and retrying does not help much in that case.
What makes it worse is that when Routers talk to Observers, the Router distinguishes StandbyException from RetriableException, so it retries (3 times by default) and then returns a RetriableException to the client.
The change is to make it fail over, so that the Router can immediately try another Observer or the Active NameNode (depending on the design). The current ObserverReadProxyProvider is not affected, since both RetriableException and StandbyException make it fail over.

> Observer should throw StandbyException in Safemode
> --------------------------------------------------
>
>                 Key: HDFS-14914
>                 URL: https://issues.apache.org/jira/browse/HDFS-14914
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Fengnan Li
>            Assignee: Fengnan Li
>            Priority: Minor
>         Attachments: HDFS-14914-001.patch
>
>
> When an Observer is in safemode, calling getBlockLocations makes it throw
> a RetriableException, as in HDFS-13898. However, during startup safemode
> can take a very long time, and retrying does not help much in that case.
> The change is to make it fail over, so that the Router can immediately try
> another Observer or the Active NameNode (depending on the design). The
> current ObserverReadProxyProvider is not affected, since both
> RetriableException and StandbyException make it fail over.
[jira] [Updated] (HDFS-14914) Observer should throw StandbyException in Safemode
[ https://issues.apache.org/jira/browse/HDFS-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fengnan Li updated HDFS-14914:
------------------------------
    Description:
When an Observer is in safemode, calling getBlockLocations makes it throw a RetriableException, as in HDFS-13898. However, during startup safemode can take a very long time, and retrying does not help much in that case.
What makes it worse is that when Routers talk to Observers, the Router distinguishes StandbyException from RetriableException, so it retries (3 times by default) and then returns a RetriableException to the client.
The change is to make it fail over, so that the Router can immediately try another Observer or the Active NameNode (depending on the design). The current ObserverReadProxyProvider is not affected, since both RetriableException and StandbyException make it fail over.

  was:
When an Observer is in safemode, calling getBlockLocations makes it throw a RetriableException, as in [HDFS-13898|https://issues.apache.org/jira/browse/HDFS-13898]. However, during startup safemode can take a very long time, and retrying does not help much in that case.
What makes it worse is that when Routers talk to Observers, the Router distinguishes StandbyException from RetriableException, so it retries (3 times by default) and then returns a RetriableException to the client. The client then retries against the same Router, and therefore the same Observer, 10 times by default, resulting in 3 * 10 = 30 retries per call.
The change is to make it fail over, so that the Router can immediately try another Observer or the Active NameNode (depending on the design). The current ObserverReadProxyProvider is not affected, since both RetriableException and StandbyException make it fail over.

> Observer should throw StandbyException in Safemode
> --------------------------------------------------
>
>                 Key: HDFS-14914
>                 URL: https://issues.apache.org/jira/browse/HDFS-14914
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Fengnan Li
>            Assignee: Fengnan Li
>            Priority: Minor
>         Attachments: HDFS-14914-001.patch
>
>
> When an Observer is in safemode, calling getBlockLocations makes it throw
> a RetriableException, as in HDFS-13898. However, during startup safemode
> can take a very long time, and retrying does not help much in that case.
> What makes it worse is that when Routers talk to Observers, the Router
> distinguishes StandbyException from RetriableException, so it retries
> (3 times by default) and then returns a RetriableException to the client.
> The change is to make it fail over, so that the Router can immediately try
> another Observer or the Active NameNode (depending on the design). The
> current ObserverReadProxyProvider is not affected, since both
> RetriableException and StandbyException make it fail over.
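The difference between the two exception types, as the description above tells it, can be illustrated with a small sketch. This is not the Router's actual code: the class `RouterFailoverSketch`, the `Node` and `Call` types, and the `invoke` helper are hypothetical, and the two exception classes are local stand-ins that only borrow the names used in the issue. The sketch assumes a caller that retries the same node on a retriable error and moves to the next node on a standby error.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RouterFailoverSketch {
    // Local stand-ins for the exception types named in the issue (assumption:
    // only the names are borrowed; these are not the real Hadoop classes).
    static class RetriableException extends Exception {
        RetriableException(String msg) { super(msg); }
    }
    static class StandbyException extends Exception {
        StandbyException(String msg) { super(msg); }
    }

    /** Hypothetical RPC: returns block locations or throws one of the two errors. */
    interface Call {
        String run() throws RetriableException, StandbyException;
    }

    /** Hypothetical NameNode endpoint: a name plus the behaviour of its RPC. */
    static final class Node {
        final String name;
        final Call call;
        Node(String name, Call call) { this.name = name; this.call = call; }
    }

    /**
     * Tries the nodes in order. A retriable error repeats the call on the same
     * node up to maxAttempts times and is then rethrown to the caller; a standby
     * error moves on to the next node at once. Every attempt is recorded in trace.
     */
    static String invoke(List<Node> nodes, int maxAttempts, List<String> trace)
            throws RetriableException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (Node node : nodes) {
            RetriableException last = null;
            boolean failOver = false;
            for (int attempt = 0; attempt < maxAttempts && !failOver; attempt++) {
                trace.add(node.name);
                try {
                    return node.call.run();
                } catch (RetriableException e) {
                    last = e;          // stay on this node and try again
                } catch (StandbyException e) {
                    failOver = true;   // give up on this node immediately
                }
            }
            if (!failOver) {
                throw last;            // attempts exhausted: surface the error
            }
        }
        throw new IllegalStateException("every node asked for failover");
    }

    public static void main(String[] args) throws Exception {
        Node active = new Node("active", () -> "locations");
        Node observer = new Node("observer",
                () -> { throw new StandbyException("safemode"); });
        List<String> trace = new ArrayList<>();
        String result = invoke(Arrays.asList(observer, active), 3, trace);
        System.out.println(result + " via " + trace); // prints "locations via [observer, active]"
    }
}
```

With an Observer that throws the retriable stand-in instead, the same call would hit the Observer three times and then rethrow, without ever reaching the second node.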
[jira] [Updated] (HDFS-14914) Observer should throw StandbyException in Safemode
[ https://issues.apache.org/jira/browse/HDFS-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fengnan Li updated HDFS-14914:
------------------------------
    Status: Patch Available  (was: Open)

> Observer should throw StandbyException in Safemode
> --------------------------------------------------
>
>                 Key: HDFS-14914
>                 URL: https://issues.apache.org/jira/browse/HDFS-14914
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Fengnan Li
>            Assignee: Fengnan Li
>            Priority: Minor
>         Attachments: HDFS-14914-001.patch
>
>
> When an Observer is in safemode, calling getBlockLocations makes it throw
> a RetriableException, as in
> [HDFS-13898|https://issues.apache.org/jira/browse/HDFS-13898]. However,
> during startup safemode can take a very long time, and retrying does not
> help much in that case.
> What makes it worse is that when Routers talk to Observers, the Router
> distinguishes StandbyException from RetriableException, so it retries
> (3 times by default) and then returns a RetriableException to the client.
> The client then retries against the same Router, and therefore the same
> Observer, 10 times by default, resulting in 3 * 10 = 30 retries per call.
> The change is to make it fail over, so that the Router can immediately try
> another Observer or the Active NameNode (depending on the design). The
> current ObserverReadProxyProvider is not affected, since both
> RetriableException and StandbyException make it fail over.
[jira] [Updated] (HDFS-14914) Observer should throw StandbyException in Safemode
[ https://issues.apache.org/jira/browse/HDFS-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fengnan Li updated HDFS-14914:
------------------------------
    Attachment: HDFS-14914-001.patch

> Observer should throw StandbyException in Safemode
> --------------------------------------------------
>
>                 Key: HDFS-14914
>                 URL: https://issues.apache.org/jira/browse/HDFS-14914
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Fengnan Li
>            Assignee: Fengnan Li
>            Priority: Minor
>         Attachments: HDFS-14914-001.patch
>
>
> When an Observer is in safemode, calling getBlockLocations makes it throw
> a RetriableException, as in
> [HDFS-13898|https://issues.apache.org/jira/browse/HDFS-13898]. However,
> during startup safemode can take a very long time, and retrying does not
> help much in that case.
> What makes it worse is that when Routers talk to Observers, the Router
> distinguishes StandbyException from RetriableException, so it retries
> (3 times by default) and then returns a RetriableException to the client.
> The client then retries against the same Router, and therefore the same
> Observer, 10 times by default, resulting in 3 * 10 = 30 retries per call.
> The change is to make it fail over, so that the Router can immediately try
> another Observer or the Active NameNode (depending on the design). The
> current ObserverReadProxyProvider is not affected, since both
> RetriableException and StandbyException make it fail over.
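The 3 * 10 = 30 figure in the description comes from two retry loops stacked on top of each other. A rough way to check the arithmetic is the counter below. The class `RetryAmplification` and its `observerCalls` method are hypothetical names made up for this sketch; the defaults of 3 and 10 are the ones quoted in the issue, and the failover branch assumes that the next NameNode answers successfully, which the issue does not guarantee.

```java
final class RetryAmplification {
    /**
     * Counts the RPCs that reach an Observer stuck in safemode.
     *
     * clientAttempts: how often the client calls the Router (10 in the issue).
     * routerAttempts: how often the Router calls the Observer per client call
     *                 (3 in the issue).
     * failsOver:      true if the Observer's error makes the Router move on to
     *                 another NameNode, which is assumed here to succeed.
     */
    static int observerCalls(int clientAttempts, int routerAttempts, boolean failsOver) {
        int calls = 0;
        for (int c = 0; c < clientAttempts; c++) {          // client -> Router
            for (int r = 0; r < routerAttempts; r++) {      // Router -> Observer
                calls++;                                    // this RPC fails in safemode
                if (failsOver) {
                    return calls;                           // next node answers, nothing is retried
                }
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(observerCalls(10, 3, false)); // prints 30
        System.out.println(observerCalls(10, 3, true));  // prints 1
    }
}
```

Under these assumptions, an error that triggers failover reduces the load on the Observer from 30 calls to a single call per client request.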