[ https://issues.apache.org/jira/browse/HDFS-13488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446511#comment-16446511 ]
Íñigo Goiri commented on HDFS-13488:
------------------------------------

After adding the unit test for client failover across Routers, I realized that the exception has to be exactly {{StandbyException}} for it to work. It cannot be a subclass, because the retry policy uses {{RemoteException#unwrapRemoteException(StandbyException.class)}}. For this reason, I removed the {{RouterSafeModeException}}.

Other than that, [^HDFS-13488.000.patch] should be able to provide resiliency to overloaded subclusters. To handle these scenarios, one can also use HDFS-13484.

> RBF: Reject requests when a Router is overloaded
> ------------------------------------------------
>
>                 Key: HDFS-13488
>                 URL: https://issues.apache.org/jira/browse/HDFS-13488
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Íñigo Goiri
>            Assignee: Íñigo Goiri
>            Priority: Major
>         Attachments: HDFS-13488.000.patch
>
>
> A Router might be overloaded when handling special cases (e.g. a slow
> subcluster). The Router could reject the requests and the client could try
> with another Router. We should leverage the Standby mechanism for this.
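
To illustrate why a subclass of {{StandbyException}} does not trigger client failover, here is a minimal, self-contained Java sketch. It does not use Hadoop's classes: the nested {{StandbyException}}, {{RouterSafeModeException}} and {{RemoteException}} are simplified stand-ins, and {{wouldFailover}} is a hypothetical helper. The sketch assumes that the remote exception carries only the server-side class name and that unwrapping compares that name for exact equality, which is the behavior the comment above relies on.

```java
import java.io.IOException;

public class UnwrapSketch {
    // Stand-in for Hadoop's StandbyException (not the real class).
    public static class StandbyException extends IOException {
        public StandbyException(String msg) { super(msg); }
    }

    // Stand-in subclass, playing the role of the removed RouterSafeModeException.
    public static class RouterSafeModeException extends StandbyException {
        public RouterSafeModeException(String msg) { super(msg); }
    }

    // Simplified stand-in for a remote exception: only the class name and
    // message of the server-side exception reach the client.
    public static class RemoteException extends IOException {
        private final String className;

        public RemoteException(String className, String msg) {
            super(msg);
            this.className = className;
        }

        // Assumption: unwrapping matches on exact class name, so a subclass
        // of lookupType is NOT unwrapped and the RemoteException is returned.
        public IOException unwrapRemoteException(Class<? extends IOException> lookupType) {
            if (!lookupType.getName().equals(className)) {
                return this;
            }
            try {
                return lookupType.getDeclaredConstructor(String.class)
                        .newInstance(getMessage());
            } catch (ReflectiveOperationException e) {
                return this; // could not instantiate, keep the wrapper
            }
        }
    }

    // Hypothetical helper: true if the client would see a StandbyException
    // (and could therefore fail over to another Router).
    public static boolean wouldFailover(IOException serverSide) {
        RemoteException re = new RemoteException(
                serverSide.getClass().getName(), serverSide.getMessage());
        return re.unwrapRemoteException(StandbyException.class)
                instanceof StandbyException;
    }

    public static void main(String[] args) {
        // Exact class: unwrapped, so failover can happen.
        System.out.println(wouldFailover(new StandbyException("Router is overloaded")));
        // Subclass: class names differ, so the wrapper is returned unchanged.
        System.out.println(wouldFailover(new RouterSafeModeException("Router is in safe mode")));
    }
}
```

Under these assumptions, the first call prints {{true}} and the second prints {{false}}, which matches the observation that the Router has to throw {{StandbyException}} itself rather than a subclass.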