[ https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267787#comment-13267787 ]
stack commented on HBASE-5877:
------------------------------

You don't want to have RegionMovedException carry a ServerName#toString instead of host and port? Or does it not make sense when our cached region exceptions are keyed by hostname+port only?

Is this a bug fix?

{code}
@@ -1910,6 +1989,7 @@ public class HConnectionManager {
               }
             } catch (ExecutionException e) {
               LOG.debug("Failed all from " + loc, e);
+              updateCachedLocations(updateHistory, loc, e);
{code}

Put the history of moved regions out into its own class?

Don't presize this I'd say:

+  private static final long TIMEOUT_REGION_MOVED = (2L * 60L * 1000L);

Stuff is lazily cleared from movedRegions? Should we have a cleaner come visit occasionally?

Patch looks fine to me. Nice fat test.

bq. 5) The destination in the closeRegion interface is a kind of interface hijacking. Other options would be:

Why do you say the above? When we protobuf it, it'll just be an option, so it shouldn't be too bad?

The HCM stuff is ugly but that's not your fault.

> When a query fails because the region has moved, let the regionserver return the new address to the client
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5877
>                 URL: https://issues.apache.org/jira/browse/HBASE-5877
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, master, regionserver
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Minor
>             Fix For: 0.96.0
>
>         Attachments: 5877.v1.patch, 5877.v12.patch, 5877.v6.patch
>
>
> This is mainly useful when we do a rolling restart. This will decrease the load on the master and the network load.
> Note that a region is not immediately opened after a close. So:
> - it seems preferable to wait before retrying on the other server. An optimisation would be to have a heuristic depending on when the region was closed.
> - during a rolling restart, the server moves the regions then stops. So we may have failures when the server is stopped, and this patch won't help.
> The implementation in the first patch does:
> - on the region move, there is an added parameter on the regionserver#close to say where we are sending the region
> - the regionserver keeps a list of what was moved. Each entry is kept 100 seconds.
> - the regionserver sends a specific exception when it receives a query on a moved region. This exception contains the new address.
> - the client analyses the exceptions and updates its cache accordingly.
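
For illustration only, the moved-region bookkeeping discussed in this thread (entries kept for a fixed time, cleared lazily on lookup, plus the occasional cleaner pass asked about in the review) could be sketched roughly as below. This is not the code from the attached patches: the class name MovedRegionsSketch, its methods, and the explicit nowMillis parameters are all hypothetical, and the timeout is left as a constructor argument rather than assuming either figure mentioned above.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch, not the HBase implementation. */
final class MovedRegionsSketch {

  /** Destination of a moved region and the time the move was recorded. */
  static final class MovedInfo {
    final String host;
    final int port;
    final long movedAtMillis;

    MovedInfo(String host, int port, long movedAtMillis) {
      this.host = host;
      this.port = port;
      this.movedAtMillis = movedAtMillis;
    }
  }

  // Not presized: the map grows only as regions actually move.
  private final Map<String, MovedInfo> moved = new ConcurrentHashMap<>();
  private final long timeoutMillis;

  MovedRegionsSketch(long timeoutMillis) {
    this.timeoutMillis = timeoutMillis;
  }

  /** Called when a region is closed with a known destination. */
  void recordMove(String encodedRegionName, String host, int port, long nowMillis) {
    moved.put(encodedRegionName, new MovedInfo(host, port, nowMillis));
  }

  /**
   * Returns the destination, or null if the region is unknown or its entry
   * has expired. Expired entries are removed lazily at this point.
   */
  MovedInfo lookup(String encodedRegionName, long nowMillis) {
    MovedInfo info = moved.get(encodedRegionName);
    if (info == null) {
      return null;
    }
    if (isExpired(info, nowMillis)) {
      // Remove only if the entry was not replaced by a newer move meanwhile.
      moved.remove(encodedRegionName, info);
      return null;
    }
    return info;
  }

  /** Optional periodic cleaner: drops every expired entry, returns the count. */
  int cleanExpired(long nowMillis) {
    int removed = 0;
    Iterator<Map.Entry<String, MovedInfo>> it = moved.entrySet().iterator();
    while (it.hasNext()) {
      if (isExpired(it.next().getValue(), nowMillis)) {
        it.remove();
        removed++;
      }
    }
    return removed;
  }

  int size() {
    return moved.size();
  }

  private boolean isExpired(MovedInfo info, long nowMillis) {
    return nowMillis - info.movedAtMillis > timeoutMillis;
  }
}
```

With lazy clearing alone, an entry for a region that nobody queries again stays in the map indefinitely; a cleaner that calls something like cleanExpired from a periodic chore would bound that growth. Passing the clock value in explicitly is a choice made here only to keep the sketch easy to test.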