[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

nkeywal (JIRA) Fri, 04 May 2012 01:07:13 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268199#comment-13268199
 ]


nkeywal commented on HBASE-5877:
--------------------------------

@stack

bq. You don't want to have RegionMovedException carry a ServerName#toString 
instead of host and port?
I think it's safer this way, as I have to parse the string afterward. 
Otherwise, if someone modifies ServerName#toString he will break the parsing in 
RegionMovedException, a class he may never have heard of (yes, it will break 
the unit test :-))

bq. Is this a bug fix?
Unfortunately, it's a feature. The error management is duplicated, and I have 
to manage both cases, because we don't have the exception when we come back to 
the result later in the code.

bq. Put the history of moved regions out into its own class?
You're right, it would be better. Wil do.

bq. Don't presize this I'd say: private static final long TIMEOUT_REGION_MOVED 
= (2L * 60L * 1000L);
You would prefer a configurable value? 


bq. Stuff is lazily cleared from movedRegions? Should we have a cleaner come 
visit occasionally?
Aggreed, it would be better with a cleaner. Will do as well.

bq. Why you say the above? When we protobuf it, it'll just be an option so it 
shouldn't be too bad?
Yeah, if it was too bad I would not have proposed it :-). It's an imperfection 
to accept I think. We would not have it if we share the regions state within 
the cluster with ZK.

@ted
bq. Under what condition would newHrl be null above ?
Oops. Refactoring error. Removed.

bq. Please remove the space between newHrl and ')' below:
Done.

bq. Would the above code result in NPE since I see the following in javadoc:
It should not happen because we test hrl value before. But I added a check on 
the arguments to make it safer.

bq. Since updateCachedLocations() is used to handle exception, the presizing 
above may not be needed.

bq. Since updateCachedLocations() is used to handle exception, the presizing 
above may not be needed.
Yeah, I sized it thinking: if we're doing a rolling restart we may have 100 
regions with a wrong location if we're really unlucky. As it small, any 
solution would work here, but I prefer to have the size explicitly set, as it 
says "I though about it, that's a reasonable size". I added a comment however. 

bq. The indentation of CloseRegionHandler() above is off.
Fixed.


bq. 'will contains' -> 'will contain'. 'keep a too old' -> 'keep too old'.
Fixed.


                
> When a query fails because the region has moved, let the regionserver return 
> the new address to the client
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5877
>                 URL: https://issues.apache.org/jira/browse/HBASE-5877
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, master, regionserver
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Minor
>             Fix For: 0.96.0
>
>         Attachments: 5877.v1.patch, 5877.v12.patch, 5877.v6.patch
>
>
> This is mainly useful when we do a rolling restart. This will decrease the 
> load on the master and the network load.
> Note that a region is not immediately opened after a close. So:
> - it seems preferable to wait before retrying on the other server. An 
> optimisation would be to have an heuristic depending on when the region was 
> closed.
> - during a rolling restart, the server moves the regions then stops. So we 
> may have failures when the server is stopped, and this patch won't help.
> The implementation in the first patch does:
> - on the region move, there is an added parameter on the regionserver#close 
> to say where we are sending the region
> - the regionserver keeps a list of what was moved. Each entry is kept 100 
> seconds.
> - the regionserver sends a specific exception when it receives a query on a 
> moved region. This exception contains the new address.
> - the client analyses the exeptions and update its cache accordingly...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

Reply via email to