[ https://issues.apache.org/jira/browse/HBASE-26149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396887#comment-17396887 ]
Michael Stack commented on HBASE-26149:
---------------------------------------

The one-pager helped. Thanks. I put it here as the Jira description, copying into the sub-task descriptions what was in this document but missing from the sub-task JIRAs' descriptions. Hopefully this makes it easier on others trying to follow along with what is going on here (I put some questions on the document for my own clarification). Thanks.

> Further improvements on ConnectionRegistry implementations
> ----------------------------------------------------------
>
>                 Key: HBASE-26149
>                 URL: https://issues.apache.org/jira/browse/HBASE-26149
>             Project: HBase
>          Issue Type: Umbrella
>          Components: Client
>            Reporter: Duo Zhang
>            Priority: Major
>
> (Copied in-line from the attached 'Documentation', with some filler as connecting script.)
>
> HBASE-23324 Deprecate clients that connect to Zookeeper
>
> ^^^ This has always been our goal: to remove the zookeeper dependency from the client side.
>
> See the sub-task HBASE-25051 DIGEST based auth broken for MasterRegistry.
>
> When constructing an RpcClient, we pass in the cluster id, and it is used to select the authentication method. More specifically, it is used to select the tokens for digest based authentication; see the code in BuiltInProviderSelector. For ZKConnectionRegistry we do not need an RpcClient to connect to zookeeper, so we can get the cluster id first and then create the RpcClient. But for MasterRegistry/RpcConnectionRegistry we need an RpcClient to connect to the ClientMetaService endpoints before we can call the getClusterId method to get the cluster id. Because of this, when creating the RpcClient for MasterRegistry/RpcConnectionRegistry we can only pass null or the default cluster id, which means digest based authentication is broken.
>
> This is a cyclic dependency problem. A possible way forward is to make the getClusterId method available to all users, i.e. it does not require any authentication, so we can always call getClusterId with simple authentication; then, at the client side, once we have the cluster id, we create a new RpcClient that selects the correct authentication method.
>
> The work in the sub-task HBASE-26150 Let region server also carry ClientMetaService makes it so the RegionServers can carry a ConnectionRegistry (rather than having only the Masters carry it, as is the case now). It adds a new method, getBootstrapNodes, to ClientMetaService (the ConnectionRegistry proto Service) for refreshing the bootstrap nodes periodically or on error. The new *RpcConnectionRegistry* [created here but defined in the next sub-task] will use this method to refresh the bootstrap nodes, while the old MasterRegistry will use the getMasters method to refresh its 'bootstrap' nodes.
>
> The getBootstrapNodes method will return all the region servers, so after the first refresh the client will go to the region servers for later rpc calls. But since masters and region servers both implement the ClientMetaService interface, the client is free to configure masters as the initial bootstrap nodes.
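A refresh loop of that shape might look roughly like the minimal sketch below. This is only an illustration of the idea; MetaEndpoint and BootstrapNodeRefresher are made-up names for the sketch, not the actual HBase classes or generated proto stubs.

    import java.util.List;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicReference;

    // Illustrative stand-in for a ClientMetaService endpoint (not the real proto stub).
    interface MetaEndpoint {
      // host:port of servers that carry ClientMetaService
      List<String> getBootstrapNodes() throws Exception;
    }

    // Keeps a cached list of bootstrap nodes and refreshes it on a fixed delay;
    // refreshNow() could also be called directly when an rpc against a node fails.
    class BootstrapNodeRefresher implements AutoCloseable {
      private final MetaEndpoint endpoint;
      private final AtomicReference<List<String>> nodes;
      private final ScheduledExecutorService scheduler =
          Executors.newSingleThreadScheduledExecutor();

      BootstrapNodeRefresher(MetaEndpoint endpoint, List<String> initialNodes, long intervalSecs) {
        this.endpoint = endpoint;
        this.nodes = new AtomicReference<>(initialNodes);
        // A non-positive interval simply disables periodic refresh.
        if (intervalSecs > 0) {
          scheduler.scheduleWithFixedDelay(this::refreshNow, intervalSecs, intervalSecs,
              TimeUnit.SECONDS);
        }
      }

      // Ask the endpoint for an up-to-date node list; keep the old list on failure.
      void refreshNow() {
        try {
          List<String> latest = endpoint.getBootstrapNodes();
          if (latest != null && !latest.isEmpty()) {
            nodes.set(latest);
          }
        } catch (Exception e) {
          // Keep the previous nodes; a real implementation would log and retry with backoff.
        }
      }

      List<String> currentNodes() {
        return nodes.get();
      }

      @Override
      public void close() {
        scheduler.shutdownNow();
      }
    }

The initial node list is whatever the client was configured with (masters or region servers), and the interval knob is the same kind of setting the later sub-tasks on the initial refresh interval (HBASE-26180) and on disabling refresh (HBASE-26182) are about.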
> The next sub-task, HBASE-26172 Deprecated MasterRegistry, then deprecates MasterRegistry. The implementation of MasterRegistry is almost the same as RpcConnectionRegistry, except that it uses getMasters instead of getBootstrapNodes to refresh the 'bootstrap' nodes it connects to. So we could add server-side configs to control which nodes we return to clients in getBootstrapNodes, i.e. masters or region servers, and then RpcConnectionRegistry can fully replace the old MasterRegistry. This sub-task deprecates MasterRegistry.
>
> Sub-task HBASE-26173 Return only a sub set of region servers as bootstrap nodes
>
> For a large cluster which may have thousands of region servers, it is not a good idea to return all the region servers as bootstrap nodes to clients. So we should add a config on the server side to control the maximum number of bootstrap nodes we want to return to clients. I think a default value of 5 or 10 would be enough.
>
> Sub-task HBASE-26174 Make rpc connection registry the default registry on 3.0.0
>
> Just a follow-up of HBASE-26172. Since MasterRegistry has been deprecated, we should not keep it as the default for 3.0.0 any more.
>
> Sub-task HBASE-26180 Introduce an initial refresh interval for RpcConnectionRegistry
>
> As end users could configure any nodes in a cluster as the initial bootstrap nodes, it is possible that different end users will configure the same machine, which would overload that machine. So we should have a shorter delay for the initial refresh, to let clients quickly switch to the bootstrap nodes we want them to connect to.
>
> Sub-task HBASE-26181 Region server and master could use itself as ConnectionRegistry
>
> This is an optimization to reduce the pressure on zookeeper. For MasterRegistry, we do not want to use it as the ConnectionRegistry for our cluster connection because:
>
>   // We use ZKConnectionRegistry for all the internal communication, primarily for these reasons:
>   // - Decouples RS and master life cycles. RegionServers can continue to be up independent of
>   //   masters' availability.
>   // - Configuration management for region servers (cluster internal) is much simpler when adding
>   //   new masters or removing existing masters, since only clients' config needs to be updated.
>   // - We need to retain ZKConnectionRegistry for replication use anyway, so we just extend it for
>   //   other internal connections too.
>
> The above comments are in our code, in the HRegionServer.cleanupConfiguration method. But now that masters and regionservers both implement the ClientMetaService interface, we are free to let the ConnectionRegistry use this in-memory information directly, instead of going to zookeeper again.
>
> Sub-task HBASE-26182 Allow disabling refresh of connection registry endpoint
>
> One possible deployment in production is to put something like LVS in front of all the region servers to act as a load balancer, so clients only need to connect to the LVS IP instead of going to the region servers directly to get registry information. For this scenario we do not need to refresh the endpoints any more. The simplest way is to set the refresh interval to -1.
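As a rough illustration of the HBASE-26173 idea of returning only a bounded subset of region servers, the server side could do something along these lines; the class and method names are made up for the sketch, and the real config key and default are whatever the sub-task settles on.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    // Sketch only: hand out at most maxNodes servers, shuffled so different clients
    // get different subsets and no single server becomes a hot spot.
    class BootstrapNodeSelector {
      static List<String> selectBootstrapNodes(List<String> liveServers, int maxNodes) {
        List<String> shuffled = new ArrayList<>(liveServers);
        Collections.shuffle(shuffled);
        return shuffled.size() <= maxNodes
            ? shuffled
            : new ArrayList<>(shuffled.subList(0, maxNodes));
      }
    }

In the load-balancer deployment described just above, the client's configured bootstrap node is simply the LVS address and refresh is switched off with the -1 interval, so no node list ever needs to be handed back at all.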