[ https://issues.apache.org/jira/browse/HBASE-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16970984#comment-16970984 ]

Bharath Vissapragada edited comment on HBASE-18095 at 11/10/19 12:43 AM:
-------------------------------------------------------------------------

Thanks, Nick, for reviewing the PR. I'm waiting for a couple of dependent 
changes ([807|https://github.com/apache/hbase/pull/807], 
[812|https://github.com/apache/hbase/pull/812]) to be merged before I rebase my 
original PR. There is plenty of scope for refactoring, and you have already 
highlighted some of it. I'll address your comments during the rebase.

The goal is to expose these client service RPCs on all the masters, not just 
the active master. Based on my limited experience here, I think this was not 
targeted at all the cluster members for the following reasons:
 - Client configuration overhead. Every client now needs to know the client 
service endpoints (hostnames and ports) of all the servers before it attempts 
its first connection. This is somewhat manageable if we limit it just to the 
masters, I think (a config sketch follows at the end of this comment).
 - Adding to the above point, region servers are more likely to be added or 
removed for cluster maintenance or scaling. In such cases, every client's 
configuration needs to be updated; otherwise, clients keep hitting dead 
servers or never reach newly added ones.
 - More state-management overhead and higher cluster<->ZK traffic, because 
each cluster member would now need to watch for meta region location changes, 
keep track of the cluster ID, and so on.

Again, this is just my understanding. Probably [~apurtell] has a better answer.
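
To make the configuration-overhead point concrete, here is a minimal sketch of 
what a client-side configuration might look like. The property names below are 
placeholders I made up for illustration; they are not the names used in the 
in-flight PRs.

{code:java}
// Illustrative only: "hbase.masters" and "hbase.client.master.registry.enabled"
// are assumed property names for the sake of the example, not the final ones.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class MasterEndpointConfig {
  public static Configuration clientConf() {
    Configuration conf = HBaseConfiguration.create();
    // Every client must carry the full list of master client-service endpoints
    // up front; adding or removing a master means updating this on every client.
    conf.set("hbase.masters",
        "master1.example.com:16000,master2.example.com:16000,master3.example.com:16000");
    // Hypothetical toggle selecting the master-based lookup instead of the ZK quorum.
    conf.set("hbase.client.master.registry.enabled", "true");
    return conf;
  }
}
{code}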



> Provide an option for clients to find the server hosting META that does not 
> involve the ZooKeeper client
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-18095
>                 URL: https://issues.apache.org/jira/browse/HBASE-18095
>             Project: HBase
>          Issue Type: New Feature
>          Components: Client
>            Reporter: Andrew Kyle Purtell
>            Assignee: Bharath Vissapragada
>            Priority: Major
>         Attachments: HBASE-18095.master-v1.patch, HBASE-18095.master-v2.patch
>
>
> Clients are required to connect to ZooKeeper to find the location of the 
> regionserver hosting the meta table region. Site configuration provides the 
> client a list of ZK quorum peers, and the client uses an embedded ZK client 
> to query the meta location. Timeouts and retry behavior of this embedded ZK 
> client are managed orthogonally to HBase-layer settings, and in some cases 
> the ZK client cannot do what in theory the HBase client can, i.e. fail fast 
> upon an outage or a network partition.
> We should consider new configuration settings that provide a list of 
> well-known master and backup master locations, and with this information the 
> client can contact any of the master processes directly. Any master in 
> either active or passive state will track the meta location and respond to 
> requests for it with its cached last known location. If this location is 
> stale, the client can ask again with a flag set that requests the master 
> refresh its location cache and return the up-to-date location. Every client 
> interaction with the cluster thus uses only HBase RPC as transport, with 
> appropriate settings applied to the connection. The configuration toggle 
> that enables this alternative meta location lookup should be false by 
> default.
> This removes the requirement that HBase clients embed a ZK client and 
> contact the ZK service directly at the beginning of the connection 
> lifecycle. This has several benefits. The ZK service need not be exposed to 
> clients, and thus to their potential abuse, while none of the benefits ZK 
> provides to the HBase server cluster are compromised. Normalizing HBase 
> client and ZK client timeout settings and retry behavior - which is in some 
> cases impossible, e.g. for fail-fast - is no longer necessary.
> And, from [~ghelmling]: There is an additional complication here for 
> token-based authentication. When a delegation token is used for SASL 
> authentication, the client uses the cluster ID obtained from ZooKeeper to 
> select the token identifier to use. So there would also need to be some 
> ZooKeeper-less, unauthenticated way to obtain the cluster ID as well.
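
For readers following along, here is a minimal sketch of the client-side 
lookup flow the quoted description implies. Every type and method name in it 
is a hypothetical placeholder; only the control flow (try each configured 
master, re-ask with a refresh flag when the cached location is stale, and 
also fetch the cluster ID for delegation-token selection) comes from the 
description above.

{code:java}
// Sketch only: MasterLookupClient and its methods stand in for whatever RPC
// stubs the patch eventually exposes; they are not an existing HBase API.
import java.util.List;

public class MetaLocationLookup {

  /** Hypothetical client-service stub; getClusterId() covers the delegation-token point. */
  interface MasterLookupClient {
    String getMetaRegionLocation(boolean refreshCache); // host:port of the RS hosting meta
    String getClusterId();                              // needed to select a delegation token
  }

  public static String locateMeta(List<MasterLookupClient> masters) {
    for (MasterLookupClient master : masters) {  // active or backup, any of them can answer
      try {
        String cached = master.getMetaRegionLocation(false);
        if (usable(cached)) {
          return cached;
        }
        // Cached answer was stale: ask the same master to refresh its cache and retry once.
        String refreshed = master.getMetaRegionLocation(true);
        if (usable(refreshed)) {
          return refreshed;
        }
      } catch (RuntimeException rpcFailure) {
        // This master is unreachable; fall through and try the next one.
      }
    }
    throw new IllegalStateException("No configured master returned a usable meta location");
  }

  private static boolean usable(String hostAndPort) {
    // Placeholder for "the client connected to this address and meta was really there".
    return hostAndPort != null;
  }
}
{code}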



