Andrew Purtell created HBASE-18095:
--------------------------------------

             Summary: Provide an option for clients to find the server hosting 
META that does not involve the ZooKeeper client
                 Key: HBASE-18095
                 URL: https://issues.apache.org/jira/browse/HBASE-18095
             Project: HBase
          Issue Type: New Feature
          Components: Client
            Reporter: Andrew Purtell


Clients are required to connect to ZooKeeper to find the location of the 
regionserver hosting the meta table region. Site configuration provides the 
client a list of ZK quorum peers and the client uses an embedded ZK client to 
query meta location. Timeouts and retry behavior of this embedded ZK client are 
managed independently of HBase layer settings, and in some cases the ZK client 
cannot do what in theory the HBase client can, i.e. fail fast upon outage or 
network partition.

We should consider new configuration settings that provide a list of well-known 
master and backup master locations, and with this information the client can 
contact any of the master processes directly. Any master in either active or 
passive state will track meta location and respond to requests for it with its 
cached last known location. If this location is stale, the client can ask again 
with a flag set that requests the master refresh its location cache and return 
the up-to-date location. Every client interaction with the cluster thus uses 
only HBase RPC as transport, with appropriate settings applied to the 
connection. The configuration toggle that enables this alternative meta 
location lookup should be false by default.
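
The lookup described above might look roughly like the following sketch. 
Nothing here is an existing HBase API: MasterStub, LocationProbe, and 
locateMeta are hypothetical names used only for illustration, and the host 
names are made up. It assumes the client can tell that a returned location is 
stale, e.g. because the regionserver at that address no longer hosts meta.

```java
import java.io.IOException;
import java.util.List;
import java.util.Optional;

final class MetaLookupSketch {

    /** Hypothetical stand-in for an HBase RPC stub to one master (active or backup). */
    interface MasterStub {
        // Returns the master's cached last known meta location. When refresh is
        // true, the master is asked to refresh its location cache first.
        String getMetaLocation(boolean refresh) throws IOException;
    }

    /** Hypothetical check that a location really hosts meta right now. */
    interface LocationProbe {
        boolean isCurrent(String location);
    }

    /** Tries each configured master in order; returns empty if none can help. */
    static Optional<String> locateMeta(List<MasterStub> masters, LocationProbe probe) {
        for (MasterStub master : masters) {
            try {
                // First ask for the cached location, which is cheap for the master.
                String cached = master.getMetaLocation(false);
                if (cached != null && probe.isCurrent(cached)) {
                    return Optional.of(cached);
                }
                // The cached answer was stale: ask again with the refresh flag set.
                String refreshed = master.getMetaLocation(true);
                if (refreshed != null && probe.isCurrent(refreshed)) {
                    return Optional.of(refreshed);
                }
            } catch (IOException e) {
                // This master is unreachable; give up on it and try the next one.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up masters: the first is down, the second has a stale cache.
        List<MasterStub> masters = List.of(
            refresh -> { throw new IOException("master1 unreachable"); },
            refresh -> refresh ? "rs2.example.com:16020" : "rs1.example.com:16020");
        Optional<String> meta = locateMeta(masters, loc -> loc.startsWith("rs2."));
        System.out.println(meta.orElse("meta location unknown"));
    }
}
```

In a real client the per-master call would be bounded by the HBase RPC timeout 
and retry settings already applied to the connection, which is what makes 
fail-fast behavior possible here.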

This removes the requirement that HBase clients embed the ZK client and contact 
the ZK service directly at the beginning of the connection lifecycle. This has 
several benefits. The ZK service need not be exposed to clients, or to 
potential abuse by them, yet no benefit ZK provides to the HBase server cluster 
is compromised. Normalizing HBase client and ZK client timeout settings and 
retry behavior (in some cases impossible, i.e. for fail-fast) is no longer 
necessary.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
