[ 
https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341411#comment-14341411
 ] 

Arun Suresh commented on HDFS-7858:
-----------------------------------

[~bikassaha], you make a very valid point.

I think the situation you describe can be alleviated as follows. Since the 
client knows both the Active and the Standby a priori, it can handle a stale 
cache entry locally: if the locally cached active namenode entry has become 
unavailable, there will still be an initial surge of requests to the failed NN, 
but the client can retry directly against the Standby without consulting ZK. 
ZK connections would then happen only in the following cases:
# If no cached entry is present in the user's home directory.
# For long-lived clients.
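
To make the retry order concrete, here is a minimal sketch in Python (chosen 
only for brevity; the HDFS client itself is Java). The function names, the 
example hostnames, and the use of ConnectionError to stand for "NN unavailable 
or standby" are all assumptions for illustration, not HDFS APIs.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def order_candidates(cached_active: Optional[str], configured: List[str]) -> List[str]:
    """Put the cached active NN first, then the other configured NNs."""
    if cached_active in configured:
        return [cached_active] + [nn for nn in configured if nn != cached_active]
    # No usable cache entry: fall back to the configured order.
    return list(configured)


def call_with_failover(candidates: List[str], rpc: Callable[[str], T]) -> Tuple[str, T]:
    """Try each NN in order and return (nn, result) for the first that answers.

    No ZK lookup happens here: the client already knows both NNs.
    """
    last_error: Optional[Exception] = None
    for nn in candidates:
        try:
            return nn, rpc(nn)
        except ConnectionError as err:  # hypothetical "NN unavailable" signal
            last_error = err
    raise ConnectionError(f"no namenode reachable among {candidates}") from last_error
```

With a stale cache entry, the first call to the failed NN is wasted, but the 
second candidate is tried immediately, which is the behaviour described above.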

Also, I was thinking we could break this into 2 separate JIRAs:
# Add a cached entry in the user's home dir to record the last active NN. If 
the entry is not present, the client picks the Standby from the configuration. 
There is no ZK involvement in this; it only brings some determinism to which 
namenode is picked first. 
# Have another JIRA to add the ZK client optimization. In addition to the ZK 
watch feature for long-lived clients, this could bring further benefits, such 
as needing only the logical nameservice name in the Configuration. Namenodes 
would register under a ZNode when they start up, and clients would find the 
actual URIs of the Active and Standby directly from ZK (like HBase clients). 
Short-lived clients would then first query ZK to find the active and standby 
NN URIs and cache them (rather than reading them from the Configuration), so 
that subsequent client invocations do not hit ZK. 
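
A rough sketch of the caching step for short-lived clients might look like the 
following. It is Python for brevity and not a proposed implementation: the JSON 
file layout, the function names, and the lookup callable (which stands in for 
the ZK query) are assumptions. Only the idea of a cache file in the user's home 
directory comes from the proposal.

```python
import json
import os
from pathlib import Path
from typing import Callable, Dict, Optional

Entry = Dict[str, str]  # assumed layout: {"active": uri, "standby": uri}


def read_cache(path: Path) -> Optional[Entry]:
    """Return the cached entry, or None if it is missing or unreadable."""
    try:
        data = json.loads(path.read_text())
    except (OSError, ValueError):
        return None
    if isinstance(data, dict) and "active" in data and "standby" in data:
        return data
    return None


def write_cache(path: Path, entry: Entry) -> None:
    """Write to a temp file, then rename, so readers never see a partial file."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(entry))
    os.replace(tmp, path)


def resolve(path: Path, lookup: Callable[[], Entry]) -> Entry:
    """Use the cache if present; otherwise do one lookup and cache the result."""
    cached = read_cache(path)
    if cached is not None:
        return cached
    entry = lookup()  # hypothetical stand-in for querying ZK
    write_cache(path, entry)
    return entry
```

Only the first invocation pays for the lookup; later invocations read the file. 
The fixed temp-file name is a simplification, and concurrent clients of the 
same user would need a unique temp name or a lock.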

> Improve HA Namenode Failover detection on the client using Zookeeper
> --------------------------------------------------------------------
>
>                 Key: HDFS-7858
>                 URL: https://issues.apache.org/jira/browse/HDFS-7858
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>
> In an HA deployment, Clients are configured with the hostnames of both the 
> Active and Standby Namenodes. Clients will first try one of the NNs 
> (non-deterministically) and if it is a standby NN, it will respond to the 
> client to retry the request on the other Namenode.
> If the client happens to talk to the Standby first, and the standby is 
> undergoing some GC / is busy, then those clients might not get a response 
> soon enough to try the other NN.
> Proposed approach to solve this:
> 1) Since Zookeeper is already used as the failover controller, the clients 
> could talk to ZK and find out which is the active namenode before contacting 
> it.
> 2) Long-lived DFSClients would have a ZK watch configured which fires when 
> there is a failover, so they do not have to query ZK every time to find out 
> the active NN.
> 3) Clients can also cache the last active NN in the user's home directory 
> (~/.lastNN) so that short-lived clients can try that Namenode first before 
> querying ZK.
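
For the long-lived client case in the description, the watch idea can be 
sketched roughly as below (Python for brevity). The class and method names are 
hypothetical, and the lookup callable and the on_failover callback stand in for 
a real ZK query and watch; nothing here uses an actual ZooKeeper client.

```python
import threading
from typing import Callable, Optional


class ActiveNamenodeTracker:
    """Holds the current active NN for a long-lived client (hypothetical helper)."""

    def __init__(self, lookup: Callable[[], str]) -> None:
        self._lookup = lookup  # stand-in for a one-time ZK query
        self._lock = threading.Lock()
        self._active: Optional[str] = None

    def get_active(self) -> str:
        """Return the known active NN, querying only if nothing is known yet."""
        with self._lock:
            if self._active is None:
                self._active = self._lookup()
            return self._active

    def on_failover(self, new_active: str) -> None:
        """Callback that a ZK watch would invoke when the active NN changes."""
        with self._lock:
            self._active = new_active
```

After the initial lookup, requests read the in-memory value and the watch 
callback updates it on failover, so no per-request ZK query is needed. A real 
ZooKeeper watch is typically a one-time trigger, so an actual implementation 
would also have to re-register it after each event.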



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
