[ https://issues.apache.org/jira/browse/HDFS-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087225#comment-13087225 ]
Suresh Srinivas commented on HDFS-1973:
---------------------------------------
Sorry for the late comment. I had been traveling.
Before {{Cases to support}}, could we add a section like this:
>>
On failover, clients need the address of the new active. This could be done by:
# Contacting ZooKeeper to get the current active NN.
# Alternatively, the client gets the addresses of both namenodes and tries
them one at a time until it connects to the new active.
# For setups using IP failover, clients always use the same VIP/failover
address, which moves to the active.
>>
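To make the second option concrete, here is a rough sketch in Python. It is
only an illustration, not a proposed implementation: {{connect}} is a
hypothetical callable that raises ConnectionError when the target is down or
is not the active, and the addresses are placeholders.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


class NoActiveNameNodeError(Exception):
    """Raised when none of the configured addresses accepts the connection."""


def connect_to_active(addresses: Sequence[str], connect: Callable[[str], T]) -> T:
    """Try each namenode address in order; return the first successful connection.

    `connect` is a hypothetical callable (an assumption of this sketch) that
    raises ConnectionError if the target is unreachable or not the active.
    """
    failures: List[str] = []
    for address in addresses:
        try:
            return connect(address)
        except ConnectionError as exc:
            # Remember why this address failed, then move on to the next one.
            failures.append(f"{address}: {exc}")
    raise NoActiveNameNodeError("; ".join(failures))
```

With two configured addresses the client makes at most two attempts per
failover, which is part of why a separate lookup mechanism such as DNS SRV
adds little here.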
Given this, I am not sure about the {{Cases to support}}:
Proxy based client failover is an implementation detail. It still needs to
figure out the new active using one of the schemes above. I am not very
clear on Configuration based support. Do you mean that the client config will
be changed to point to the new active? DNS SRV records are also unnecessary,
given that our config would have both namenode addresses.
+1 for logical URI. We could consider merging this requirement with HDFS-2231
to do this.
A logical URI is needed to identify a nameservice, not a cluster, since
federation supports multiple namenodes within a cluster. Could we use the
concept of nameservice, introduced in federation, for that? The URI would then
be nameservice1.foo.com, and nameservice1 maps to nn1, nn2.
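Roughly, the client-side resolution could look like the sketch below (Python,
illustrative only). The mapping table, the {{hdfs}} scheme in the example URI,
the port 8020, and the function name are all assumptions of the sketch, not
something decided in this issue.

```python
from typing import Dict, List
from urllib.parse import urlparse

# Hypothetical table mapping a logical nameservice to its namenode addresses.
# The port is a placeholder chosen for this sketch.
NAMESERVICES: Dict[str, List[str]] = {
    "nameservice1.foo.com": ["nn1:8020", "nn2:8020"],
}


def resolve_namenodes(uri: str, nameservices: Dict[str, List[str]] = NAMESERVICES) -> List[str]:
    """Return the candidate namenode addresses for a filesystem URI."""
    authority = urlparse(uri).netloc
    if authority in nameservices:
        # Logical name: the client may have to try each of these addresses.
        return list(nameservices[authority])
    # Not a known nameservice: treat the authority as one physical address,
    # for example a VIP that IP failover moves to the active.
    return [authority]
```

For a VIP based setup the lookup falls through to the single address, so no
logical name is involved in that case.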
As regards viewfs, I think this scheme will work. The viewfs mount tables
will point to the logical URI, which in turn will use the mechanism you are
proposing.
Why should the failover method be based on the cluster part of the URI? Can it
be a single mechanism across all the nameservices? If so, could we change the
parameter to dfs.client.ha.failover.method?
These are my early thoughts. Some questions I am left with are:
# The scheme you have defined works only for RPC protocols. How about HTTP?
# I am not sure why logical URI is required for VIP/failover based setup.
We could continue to add more details.
> HA: HDFS clients must handle namenode failover and switch over to the new
> active namenode.
> ------------------------------------------------------------------------------------------
>
> Key: HDFS-1973
> URL: https://issues.apache.org/jira/browse/HDFS-1973
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Suresh Srinivas
> Assignee: Aaron T. Myers
>
> During failover, a client must detect the current active namenode failure and
> switch over to the new active namenode. The switch over might make use of IP
> failover or something more elaborate such as ZooKeeper to discover the new
> active.