[ 
https://issues.apache.org/jira/browse/HDFS-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773661#comment-16773661
 ] 

Yongjun Zhang commented on HDFS-14118:
--------------------------------------

Hi Guys,

I did one round of review, and it largely looks good to me, except for some 
cosmetic things. Good work [~fengnanli]!

1. DomainNameResolver.java

The class name here is generic; however, the comment states that this class
 is for the namenode. The jira also covers routers (for RBF). I suggest changing
 the comment "for the failover proxy to get IP addresses for the namenode" to
 "for failover proxies (for HA NameNodes, RBF routers etc) to get IP addresses
 of the associated servers"
{code:java}
/**
 * This interface provides methods for the failover proxy to get IP addresses
 * for the namenode. Implementations will use their own service discovery
 * mechanism, DNS, Zookeeper etc
 */
public interface DomainNameResolver {
{code}
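For context, a minimal DNS-backed implementation of such an interface could look like the sketch below. This is only an illustration of the idea, assuming a single getAllByDomainName method; DnsResolverSketch is a hypothetical name, not the patch's DNSDomainNameResolver.
{code:java}
```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Sketch of the interface described above (hypothetical, simplified).
interface DomainNameResolver {
    InetAddress[] getAllByDomainName(String domainName) throws UnknownHostException;
}

// A DNS-backed implementation that delegates to the JDK resolver, which
// returns every address record the underlying DNS knows for the name.
class DnsResolverSketch implements DomainNameResolver {
    @Override
    public InetAddress[] getAllByDomainName(String domainName)
            throws UnknownHostException {
        return InetAddress.getAllByName(domainName);
    }
}
```
{code}
A Zookeeper-based implementation would plug in the same way, which is why keeping the interface comment generic (servers, not just namenodes) reads better.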
2. core-default.xml
 The description can be changed to
 "The implementation of DomainNameResolver used for service (HA NameNodes,
 RBF Routers etc) discovery. The default implementation
 org.apache.hadoop.net.DNSDomainNameResolver returns all IP addresses associated
 with the input domain name of the services by querying the underlying DNS."

3. AbstractNNFailoverProxyProvider.java:
{code:java}
String host = nameNodeUri.getHost();
String configKeyWithHost =
    HdfsClientConfigKeys.Failover.RESOLVE_ADDRESS_NEEDED_KEY + "." + host;
boolean resolveNeeded = conf.getBoolean(configKeyWithHost,
    HdfsClientConfigKeys.Failover.RESOLVE_ADDRESS_NEEDED_DEFAULT);
{code}
Most of the time, 'host' here is an NN or router nameservice ID rather than a
 host name. We could rename the variable 'host' to 'nameservice' (and
 'configKeyWithHost' to 'configKeyWithNameservice'), OR just add a comment here
 like:
 // 'host' here would be the name of a nameservice when address resolving
 // is needed.
 to make it easier to read.
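To illustrate the per-nameservice key lookup this snippet performs, here is a self-contained sketch; a plain Map stands in for Hadoop's Configuration, and the key string and default mirror what the patch appears to use (assumptions, not the patch's exact constants).
{code:java}
```java
import java.util.HashMap;
import java.util.Map;

class ResolveFlagLookup {
    // Stand-ins for HdfsClientConfigKeys.Failover.* (hypothetical values).
    static final String RESOLVE_ADDRESS_NEEDED_KEY =
            "dfs.client.failover.resolve-needed";
    static final boolean RESOLVE_ADDRESS_NEEDED_DEFAULT = false;

    // 'nameservice' is what the patch calls 'host': usually a nameservice ID
    // such as "ns1", not a physical host name.
    static boolean resolveNeeded(Map<String, String> conf, String nameservice) {
        String keyWithNameservice = RESOLVE_ADDRESS_NEEDED_KEY + "." + nameservice;
        String v = conf.get(keyWithNameservice);
        return v == null ? RESOLVE_ADDRESS_NEEDED_DEFAULT
                         : Boolean.parseBoolean(v);
    }
}
```
{code}
Renaming the variable makes it obvious that the suffixed config key is scoped per nameservice, not per machine.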

4. hdfs-default.xml

Suggest making the following changes (some fix typos, some are rearrangements).
Also, would you please explain when and why "random order should be enabled",
and when it's not needed? It seems unclear here.
{code:java}
<property>
  <name>dfs.client.failover.resolve-needed</name>
  <value>false</value>
  <description>
    Determines if the given namenode address is a domain name which needs to be
    resolved (using the resolver configured by dfs.client.failover.resolver.impl).
    This adds a transparency layer in the client so the physical namenode
    address can change without changing the client. The config key can be
    appended with an optional nameservice ID (of the form
    dfs.client.failover.resolve-needed[.nameservice]) when multiple
    nameservices exist and random order should be enabled for specific
    nameservices.
  </description>
</property>

<property>
  <name>dfs.client.failover.resolver.impl</name>
  <value>org.apache.hadoop.net.DNSDomainNameResolver</value>
  <description>
    Determines what service the resolving will use to map a given namenode
    domain name to a specific namenode machine address. The config key can be
    appended with an optional nameservice ID (of the form
    dfs.client.failover.resolver.impl[.nameservice]) when multiple nameservices
    exist and random order should be enabled for specific nameservices.
  </description>
</property>
{code}
BTW, the added doc can be further improved by adding a section on how to use this feature.
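For example, such a usage section could show a client-side hdfs-site.xml fragment along these lines ("ns1" is a hypothetical nameservice ID, and the key names follow the suggested hdfs-default.xml entries above):
{code:java}
```xml
<!-- Enable DNS resolution of the server address for nameservice ns1 -->
<property>
  <name>dfs.client.failover.resolve-needed.ns1</name>
  <value>true</value>
</property>
<!-- Use the DNS-backed resolver implementation for ns1 -->
<property>
  <name>dfs.client.failover.resolver.impl.ns1</name>
  <value>org.apache.hadoop.net.DNSDomainNameResolver</value>
</property>
```
{code}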

Thanks.

> Use DNS to resolve Namenodes and Routers
> ----------------------------------------
>
>                 Key: HDFS-14118
>                 URL: https://issues.apache.org/jira/browse/HDFS-14118
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Fengnan Li
>            Assignee: Fengnan Li
>            Priority: Major
>         Attachments: DNS testing log, HDFS design doc_ Single domain name for 
> clients - Google Docs.pdf, HDFS-14118.001.patch, HDFS-14118.002.patch, 
> HDFS-14118.003.patch, HDFS-14118.004.patch, HDFS-14118.005.patch, 
> HDFS-14118.006.patch, HDFS-14118.007.patch, HDFS-14118.008.patch, 
> HDFS-14118.009.patch, HDFS-14118.010.patch, HDFS-14118.011.patch, 
> HDFS-14118.012.patch, HDFS-14118.013.patch, HDFS-14118.014.patch, 
> HDFS-14118.015.patch, HDFS-14118.016.patch, HDFS-14118.017.patch, 
> HDFS-14118.018.patch, HDFS-14118.019.patch, HDFS-14118.patch
>
>
> Clients will need to know about routers to talk to the HDFS cluster
> (obviously), and any update to the routers (adding/removing) would force a
> change in every client, which is a painful process.
> DNS can be used here to resolve the single domain name that clients know into
> the list of routers in the current config. However, DNS won't be able to
> resolve only to the working routers based on certain health thresholds.
> There are a few ways this can be solved. One way is to have a separate script
> regularly check the status of each router and update the DNS records if a
> router fails the health thresholds; security would need to be considered
> carefully for this approach. Another way is to have the clients do the normal
> connecting/failover after they get the list of routers, which requires
> changing the current failover proxy provider.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
