[ https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108157#comment-14108157 ]
Dave Marion commented on HDFS-6376:
-----------------------------------

bq. Currently DFSUtil#getOnlyNameServiceIdOrNull returns null if there are more than two nameservices specified. There are a couple of places called this method, and looks like DFSHAAdmin#resolveTarget may hit some issue if no -ns option is specified by HAAdmin. Thus I think we may also need to add the exclude logic in DFSUtil#getOnlyNameServiceIdOrNull. And we need to add more tests for this new feature, e.g., to cover its usage in DFSHAAdmin.

I tried a change in DFSUtil in my patch6 (see below). I had to back it out as it caused problems. I have had to use -ns in the admin commands and am used to using it now. My point here is that if you have a complex configuration, then you may need to be more specific in the commands that you execute. I think it's fair to force the user to specify the -ns argument.

{code}
+++ hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
@@ -527,7 +527,12 @@ public static String path2String(final Object path) {
    * @return collection of nameservice Ids, or null if not specified
    */
   public static Collection<String> getNameServiceIds(Configuration conf) {
-    return conf.getTrimmedStringCollection(DFS_NAMESERVICES);
+    Collection<String> nameServices =
+      conf.getTrimmedStringCollection(DFSConfigKeys.DFS_NAMESERVICES);
+    Collection<String> nameServiceExcludes =
+      conf.getTrimmedStringCollection(DFSConfigKeys.DFS_NAMESERVICE_CLUSTER_EXCLUDES_KEY);
+    nameServices.removeAll(nameServiceExcludes);
+    return nameServices;
   }
{code}

> Distcp data between two HA clusters requires another configuration
> ------------------------------------------------------------------
>
>                 Key: HDFS-6376
>                 URL: https://issues.apache.org/jira/browse/HDFS-6376
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, federation, hdfs-client
>   Affects Versions: 2.2.0, 2.3.0, 2.4.0
>         Environment: Hadoop 2.3.0
>            Reporter: Dave Marion
>            Assignee: Dave Marion
>             Fix For: 3.0.0
>
>         Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch,
> HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch,
> HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, HDFS-6376-branch-2.4.patch,
> HDFS-6376-patch-1.patch, HDFS-6376.000.patch, HDFS-6376.008.patch
>
>
> User has to create a third set of configuration files for distcp when
> transferring data between two HA clusters.
> Consider the scenario in [1]. You cannot put all of the required properties
> in core-site.xml and hdfs-site.xml for the client to resolve the location of
> both active namenodes. If you do, then the datanodes from cluster A may join
> cluster B. I cannot find a configuration option that tells the datanodes to
> federate blocks for only one of the clusters in the configuration.
> [1]
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E

--
This message was sent by Atlassian JIRA
(v6.2#6252)