[ 
https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108157#comment-14108157
 ] 

Dave Marion commented on HDFS-6376:
-----------------------------------

bq. Currently DFSUtil#getOnlyNameServiceIdOrNull returns null if there are more 
than two nameservices specified. There are a couple of places that call this 
method, and it looks like DFSHAAdmin#resolveTarget may hit an issue if no -ns 
option is specified by HAAdmin. Thus I think we may also need to add the 
exclude logic in DFSUtil#getOnlyNameServiceIdOrNull. And we need to add more 
tests for this new feature, e.g., to cover its usage in DFSHAAdmin.

I tried a change to DFSUtil in my patch 6 (see below), but I had to back it 
out because it caused problems. Since then I have had to pass -ns to the admin 
commands, and I am used to doing so now. My point is that if you have a complex 
configuration, you may need to be more specific in the commands that you 
execute. I think it's fair to force the user to specify the -ns argument.

{code}
+++ hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
@@ -527,7 +527,12 @@ public static String path2String(final Object path) {
    * @return collection of nameservice Ids, or null if not specified
    */
   public static Collection<String> getNameServiceIds(Configuration conf) {
-    return conf.getTrimmedStringCollection(DFS_NAMESERVICES);
+    Collection<String> nameServices =
+      conf.getTrimmedStringCollection(DFSConfigKeys.DFS_NAMESERVICES);
+    Collection<String> nameServiceExcludes =
+      conf.getTrimmedStringCollection(DFSConfigKeys.DFS_NAMESERVICE_CLUSTER_EXCLUDES_KEY);
+    nameServices.removeAll(nameServiceExcludes);
+    return nameServices;
   }
{code}
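
To make the effect of that backed-out change concrete, here is a minimal, 
self-contained sketch of the same exclude logic using plain Java collections 
instead of Hadoop's Configuration. Everything in it is an assumption for 
illustration: the class NameServiceExcludeSketch, its helper methods, and the 
clusterA/clusterB names are hypothetical, and onlyNameServiceIdOrNull is only 
a simplified stand-in for DFSUtil#getOnlyNameServiceIdOrNull (it returns the 
ID when exactly one nameservice remains and null otherwise).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-alone model of the exclude logic; not Hadoop code.
public class NameServiceExcludeSketch {

    // Split a comma-separated config value into trimmed, non-empty entries.
    public static List<String> parseTrimmed(String csv) {
        List<String> out = new ArrayList<>();
        if (csv == null) {
            return out;
        }
        for (String part : csv.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    // All configured nameservices minus the excluded ones, in config order.
    public static List<String> effectiveNameServices(String nameservices,
                                                     String excludes) {
        List<String> result = parseTrimmed(nameservices);
        result.removeAll(parseTrimmed(excludes));
        return result;
    }

    // Simplified assumption: the ID if exactly one is left, otherwise null.
    public static String onlyNameServiceIdOrNull(Collection<String> ids) {
        return ids.size() == 1 ? ids.iterator().next() : null;
    }

    public static void main(String[] args) {
        // Client config that knows about both HA clusters, no excludes.
        List<String> all = effectiveNameServices("clusterA, clusterB", "");
        // Same config, with the remote cluster excluded.
        List<String> local = effectiveNameServices("clusterA, clusterB", "clusterB");

        System.out.println("without excludes: " + all
            + " -> " + onlyNameServiceIdOrNull(all));
        System.out.println("with excludes:    " + local
            + " -> " + onlyNameServiceIdOrNull(local));
    }
}
```

In this model, a configuration that lists both clusters resolves to no single 
default nameservice unless the excludes are applied, which is the situation 
where an explicit -ns argument removes the ambiguity.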

> Distcp data between two HA clusters requires another configuration
> ------------------------------------------------------------------
>
>                 Key: HDFS-6376
>                 URL: https://issues.apache.org/jira/browse/HDFS-6376
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, federation, hdfs-client
>    Affects Versions: 2.2.0, 2.3.0, 2.4.0
>         Environment: Hadoop 2.3.0
>            Reporter: Dave Marion
>            Assignee: Dave Marion
>             Fix For: 3.0.0
>
>         Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch, 
> HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch, 
> HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, HDFS-6376-branch-2.4.patch, 
> HDFS-6376-patch-1.patch, HDFS-6376.000.patch, HDFS-6376.008.patch
>
>
> User has to create a third set of configuration files for distcp when 
> transferring data between two HA clusters.
> Consider the scenario in [1]. You cannot put all of the required properties 
> in core-site.xml and hdfs-site.xml for the client to resolve the location of 
> both active namenodes. If you do, then the datanodes from cluster A may join 
> cluster B. I cannot find a configuration option that tells the datanodes to 
> federate blocks for only one of the clusters in the configuration.
> [1] 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E



--
This message was sent by Atlassian JIRA
(v6.2#6252)
