[ 
https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105966#comment-14105966
 ] 

Jing Zhao commented on HDFS-6376:
---------------------------------

Thanks for the response, [~dlmarion].

bq. I have been running a version of this patch for about 2 months on a test 
cluster. We are using Hadoop 2 so the patch that I am applying is a little 
different. 

Cool. Could you then also post an updated patch for Hadoop 2, since we will 
eventually merge the patch into branch-2?

bq. Exclude seemed like a good term for that.

Sounds good to me. Let's keep the current name then.

Some other comments (sorry I should have posted them yesterday...):
# Minor: it may be better to wrap the logic of the following code into a new 
method in DFSUtil since we use it in multiple places.
{code}
+    Map<String, Map<String, InetSocketAddress>> newAddressMap =
+        DFSUtil.getNNServiceRpcAddresses(conf);
+
+    for (String exclude : nameServiceExcludes)
+      newAddressMap.remove(exclude);
{code}
# Currently DFSUtil#getOnlyNameServiceIdOrNull returns null if more than one 
nameservice is specified. A couple of places call this method, and it looks 
like DFSHAAdmin#resolveTarget may hit an issue if no -ns option is passed to 
HAAdmin. Thus I think we may also need to add the exclude logic to 
DFSUtil#getOnlyNameServiceIdOrNull. We also need more tests for this new 
feature, e.g., to cover its usage in DFSHAAdmin.
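
To make the two suggestions above concrete, here is a rough, self-contained 
sketch of what such helpers might look like. It deliberately does not use any 
Hadoop classes: the class name NameServiceExcludeSketch and the methods 
withoutExcluded and onlyNameServiceIdOrNull are hypothetical names chosen for 
illustration, not existing DFSUtil APIs, and the real patch would take its 
inputs from the Configuration instead of plain collections.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; none of these names exist in DFSUtil.
final class NameServiceExcludeSketch {

  // Returns a copy of addressMap with the excluded nameservice IDs removed.
  // The input map is left untouched so callers can keep the full view.
  static Map<String, Map<String, InetSocketAddress>> withoutExcluded(
      Map<String, Map<String, InetSocketAddress>> addressMap,
      Collection<String> excludes) {
    Map<String, Map<String, InetSocketAddress>> result =
        new HashMap<String, Map<String, InetSocketAddress>>(addressMap);
    for (String exclude : excludes) {
      result.remove(exclude);
    }
    return result;
  }

  // Returns the single nameservice ID left after applying the excludes,
  // or null if zero or more than one ID remains.
  static String onlyNameServiceIdOrNull(
      Collection<String> nsIds, Collection<String> excludes) {
    List<String> remaining = new ArrayList<String>(nsIds);
    remaining.removeAll(excludes);
    return remaining.size() == 1 ? remaining.get(0) : null;
  }

  public static void main(String[] args) {
    // Two nameservices configured on the client; host names are made up.
    Map<String, Map<String, InetSocketAddress>> all =
        new HashMap<String, Map<String, InetSocketAddress>>();
    Map<String, InetSocketAddress> a = new HashMap<String, InetSocketAddress>();
    a.put("nn1", InetSocketAddress.createUnresolved("nn1.a.example.com", 8020));
    Map<String, InetSocketAddress> b = new HashMap<String, InetSocketAddress>();
    b.put("nn1", InetSocketAddress.createUnresolved("nn1.b.example.com", 8020));
    all.put("clusterA", a);
    all.put("clusterB", b);

    List<String> excludes = Collections.singletonList("clusterB");

    // Only clusterA survives the exclude list.
    System.out.println(withoutExcluded(all, excludes).keySet());
    System.out.println(
        onlyNameServiceIdOrNull(Arrays.asList("clusterA", "clusterB"), excludes));
  }
}
```

With the exclude applied before the size check, a caller that passes no 
nameservice ID would still resolve to the one remaining nameservice instead of 
getting null.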

> Distcp data between two HA clusters requires another configuration
> ------------------------------------------------------------------
>
>                 Key: HDFS-6376
>                 URL: https://issues.apache.org/jira/browse/HDFS-6376
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, federation, hdfs-client
>    Affects Versions: 2.3.0, 2.4.0
>         Environment: Hadoop 2.3.0
>            Reporter: Dave Marion
>            Assignee: Dave Marion
>             Fix For: 3.0.0
>
>         Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch, 
> HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch, 
> HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, HDFS-6376-branch-2.4.patch, 
> HDFS-6376-patch-1.patch
>
>
> User has to create a third set of configuration files for distcp when 
> transferring data between two HA clusters.
> Consider the scenario in [1]. You cannot put all of the required properties 
> in core-site.xml and hdfs-site.xml for the client to resolve the location of 
> both active namenodes. If you do, then the datanodes from cluster A may join 
> cluster B. I can not find a configuration option that tells the datanodes to 
> federate blocks for only one of the clusters in the configuration.
> [1] 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E



--
This message was sent by Atlassian JIRA
(v6.2#6252)
