Hello everyone. I am trying to use distcp with a Hadoop HA configuration (using 
CDH4.0.0 at the moment). Here is my problem:
- I am trying to do a distcp from cluster A to cluster B. Since no operations 
are supported on the standby namenode, I need to either specify the active 
namenode when running distcp or use the failover proxy provider 
(dfs.client.failover.proxy.provider.clusterA), where I can specify the two 
namenodes for cluster B and the failover code inside HDFS will figure out 
which one is active. 
- If I use the failover proxy provider, some of my datanodes on cluster A 
connect to the namenode on cluster B and vice versa. I assume that is 
because I have configured both nameservices in my hdfs-site.xml for distcp to 
work. I have set dfs.nameservice.id to the right one, but the 
datanodes do not seem to respect that. 
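
For illustration, an hdfs-site.xml of the kind described above might look 
roughly like the sketch below. This is not the actual configuration: the 
nameservice IDs (clusterA, clusterB), the namenode IDs (nn1, nn2), the host 
names and the port are placeholders, and the key used to list the nameservices 
may be named differently depending on the release, so check the documentation 
for the version in use.

```xml
<!-- Sketch only. clusterA/clusterB, nn1/nn2, host names and port are placeholders. -->
<configuration>
  <!-- Both nameservices are listed so that distcp can resolve either cluster. -->
  <property>
    <name>dfs.nameservices</name>
    <value>clusterA,clusterB</value>
  </property>

  <!-- Logical namenode IDs and RPC addresses for the remote cluster. -->
  <property>
    <name>dfs.ha.namenodes.clusterB</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.clusterB.nn1</name>
    <value>nn1.clusterb.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.clusterB.nn2</name>
    <value>nn2.clusterb.example.com:8020</value>
  </property>

  <!-- Lets HDFS clients find whichever clusterB namenode is currently active. -->
  <property>
    <name>dfs.client.failover.proxy.provider.clusterB</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>

  <!-- The matching clusterA entries are analogous and omitted here. -->
</configuration>
```

With both nameservices defined like this, the distcp source and destination 
can be given as logical URIs such as hdfs://clusterA/... and hdfs://clusterB/... 
instead of naming a specific namenode host.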

What is the best way to use distcp with a Hadoop HA configuration without having 
the datanodes connect to the remote namenode? Thanks.
 
Regards,
Dhaval
