My guess is to add two sets of these properties, one per cluster,

dfs.ha.namenodes.clusterA=nn1,nn2
dfs.namenode.rpc-address.clusterA.nn1=
dfs.namenode.http-address.clusterA.nn1=
dfs.namenode.rpc-address.clusterA.nn2=
dfs.namenode.http-address.clusterA.nn2=

to the client settings, and then access it like hdfs://clusterA/tmp ...

Regards,
*Stanley Shi,*

On Fri, Apr 18, 2014 at 7:42 AM, david marion <dlmar...@hotmail.com> wrote:

> I'm having an issue in client code where there are multiple clusters with
> HA namenodes involved. Example setup using Hadoop 2.3.0:
>
> Cluster A with the following properties defined in core, hdfs, etc:
>
> dfs.nameservices=clusterA
> dfs.ha.namenodes.clusterA=nn1,nn2
> dfs.namenode.rpc-address.clusterA.nn1=
> dfs.namenode.http-address.clusterA.nn1=
> dfs.namenode.rpc-address.clusterA.nn2=
> dfs.namenode.http-address.clusterA.nn2=
>
> dfs.client.failover.proxy.provider.clusterA=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
>
> Cluster B has similar properties defined in its core-site.xml,
> hdfs-site.xml, etc.
>
> Now, I want to be able to distcp from clusterA to clusterB. Regardless of
> which cluster I am executing this from, neither has all of the information.
> Looking at DFSClient and DataNode:
>
> - if I put both clusterA and clusterB into dfs.nameservices, then the
>   datanodes will try to federate the blocks from both nameservices.
> - if I don't put both clusterA and clusterB into dfs.nameservices, then
>   the client won't know how to resolve both namenodes for the nameservices in
>   the distcp command.
>
> I'm wondering if I am missing a property or something that will allow me
> to define both nameservices on both clusters and have the datanodes for the
> cluster *not* try and federate. Looking at DataNode, it appears that it
> tries to connect to all namenodes defined and the first one that sets the
> clusterid wins. It seems that there should be a dfs.datanode.clusterid
> property that the datanode uses. This seems to line up with the 'namenode
> -format -clusterid <cluster>' command when you have multiple nameservices.
> Am I missing something in the configuration that will allow me to do what I
> want?
> To get distcp to work I had to create a third set of configuration files
> just for the client to use.
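
For reference, here is a rough sketch of what such a client-only configuration might contain, written in the same key=value shorthand used above (in an actual hdfs-site.xml each line would be a <property> entry). The clusterB entries are an assumption that simply mirrors the clusterA ones, since the thread only says Cluster B has "similar properties". The angle-bracket values are placeholders, because the addresses were left blank in the original message. The idea is that this lives in a separate client configuration directory, not in the files the datanodes read, so each cluster's datanodes still see only their own nameservice.

```
# Client-side only; not intended for the configuration the datanodes load.
dfs.nameservices=clusterA,clusterB

# Cluster A, as listed in the original message (addresses left as placeholders)
dfs.ha.namenodes.clusterA=nn1,nn2
dfs.namenode.rpc-address.clusterA.nn1=<clusterA-nn1-host>:<rpc-port>
dfs.namenode.http-address.clusterA.nn1=<clusterA-nn1-host>:<http-port>
dfs.namenode.rpc-address.clusterA.nn2=<clusterA-nn2-host>:<rpc-port>
dfs.namenode.http-address.clusterA.nn2=<clusterA-nn2-host>:<http-port>
dfs.client.failover.proxy.provider.clusterA=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

# Cluster B, assumed to mirror cluster A (namenode IDs nn1,nn2 are hypothetical)
dfs.ha.namenodes.clusterB=nn1,nn2
dfs.namenode.rpc-address.clusterB.nn1=<clusterB-nn1-host>:<rpc-port>
dfs.namenode.http-address.clusterB.nn1=<clusterB-nn1-host>:<http-port>
dfs.namenode.rpc-address.clusterB.nn2=<clusterB-nn2-host>:<rpc-port>
dfs.namenode.http-address.clusterB.nn2=<clusterB-nn2-host>:<http-port>
dfs.client.failover.proxy.provider.clusterB=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
```

With that directory selected for the client (for example via HADOOP_CONF_DIR or "hadoop --config <client-conf-dir>"), both ends of the copy could presumably be addressed by nameservice, along the lines of "hadoop distcp hdfs://clusterA/tmp hdfs://clusterB/tmp", where the destination path is only an illustration.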