NING DING created HDFS-9868:
-------------------------------

             Summary: add reading source cluster with HA access mode feature 
for DistCp
                 Key: HDFS-9868
                 URL: https://issues.apache.org/jira/browse/HDFS-9868
             Project: Hadoop HDFS
          Issue Type: New Feature
          Components: distcp
    Affects Versions: 2.7.1
            Reporter: NING DING
            Assignee: NING DING


Normally the HDFS cluster is HA enabled. It could take a long time when coping 
huge data by distp. If the source cluster changes active namenode, the distp 
will run failed. This patch supports the DistCp can read source cluster files 
in HA access mode. A source cluster configuration file needs to be specified 
(via the -sourceClusterConf option).

  The following is an example of the contents of a source cluster configuration
  file:
{code:xml}
    <configuration>
      <property>
                <name>fs.defaultFS</name>
                <value>hdfs://mycluster</value>
          </property>
          <property>
                <name>dfs.nameservices</name>
                <value>mycluster</value>
          </property>
          <property>
                <name>dfs.ha.namenodes.mycluster</name>
                <value>nn1,nn2</value>
          </property>
          <property>
                <name>dfs.namenode.rpc-address.mycluster.nn1</name>
                <value>host1:9000</value>
          </property>
          <property>
                <name>dfs.namenode.rpc-address.mycluster.nn2</name>
                <value>host2:9000</value>
          </property>
          <property>
                <name>dfs.namenode.http-address.mycluster.nn1</name>
                <value>host1:50070</value>
          </property>
          <property>
                <name>dfs.namenode.http-address.mycluster.nn2</name>
                <value>host2:50070</value>
          </property>
          <property>
                <name>dfs.client.failover.proxy.provider.mycluster</name>
                
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
          </property>
        </configuration>
{code}
  The invocation of DistCp is as below:
{code}
    bash$ hadoop distcp -sourceClusterConf sourceCluster.xml /foo/bar \
          hdfs://nn2:8020/bar/foo
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to