That's 99% correct. If you want/need to run different versions of HDFS on the two clusters, then you can't use the hdfs:// protocol to access both of them in the same command. In that case, use *hftp*://bla/ for the source fs and hdfs://bla2/ for the dest fs. HFTP is read-only, so it has to be the source side, and you should run the distcp job on the destination cluster.
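For example, a cross-version copy might look like the sketch below. The hostnames, ports, and paths are placeholders (50070 was the default HFTP/NameNode HTTP port in that era; your cluster may differ):

```shell
# Read from the source cluster over HFTP (read-only, version-independent),
# write to the destination over native hdfs://. Run this on the
# destination cluster so the writing side matches its HDFS version.
hadoop distcp \
  hftp://old-namenode:50070/user/data \
  hdfs://new-namenode:8020/user/data
```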
- Aaron

On Tue, Sep 8, 2009 at 12:45 AM, Anthony Urso <anthony.u...@gmail.com> wrote:
> Yes, just run something along the lines of:
>
> hadoop distcp hdfs://local-namenode/path hdfs://ec2-namenode/path
>
> on the job tracker of a MapReduce cluster.
>
> Make sure that your EC2 security group setup allows HDFS access from
> the local HDFS cluster and wherever you run the MapReduce job from. Also,
> I believe both HDFS setups still need to be running on the same
> version of Hadoop.
>
> More here:
>
> http://hadoop.apache.org/common/docs/r0.20.0/distcp.html
>
> Cheers,
> Anthony
>
> On Mon, Sep 7, 2009 at 10:37 PM, stchu <stchu.cl...@gmail.com> wrote:
> > Hi,
> >
> > Does Distcp support copying data from my local cluster (1 master + 3 slaves,
> > fs=hdfs) to the EC2 cluster (1 master + 2 slaves, fs=hdfs)?
> > If it's supported, how can I do it? I'd appreciate any guide or suggestion.
> >
> > stchu