distcp looks like it copies data between Hadoop clusters:

http://hadoop.apache.org/common/docs/current/distcp.html
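
For the case in the question (flat files on a plain remote machine rather than a second cluster), one option that avoids the local copy is to stream the file over ssh into `hadoop fs -put -`, which reads its input from stdin. A minimal sketch in Python, assuming key-based ssh access to the remote host and the `hadoop` command on the PATH; the function names, host and paths are hypothetical:

```python
import shlex
import subprocess


def build_commands(remote_host, remote_path, hdfs_path):
    """Return the (ssh, hadoop) argument lists for a streamed copy."""
    # ssh runs `cat` on the remote machine and writes the file to its stdout.
    # The path is quoted because the remote shell will parse it.
    ssh_cmd = ["ssh", remote_host, "cat " + shlex.quote(remote_path)]
    # `hadoop fs -put - <dst>` reads from stdin and writes to <dst> in HDFS.
    put_cmd = ["hadoop", "fs", "-put", "-", hdfs_path]
    return ssh_cmd, put_cmd


def stream_to_hdfs(remote_host, remote_path, hdfs_path):
    """Pipe a remote file into HDFS without a local temporary copy."""
    ssh_cmd, put_cmd = build_commands(remote_host, remote_path, hdfs_path)
    ssh = subprocess.Popen(ssh_cmd, stdout=subprocess.PIPE)
    put = subprocess.Popen(put_cmd, stdin=ssh.stdout)
    # Close our copy of the pipe so ssh gets SIGPIPE if hadoop exits early.
    ssh.stdout.close()
    put_rc = put.wait()
    ssh_rc = ssh.wait()
    if ssh_rc != 0 or put_rc != 0:
        raise RuntimeError(f"copy failed: ssh={ssh_rc}, hadoop={put_rc}")
```

A call such as `stream_to_hdfs("user@remote.example.com", "/data/file.txt", "/incoming/file.txt")` would then copy one file; looping over a list of remote paths handles several. The data still passes through the machine running the script, but only in memory, not on its local disk.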

zenMonkey wrote:
> 
> I want to write a script that pulls data (flat files) from a remote
> machine and pushes that into its hadoop cluster.
> 
> At the moment, it is done in two steps:
> 
> 1 - Secure copy the remote files
> 2 - Put the files into HDFS
> 
> I was wondering whether this could be optimized by writing directly to
> HDFS instead of first copying to the local fs. I am not sure if this is
> something that the Hadoop tools already provide.
> 
> Thanks for any help.
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Copying-files-between-two-remote-hadoop-clusters-tp27799963p27813482.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.