distcp seems to copy between clusters: http://hadoop.apache.org/common/docs/current/distcp.html
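
For reference, here is a rough sketch of both options in Python. It is untested against a real cluster and rests on a few assumptions: that `hadoop` and `ssh` are on the PATH of the machine running the script, that passwordless ssh to the remote host is set up, and that `hadoop fs -put - <dst>` reads from standard input (as the FileSystem shell documentation describes). The host names, paths, and function names are made up for illustration. The first helper only builds a distcp command, which applies when both ends are Hadoop file systems. The second pipes `ssh ... cat` straight into `hadoop fs -put -`, which may fit the question quoted below, since the source there is an ordinary remote machine and the goal is to skip the local copy.

```python
import shlex
import subprocess


def distcp_command(src_uri, dst_uri):
    """Build a cluster-to-cluster copy: hadoop distcp <src> <dst>."""
    return ["hadoop", "distcp", src_uri, dst_uri]


def stream_commands(remote_host, remote_path, hdfs_path):
    """Build both halves of: ssh <host> cat <path> | hadoop fs -put - <dst>."""
    # The remote shell parses the command string, so quote the path.
    ssh_cmd = ["ssh", remote_host, "cat " + shlex.quote(remote_path)]
    # "-" as the source tells "hadoop fs -put" to read from stdin.
    put_cmd = ["hadoop", "fs", "-put", "-", hdfs_path]
    return ssh_cmd, put_cmd


def stream_to_hdfs(remote_host, remote_path, hdfs_path):
    """Stream one remote file into HDFS without writing it to local disk."""
    ssh_cmd, put_cmd = stream_commands(remote_host, remote_path, hdfs_path)
    ssh = subprocess.Popen(ssh_cmd, stdout=subprocess.PIPE)
    put = subprocess.Popen(put_cmd, stdin=ssh.stdout)
    # Close our copy of the pipe so ssh sees a broken pipe if put exits early.
    ssh.stdout.close()
    put_rc = put.wait()
    ssh_rc = ssh.wait()
    if ssh_rc != 0 or put_rc != 0:
        raise RuntimeError(
            "transfer failed: ssh exit %d, hadoop exit %d" % (ssh_rc, put_rc)
        )
```

A call would look like `stream_to_hdfs("user@remotehost", "/data/file.txt", "/user/me/file.txt")`. If the transfer fails partway, a partial file may be left in HDFS, so a real script would probably want to write to a temporary HDFS path and rename it on success.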

zenMonkey wrote:
>
> I want to write a script that pulls data (flat files) from a remote
> machine and pushes that into its hadoop cluster.
>
> At the moment, it is done in two steps:
>
> 1 - Secure copy the remote files
> 2 - Put the files into HDFS
>
> I was wondering if it was possible to optimize this by avoiding copying to
> local fs before pushing to hdfs; and instead write directly to hdfs. I am
> not sure if this is something that hadoop tools already provide.
>
> Thanks for any help.
>

--
View this message in context: http://old.nabble.com/Copying-files-between-two-remote-hadoop-clusters-tp27799963p27813482.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.