Hello guys 

I have a problem using DistCp to transfer a large file from S3 to an HDFS
cluster. Whenever I run the copy, I see processing work and memory usage on
only one of the nodes, not on all of them. I don't know whether this is the
expected behaviour or a configuration problem. If I transfer multiple files,
each node handles a single file at the same time. I expected the transfer of
a single large file to run in parallel as well, but it doesn't seem to.
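
To make the behaviour I am describing concrete, here is a minimal sketch of
what I think is happening, assuming DistCp hands out whole files to map tasks
and never splits a single file between them. This is only a toy model: the
function name assign_files_to_maps and the greedy "least-loaded map" rule are
hypothetical and are not DistCp's actual code.

```python
def assign_files_to_maps(file_sizes_gb, num_maps):
    """Toy model: each whole file goes to the currently least-loaded map task.

    file_sizes_gb maps a (hypothetical) file name to its size in GB.
    Returns (assignment, loads): the files given to each map task and the
    total GB each map task has to copy.
    """
    loads = [0.0] * num_maps
    assignment = [[] for _ in range(num_maps)]
    # Place the largest files first so the toy balancing is reasonable.
    for name, size in sorted(file_sizes_gb.items(), key=lambda kv: -kv[1]):
        idx = loads.index(min(loads))  # least-loaded map task so far
        assignment[idx].append(name)
        loads[idx] += size
    return assignment, loads


# One large file, two map tasks: only one task gets any work.
single_assignment, single_loads = assign_files_to_maps({"big.dat": 50.0}, 2)

# Two files, two map tasks: each task copies one file at the same time.
multi_assignment, multi_loads = assign_files_to_maps(
    {"part-a.dat": 25.0, "part-b.dat": 25.0}, 2
)
```

Under this assumption, a single file always lands on one map task no matter
how many nodes are available, which would match what I see; with two files
the work is spread across both nodes.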

I am using the Hadoop 0.20.2 distribution on a cluster of two EC2 instances.
I was hoping that one of you has an idea of how DistCp works and which
properties I could tweak to improve the transfer rate, which is currently
0.7 Gb per minute.

Regards.
-- 
View this message in context: 
http://old.nabble.com/Transfer-large-file-%3E50Gb-with-DistCp-from-s3-to-cluster-tp34389118p34389118.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
