The EMR distributions have special versions of the S3 file system. They might be helpful here.
Of course, you likely aren't running those if you are seeing 5 MB/s. An extreme alternative would be to light up an EMR cluster, copy the data to it, and then copy from there to S3.

On Thu, Mar 28, 2013 at 4:54 AM, Himanish Kushary <himan...@gmail.com> wrote:

> I am thinking of either transferring individual folders instead of the
> entire 70 GB folder as a workaround or, as another option, increasing the
> "mapred.task.timeout" parameter to something like 6-7 hours (as the avg
> rate of transfer to S3 seems to be 5 MB/s). Is there any better option to
> increase the throughput for transferring bulk data from HDFS to S3?
> Looking forward to suggestions.
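
For reference, here is a quick check of the numbers quoted above, as a rough Python sketch. It assumes 1 GB = 1024 MB, a steady 5 MB/s, and a single task copying all 70 GB; the function names are made up for illustration. The one Hadoop-specific fact used is that mapred.task.timeout is given in milliseconds.

```python
# Back-of-the-envelope sizing for the figures in the quoted message.
# Assumptions (not from the original thread): 1 GB = 1024 MB, a constant
# transfer rate, and one task moving the whole folder.

def transfer_seconds(size_gb: float, rate_mb_per_s: float) -> float:
    """Estimated copy time in seconds at a constant rate."""
    return size_gb * 1024 / rate_mb_per_s

def timeout_ms(hours: float) -> int:
    """Convert hours to milliseconds, the unit mapred.task.timeout uses."""
    return int(hours * 3600 * 1000)

secs = transfer_seconds(70, 5)
print(f"estimated copy time: {secs / 3600:.1f} h")
print(f"6 h timeout: {timeout_ms(6)} ms")
print(f"7 h timeout: {timeout_ms(7)} ms")
```

At 5 MB/s the whole 70 GB takes roughly 4 hours, so a 6-7 hour timeout would leave some headroom. Keep in mind that the timeout only applies while a task reports no progress, so whether it is hit at all depends on how the copy job reports status.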