I have a Spark Streaming app that saves JSON files to s3://. It works fine.

Now I need to calculate some basic summary stats and am running into
horrible performance problems.

I want to run a test to see if reading from HDFS instead of S3 makes a
difference. I am able to quickly copy the data from S3 to a machine in my
cluster, however hadoop fs -put is painfully slow. Is there a better way to
copy large data to HDFS?
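
For reference, this is roughly what I am doing now (the bucket name and
paths below are just placeholders):

  # copy from S3 to local disk on one of the cluster machines -- this part is fast
  aws s3 cp s3://my-bucket/stream-output/ /data/stream-output/ --recursive

  # then push the local copy into HDFS -- this is the painfully slow part
  hadoop fs -put /data/stream-output /user/andy/stream-output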

I should mention I am not using EMR, i.e. according to AWS support there is
no way to have 'aws s3' copy a directory to hdfs://

Hadoop distcp cannot copy files from the local file system.
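
To be clear, the kind of invocation I mean is something like the following
(namenode address and paths are made up), which as far as I can tell is not
supported:

  hadoop distcp file:///data/stream-output hdfs://namenode:8020/user/andy/stream-output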

Thanks in advance

Andy





