Hi Hadoop users,

I am aware that you can set the replication factor of a file after it has
been created, but can you also set it while copying files into HDFS?  My
hope/intuition is that if you could reduce the replication factor of a file
during the copy, fewer replicas would be written and the copy time would
decrease.  Large data sets are taking a long time to copy over.

I am currently doing:

hadoop dfs -copyFromLocal "/copy/from/path/" input

and am wondering if it's possible to also specify something like -setrep on
the same line.  -setrep requires you to specify the file, which implies
that the file has to exist first, requiring two separate commands.
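
To make the two-command workflow concrete, here is a rough sketch of what I
mean. The function name copy_then_setrep is just something I made up for
illustration, and the replication value passed in is only an example. Note
that this does not speed up the copy itself, since the files are still
written with the default replication factor before -setrep runs.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): copy a local path into HDFS,
# then change the replication factor of the copied data in a second step.
# Arguments: <local source> <HDFS destination> <replication factor>
copy_then_setrep() {
    src="$1"
    dest="$2"
    rep="$3"

    # Step 1: copy into HDFS; files get the default replication factor here.
    hadoop dfs -copyFromLocal "$src" "$dest" || return 1

    # Step 2: set the replication factor after the fact.
    # -R applies the change recursively when the destination is a directory.
    hadoop dfs -setrep -R "$rep" "$dest"
}
```

With my paths this would be called as: copy_then_setrep "/copy/from/path/" input 1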

Thanks in advance,
-Julian
