What is the right way to use the -crc option with hadoop dfs -copyToLocal?
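For reference, what I mean is an invocation along these lines (a rough sketch;
the paths are just placeholders and I'm on the 1.x shell syntax):

    hadoop dfs -copyToLocal -crc /path/in/hdfs /local/archive/path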
Is this the wrong list?
--Tom
On Tue, Jan 28, 2014 at 11:53 AM, Tom Brown tombrow...@gmail.com wrote:
I am archiving a large amount of data out of my HDFS file system to a
separate shared storage solution (There is not much HDFS space left in my
cluster, and upgrading it is not an option right now).
Hi Tom,
My hint is that your block size should be a multiple of the CRC chunk size.
Check your dfs.block.size property, convert it to bytes, and divide it by
the checksum chunk size. That value is usually set by the
dfs.bytes-per-checksum property, or you can read it off the error message.
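For example, with the stock defaults the arithmetic works out cleanly (a quick
check; 512 is the usual bytes-per-checksum default, so substitute whatever your
cluster actually uses):

    # confirm the block size is an exact multiple of the checksum chunk size
    expr 67108864 % 512    # prints 0, i.e. no remainder
    expr 67108864 / 512    # prints 131072 checksum chunks per block

If the remainder is anything other than 0, that mismatch is the first thing to fix.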
I am using default values for both. My version is 1.1.2, and the default
value for dfs.block.size (67108864) is evenly divisible by 512.
However, the online reference of default values for my version
(http://hadoop.apache.org/docs/r1.1.2/hdfs-default.html) doesn't list any
checksum-related settings.
I am archiving a large amount of data out of my HDFS file system to a
separate shared storage solution (There is not much HDFS space left in my
cluster, and upgrading it is not an option right now).
I understand that HDFS internally manages checksums and won't succeed if
the data doesn't match its checksum.
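As I understand the two related shell behaviours (a rough sketch of the 1.x
usage; paths are placeholders):

    hadoop dfs -copyToLocal /hdfs/src /local/dst             # verifies checksums while reading, fails on mismatch
    hadoop dfs -copyToLocal -ignoreCrc /hdfs/src /local/dst  # skips that verification

so what I'm trying to pin down is what -crc adds on top of that.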