[
https://issues.apache.org/jira/browse/HADOOP-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508339
]
Doug Cutting commented on HADOOP-1532:
--------------------------------------
Will "verify long" still be needed once HDFS verifies checksums on write?
Currently checksums are generated when writing files and verified when reading.
If data is corrupted in memory before it is written, we can end up in a case
where all replicas are corrupt and the data is unusable. But with HADOOP-1134
(or shortly thereafter) checksums can be validated on datanodes as the data is
written. Failing tasks can be retried until a write succeeds without
corruption. Then data will only be unreadable if all block replicas are
corrupted on disk, which is unlikely.
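The write-then-verify flow described above can be illustrated with a minimal, self-contained sketch. This is not the actual HDFS checksum implementation; the class and method names are invented for the example, and `java.util.zip.CRC32` stands in for HDFS's per-chunk checksums:

```java
import java.util.zip.CRC32;

// Simplified model of checksum-on-write / verify-on-read: the writer
// computes and stores a CRC alongside the data, and the reader recomputes
// the CRC and compares it to the stored value before returning the bytes.
public class ChecksummedBlock {
    private final byte[] data;
    private final long storedCrc;

    // "Write": compute and store the checksum together with the data.
    public ChecksummedBlock(byte[] data) {
        this.data = data.clone();
        this.storedCrc = crcOf(this.data);
    }

    // "Read": recompute and compare; a mismatch means the stored copy
    // no longer matches what was checksummed at write time.
    public byte[] read() {
        if (crcOf(data) != storedCrc) {
            throw new IllegalStateException("checksum mismatch: block is corrupt");
        }
        return data.clone();
    }

    private static long crcOf(byte[] bytes) {
        CRC32 crc = new CRC32();
        crc.update(bytes, 0, bytes.length);
        return crc.getValue();
    }
}
```

The point of validating on the datanode side (HADOOP-1134) is that the comparison happens before the write is acknowledged, so in-memory corruption on the client turns into a retriable failure rather than a silently bad replica.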
> Distcp should support verification modes
> ----------------------------------------
>
> Key: HADOOP-1532
> URL: https://issues.apache.org/jira/browse/HADOOP-1532
> Project: Hadoop
> Issue Type: New Feature
> Components: util
> Reporter: Senthil Subramanian
> Fix For: 0.14.0
>
>
> distcp does not currently support any verification after copying files. It
> should support:
> 1. verify quick (vq) mode - which compares the source and destination CRCs
> 2. verify long (vl) mode - which, in addition to verify quick, should read
> the entire destination file to catch DFS block-level errors
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.