[ https://issues.apache.org/jira/browse/HDFS-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Colin Patrick McCabe updated HDFS-3889:
---------------------------------------
Summary: distcp overwrites files even when there are missing checksums
(was: distcp silently ignores missing checksums)
> distcp overwrites files even when there are missing checksums
> -------------------------------------------------------------
>
> Key: HDFS-3889
> URL: https://issues.apache.org/jira/browse/HDFS-3889
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: tools
> Affects Versions: 2.2.0-alpha
> Reporter: Colin Patrick McCabe
> Priority: Minor
>
> If distcp can't read the checksum files for the source and destination
> files, for any reason, it ignores the checksums and overwrites the
> destination file. It does produce a log message, but I think the correct
> behavior would be to throw an error and stop the distcp.
> If the user really wants to ignore checksums, he or she can use
> {{-skipcrccheck}} to do so.
> The relevant code is in DistCpUtils#checksumsAreEqual:
> {code}
> try {
>   sourceChecksum = sourceFS.getFileChecksum(source);
>   targetChecksum = targetFS.getFileChecksum(target);
> } catch (IOException e) {
>   LOG.error("Unable to retrieve checksum for " + source + " or " +
>       target, e);
> }
> {code}
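
The behavior proposed above (fail instead of logging and continuing) could look roughly like the following. This is only a minimal, self-contained sketch and not the actual DistCp patch: the ChecksumReader interface is a hypothetical stand-in for FileSystem#getFileChecksum so that the example runs without Hadoop on the classpath, and the skipCrcCheck parameter is an assumed way of modeling the -skipcrccheck option.

```java
import java.io.IOException;
import java.util.Arrays;

final class ChecksumCheckSketch {

    /** Hypothetical stand-in for FileSystem#getFileChecksum (not a Hadoop type). */
    interface ChecksumReader {
        byte[] read(String path) throws IOException;
    }

    /**
     * Compares the checksums of source and target.
     * Unlike the snippet quoted above, a failure to read either checksum
     * is rethrown to the caller instead of being logged and ignored.
     */
    static boolean checksumsAreEqual(ChecksumReader sourceFS, String source,
                                     ChecksumReader targetFS, String target,
                                     boolean skipCrcCheck) throws IOException {
        if (skipCrcCheck) {
            // Assumption: the user explicitly opted out of checksum comparison.
            return true;
        }
        final byte[] sourceChecksum;
        final byte[] targetChecksum;
        try {
            sourceChecksum = sourceFS.read(source);
            targetChecksum = targetFS.read(target);
        } catch (IOException e) {
            // Propagate with context so the copy stops rather than overwriting.
            throw new IOException("Unable to retrieve checksum for " + source
                + " or " + target, e);
        }
        // How a missing (null) checksum should be treated is a separate
        // question that this sketch does not try to answer.
        return Arrays.equals(sourceChecksum, targetChecksum);
    }
}
```

With this shape, a caller that does not pass the skip flag sees the IOException and can abort the copy, while a caller that does pass it never attempts to read the checksums at all.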