[ https://issues.apache.org/jira/browse/HADOOP-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kai Xie updated HADOOP-16158:
-----------------------------
    Attachment:     (was: HADOOP-16158-001.patch)

> DistCp to support checksum validation when copy blocks in parallel
> ------------------------------------------------------------------
>
>                 Key: HADOOP-16158
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16158
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools/distcp
>    Affects Versions: 3.2.0, 2.9.2, 3.0.3, 3.1.2
>            Reporter: Kai Xie
>            Assignee: Kai Xie
>            Priority: Major
>
> Copying blocks in parallel (enabled when blocks per chunk > 0) is a great
> DistCp improvement that can hugely speed up copying big files.
> But its checksum validation is skipped, e.g. in
> `RetriableFileCopyCommand.java`
>
> {code:java}
> if (!source.isSplit()) {
>   compareCheckSums(sourceFS, source.getPath(), sourceChecksum,
>       targetFS, targetPath);
> }
> {code}
> and this could result in checksum/data mismatch without notifying
> developers/users (e.g. HADOOP-16049).
> I'd like to provide a patch to add the checksum validation.
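
To illustrate the gap described above, here is a minimal, self-contained sketch of the general idea: copy a payload in parallel chunks, then compare a checksum of the whole source against the reassembled target. It is not the proposed patch and does not use any Hadoop API. The class `SplitCopyChecksumSketch`, its methods `copyInChunks` and `checksumsMatch`, and the use of in-memory byte arrays with `java.util.zip.CRC32` are all assumptions made for illustration; DistCp itself works on `FileSystem` paths and file checksums.

```java
import java.util.stream.IntStream;
import java.util.zip.CRC32;

// Hypothetical illustration only; none of these names exist in Hadoop.
class SplitCopyChecksumSketch {

    // CRC32 over the whole payload; a stand-in for a file-level checksum.
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Copies source into a new array, one chunk per task, in parallel.
    static byte[] copyInChunks(byte[] source, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] target = new byte[source.length];
        int chunks = (source.length + chunkSize - 1) / chunkSize;
        IntStream.range(0, chunks).parallel().forEach(i -> {
            int start = i * chunkSize;
            int length = Math.min(chunkSize, source.length - start);
            System.arraycopy(source, start, target, start, length);
        });
        return target;
    }

    // Validation step that runs after all chunks are back together:
    // compare the whole-source checksum with the whole-target checksum.
    static boolean checksumsMatch(byte[] source, byte[] target) {
        return checksum(source) == checksum(target);
    }

    public static void main(String[] args) {
        byte[] source = new byte[1000];
        for (int i = 0; i < source.length; i++) {
            source[i] = (byte) (i * 31);
        }
        byte[] target = copyInChunks(source, 128);
        System.out.println("match after copy: " + checksumsMatch(source, target));

        // Simulate silent corruption of a single chunk.
        target[500] ^= 1;
        System.out.println("match after corruption: " + checksumsMatch(source, target));
    }
}
```

The point of the sketch is only that the comparison happens once, over the complete target, after the parallel chunk copies have finished. How the actual fix obtains comparable checksums for split files is left to the patch.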