[ https://issues.apache.org/jira/browse/MAPREDUCE-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tsz Wo (Nicholas), SZE updated MAPREDUCE-1231:
----------------------------------------------

    Fix Version/s:     (was: 0.20.2)
       Issue Type: Improvement  (was: Bug)

Patch looks mostly good.  Some comments below.

- Why was testCopyDuplication() renamed to atestCopyDuplication()?  Typo?
- If -skipcrccheck is specified without -update, it should print an error message and exit.
- Could you change ".ignore.crc" in the following to ".skip.crc.check"?  It is better to keep the wording consistent.
{code}
+    SKIPCRC("-skipcrccheck", NAME + ".ignore.crc");
{code}

> Distcp is very slow
> -------------------
>
>                 Key: MAPREDUCE-1231
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1231
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: distcp
>    Affects Versions: 0.20.1
>            Reporter: Jothi Padmanabhan
>            Assignee: Jothi Padmanabhan
>         Attachments: mapred-1231-v1.patch, mapred-1231-v2.patch, mapred-1231-y20-v2.patch, mapred-1231-y20.patch, mapred-1231.patch
>
>
> Currently distcp does a checksum check in addition to a file length check to
> decide if a remote file has to be copied. If the number of files is high
> (thousands), this checksum check proves fairly costly, leading to a
> long delay before the copy starts.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
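
For illustration only, here is a minimal sketch of the second review comment (rejecting -skipcrccheck without -update) and of the copy decision described in the issue. It is not the code from the attached patches: the class name, the Flag enum, and the parse and needsCopy helpers are hypothetical. Only the two flag strings come from the discussion above.

```java
import java.util.EnumSet;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; not the actual DistCp option handling.
final class SkipCrcOptionSketch {
    enum Flag { UPDATE, SKIPCRC }

    // Collects the two flags of interest and rejects -skipcrccheck on its own.
    static EnumSet<Flag> parse(String[] args) {
        EnumSet<Flag> flags = EnumSet.noneOf(Flag.class);
        for (String arg : args) {
            if ("-update".equals(arg)) {
                flags.add(Flag.UPDATE);
            } else if ("-skipcrccheck".equals(arg)) {
                flags.add(Flag.SKIPCRC);
            }
        }
        if (flags.contains(Flag.SKIPCRC) && !flags.contains(Flag.UPDATE)) {
            throw new IllegalArgumentException(
                "-skipcrccheck is only valid together with -update");
        }
        return flags;
    }

    // Decides whether a file must be copied. The checksum comparison is passed
    // as a supplier so that it is never evaluated when lengths differ or when
    // the CRC check is skipped.
    static boolean needsCopy(long srcLen, long dstLen, boolean skipCrc,
                             BooleanSupplier checksumsMatch) {
        if (srcLen != dstLen) {
            return true;
        }
        if (skipCrc) {
            return false;
        }
        return !checksumsMatch.getAsBoolean();
    }

    public static void main(String[] args) {
        try {
            parse(args);
        } catch (IllegalArgumentException e) {
            // Print an error message and exit, as the review asks.
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
}
```

In this sketch, skipping the CRC check means equal-length files are treated as unchanged, which is the trade-off the option would make to avoid the per-file checksum cost.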