[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449661#comment-13449661 ]

Hudson commented on HDFS-3054:
------------------------------

Integrated in Hadoop-Mapreduce-trunk #1188 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1188/])
HDFS-3054. distcp -skipcrccheck has no effect. Contributed by Colin Patrick McCabe. (Revision 1381296)

Result = ABORTED
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381296
Files :
* /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyMapper.java
* /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java

> distcp -skipcrccheck has no effect
> ----------------------------------
>
>                 Key: HDFS-3054
>                 URL: https://issues.apache.org/jira/browse/HDFS-3054
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 0.23.2, 2.0.0-alpha, 2.0.1-alpha, 2.2.0-alpha
>            Reporter: patrick white
>            Assignee: Colin Patrick McCabe
>             Fix For: 3.0.0, 2.2.0-alpha
>
>         Attachments: HDFS-3054.002.patch, HDFS-3054.004.patch, hdfs-3054.patch
>
>
> Using distcp with '-skipcrccheck' still seems to cause CRC checksums to
> happen.
> Ran into this while debugging an issue associated with source and destination
> having different blocksizes, and not using the preserve blocksize parameter
> (-pb). In both the 0.23.1 and 0.23.2 builds, trying to bypass the checksum
> verification by using the '-skipcrccheck' parameter had no effect; the
> distcp still failed on checksum errors.
> Test scenario to reproduce:
> Do not use '-pb' and try a distcp from 0.20.205 (default blksize=128M) to 0.23
> (default blksize=256M). The distcp fails on checksum errors, which is
> expected due to the checksum calculation (tiered aggregation of all blks).
> Trying the same distcp, only adding '-skipcrccheck', still fails with the same
> checksum error. The expectation is that the checksum would now be bypassed
> and the distcp would proceed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
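
The report attributes the expected mismatch to the "tiered aggregation of all blks" in the checksum calculation. The sketch below is a toy model of that idea, not the real HDFS algorithm: it takes a CRC32 per block and then an MD5 over the per-block CRCs. The class name `TieredChecksumDemo`, the method `fileChecksum`, and the tiny block sizes are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Toy model of a tiered file checksum. Hypothetical; NOT the HDFS implementation. */
final class TieredChecksumDemo {

    /** Tier 1: CRC32 of each block. Tier 2: MD5 over the concatenated block CRCs. */
    static byte[] fileChecksum(byte[] data, int blockSize) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            for (int off = 0; off < data.length; off += blockSize) {
                int len = Math.min(blockSize, data.length - off);
                CRC32 crc = new CRC32();
                crc.update(data, off, len);
                // Feed the 4-byte block CRC into the file-level digest.
                md5.update(ByteBuffer.allocate(4).putInt((int) crc.getValue()).array());
            }
            return md5.digest();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        byte[] small = fileChecksum(data, 128); // stands in for the source block size
        byte[] large = fileChecksum(data, 256); // stands in for the destination block size
        System.out.println("same bytes, same block size match: "
                + Arrays.equals(small, fileChecksum(data, 128)));
        System.out.println("same bytes, different block size match: "
                + Arrays.equals(small, large));
    }
}
```

In this model the file bytes are identical in both calls, yet the digests should differ once the block boundaries move, which is consistent with the report's point that a copy made without '-pb' fails verification even when the data is intact.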
[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449624#comment-13449624 ]

Hudson commented on HDFS-3054:
------------------------------

Integrated in Hadoop-Hdfs-trunk #1157 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1157/])
HDFS-3054. distcp -skipcrccheck has no effect. Contributed by Colin Patrick McCabe. (Revision 1381296)

Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381296
Files :
* /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyMapper.java
* /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java
[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449025#comment-13449025 ]

Hudson commented on HDFS-3054:
------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #2709 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2709/])
HDFS-3054. distcp -skipcrccheck has no effect. Contributed by Colin Patrick McCabe. (Revision 1381296)

Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381296
Files :
* /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyMapper.java
* /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java
[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449014#comment-13449014 ]

Hudson commented on HDFS-3054:
------------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #2748 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2748/])
HDFS-3054. distcp -skipcrccheck has no effect. Contributed by Colin Patrick McCabe. (Revision 1381296)

Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381296
Files :
* /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyMapper.java
* /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java
[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449012#comment-13449012 ]

Hudson commented on HDFS-3054:
------------------------------

Integrated in Hadoop-Common-trunk-Commit #2685 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2685/])
HDFS-3054. distcp -skipcrccheck has no effect. Contributed by Colin Patrick McCabe. (Revision 1381296)

Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381296
Files :
* /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyMapper.java
* /hadoop/common/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java
[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449000#comment-13449000 ]

Todd Lipcon commented on HDFS-3054:
-----------------------------------

+1, will commit momentarily.
[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448943#comment-13448943 ]

Hadoop QA commented on HDFS-3054:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12543879/HDFS-3054.004.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse. The patch built with eclipse:eclipse.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

+1 core tests. The patch passed unit tests in hadoop-tools/hadoop-distcp.

+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3149//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3149//console

This message is automatically generated.
[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448883#comment-13448883 ]

Todd Lipcon commented on HDFS-3054:
-----------------------------------

style nits:
- missing space after ',' in RetriableFileCopyCommand's constructor definition, and its usage
- please reorder the '@param' in the javadoc to match the order of arguments

otherwise +1
[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448386#comment-13448386 ]

Colin Patrick McCabe commented on HDFS-3054:
--------------------------------------------

testing done: I confirmed that copying files from a cluster running branch-1 derived code to a cluster running branch-2 derived code did *not* work unless {{-skipcrccheck}} was supplied.

The exception was this:
{code}
Error: java.io.IOException: File copy failed: hftp://172.22.1.204:6001/a/xx --> hdfs://localhost:6000/b/a/xx
        at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:267)
        at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229)
        at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:726)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:153)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:148)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying hftp://172.22.1.204:6001/a/xx to hdfs://localhost:6000/b/a/xx
        at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
        at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:263)
        ... 10 more
Caused by: java.io.IOException: Check-sum mismatch between hftp://172.22.1.204:6001/a/xx and hdfs://localhost:6000/b/.distcp.tmp.attempt_1346456743556_0010_m_01_0
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.compareCheckSums(RetriableFileCopyCommand.java:145)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:107)
        at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:83)
        at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
        ... 11 more
{code}

With {{-skipcrccheck}}, this problem did not occur.
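
The stack trace shows the failure coming from a checksum comparison made during the copy, and the test result shows that supplying {{-skipcrccheck}} avoids it. A minimal sketch of that gating follows. It does not reproduce the patch: `SkipCrcCheckSketch`, `verifyCopy`, and the `skipCrc` parameter are hypothetical names, and plain byte arrays stand in for file checksums.

```java
import java.io.IOException;
import java.util.Arrays;

/** Hypothetical sketch of gating a post-copy checksum comparison on a skip flag. */
final class SkipCrcCheckSketch {

    /**
     * Verifies a finished copy. When skipCrc is true (the role -skipcrccheck
     * plays on the command line) the comparison is bypassed entirely.
     */
    static void verifyCopy(byte[] sourceChecksum, byte[] targetChecksum, boolean skipCrc)
            throws IOException {
        if (skipCrc) {
            return; // the caller asked for no checksum verification
        }
        if (!Arrays.equals(sourceChecksum, targetChecksum)) {
            throw new IOException("Check-sum mismatch between source and target");
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] source = {1, 2, 3};
        byte[] target = {9, 9, 9}; // differs, e.g. because the block sizes differ

        verifyCopy(source, target, true);
        System.out.println("skipCrc=true: copy accepted");

        try {
            verifyCopy(source, target, false);
        } catch (IOException e) {
            System.out.println("skipCrc=false: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that the flag has to reach the code that performs the comparison; a flag that is parsed but never consulted there would reproduce the reported symptom.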
[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446569#comment-13446569 ]

Colin Patrick McCabe commented on HDFS-3054:
--------------------------------------------

I confirmed through manual testing that -skipcrccheck does indeed cause the crc checking paths to be bypassed.

However, I found this in the code, in {{DistCpUtils#checksumsAreEquals}}:
{code}
    try {
      sourceChecksum = sourceFS.getFileChecksum(source);
      targetChecksum = targetFS.getFileChecksum(target);
    } catch (IOException e) {
      LOG.error("Unable to retrieve checksum for " + source + " or " + target, e);
    }
{code}

I think this should be a fatal error for the distcp operation unless {{-skipcrccheck}} is set. Silently ignoring checksums if we can't find them doesn't seem like a good behavior. Perhaps we should open a different JIRA for that, though...
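
The behavior proposed in the comment above (treat a failed checksum lookup as fatal unless {{-skipcrccheck}} is set) could look roughly like the following. This is a sketch under stated assumptions, not the code that was committed: `StrictChecksumSketch`, the `ChecksumSource` interface, and `checksumsAreEqual` are hypothetical stand-ins, with `ChecksumSource` playing the role of a call such as `getFileChecksum`.

```java
import java.io.IOException;
import java.util.Arrays;

/** Hypothetical sketch: a checksum lookup failure is fatal unless skipping was requested. */
final class StrictChecksumSketch {

    /** Stand-in for a call that fetches one file's checksum and may fail. */
    interface ChecksumSource {
        byte[] get() throws IOException;
    }

    static boolean checksumsAreEqual(ChecksumSource source, ChecksumSource target,
                                     boolean skipCrc) throws IOException {
        if (skipCrc) {
            return true; // verification explicitly disabled by the caller
        }
        byte[] sourceChecksum;
        byte[] targetChecksum;
        try {
            sourceChecksum = source.get();
            targetChecksum = target.get();
        } catch (IOException e) {
            // Propagate instead of logging and carrying on as if the check passed.
            throw new IOException("Unable to retrieve checksum for source or target", e);
        }
        return Arrays.equals(sourceChecksum, targetChecksum);
    }

    public static void main(String[] args) throws IOException {
        ChecksumSource ok = () -> new byte[] {1, 2};
        ChecksumSource failing = () -> {
            throw new IOException("checksum unavailable");
        };
        System.out.println("lookup fails, skipCrc=true: "
                + checksumsAreEqual(failing, ok, true));
        try {
            checksumsAreEqual(failing, ok, false);
        } catch (IOException e) {
            System.out.println("lookup fails, skipCrc=false: " + e.getMessage());
        }
    }
}
```

Keeping the skip decision in the same method as the lookup means there is a single place where "no verification" can happen, and it only happens on request.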
[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439740#comment-13439740 ]

Colin Patrick McCabe commented on HDFS-3054:
--------------------------------------------

bq. How about just corrupting the block files manually itself ala TestFSInputChecker? Probably best to add a @VisibleForTesting method in MiniDFSCluster that corrupts the block.

MiniDFSCluster is part of HDFS, this isn't.
[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439707#comment-13439707 ]

Eli Collins commented on HDFS-3054:
-----------------------------------

How about just corrupting the block files manually itself ala TestFSInputChecker?

Also, let's manually test that we can distcp from a v1 cluster to a v2 cluster with this patch (using skipcrc since HADOOP-8060 is not yet ready).
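
As a rough illustration of the "corrupt the file manually" idea, the sketch below flips one byte of a local file in place. It assumes nothing about HDFS block layout or about TestFSInputChecker; `CorruptFileSketch` and `flipByte` are hypothetical names, and a real test would still need to locate the block or checksum file to damage.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical test helper: corrupt a local file by inverting a single byte. */
final class CorruptFileSketch {

    /** Inverts every bit of the byte at the given offset, leaving the length unchanged. */
    static void flipByte(Path file, long offset) {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.seek(offset);
            int original = raf.read();
            if (original < 0) {
                throw new IllegalArgumentException("offset is past the end of the file");
            }
            raf.seek(offset);
            raf.write(original ^ 0xFF); // write(int) stores only the low 8 bits
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("block", ".bin");
        Files.write(file, new byte[] {10, 20, 30});
        flipByte(file, 1);
        System.out.println(Arrays.toString(Files.readAllBytes(file)));
        Files.delete(file);
    }
}
```

With a file corrupted this way, a test could expect a checksum-verified copy to fail and a copy made with the skip option to succeed.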
[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439207#comment-13439207 ]

Colin Patrick McCabe commented on HDFS-3054:
--------------------------------------------

Hi Rahul,

This patch looks good to me overall. I created a rebased version of it that incorporates changes from trunk.

I wish we could have a unit test for this. It's a little difficult to create a good one without getting kind of uncomfortably coupled to the HDFS code (after all this is in tools). We would probably have to create an API to corrupt file checksums in HDFS, and use that. Then we could be sure that distcp -skipcrccheck was not consulting the CRC. I'm not sure if this is worth the extra work, though... thoughts?
[jira] [Commented] (HDFS-3054) distcp -skipcrccheck has no effect
[ https://issues.apache.org/jira/browse/HDFS-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432582#comment-13432582 ]

Hadoop QA commented on HDFS-3054:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12540262/hdfs-3054.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 javac. The patch appears to cause the build to fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2985//console

This message is automatically generated.