[ 
https://issues.apache.org/jira/browse/HADOOP-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251654#comment-13251654
 ] 

Dave Thompson commented on HADOOP-8233:
---------------------------------------

Hey Allen, fwiw, that attachment is not the fix for this ticket. I hope you
weren't assuming otherwise before the ticket is in Patch Available (PA) state.

Regarding tests, I've been unit testing by creating files whose block size 
differs from the system default.  Something along the lines of:

 hdfs dfs -Ddfs.blocksize=33554432 -put testData /user/davet/testDataBS32MB

Likewise for a zero-length file:

 touch bla
 hdfs dfs -put bla /user/davet/bla

distcp is then run on the above data with system defaults.  These tests fail 
without this patch, and will succeed once the patch is complete.
                
> Turn CRC checking off for 0 byte size and differing blocksizes
> --------------------------------------------------------------
>
>                 Key: HADOOP-8233
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8233
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.23.3
>            Reporter: Dave Thompson
>            Assignee: Dave Thompson
>         Attachments: HADOOP-8233-branch-0.23.2.patch
>
>
> DistcpV2 (hadoop-tools/hadoop-distcp/..) can sometimes fail with a checksum 
> error when copying a 0 byte file.  The root cause may be inconsistent HDFS 
> behavior when creating 0 byte files; however, distcp can avoid this issue by 
> not checking CRC when the size is zero.
> Further, distcp fails the checksum comparison when copying between two 
> clusters that use different blocksizes.  In this case it does not make sense 
> to check CRC, as it is a guaranteed failure.
> We need to turn CRC checking off for the above two cases.
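
The two cases in the description reduce to a small decision per file. The sketch below is only an illustration of that rule, not the patch itself: the function name crc_decision and the sample sizes are hypothetical, and the real change belongs in the distcp Java code. It takes a file length and the source and target block sizes, and prints whether a CRC comparison would be meaningful.

```shell
#!/bin/sh
# Hypothetical helper (not part of the HADOOP-8233 patch): decide whether a
# post-copy CRC comparison is meaningful for a single file.
# Arguments: file length in bytes, source block size, target block size.
crc_decision() {
    length=$1
    src_blocksize=$2
    dst_blocksize=$3

    # Case 1: zero-length file, so skip the CRC check.
    if [ "$length" -eq 0 ]; then
        echo skip
        return 0
    fi

    # Case 2: block sizes differ, so the comparison is a guaranteed failure.
    if [ "$src_blocksize" -ne "$dst_blocksize" ]; then
        echo skip
        return 0
    fi

    # Otherwise the CRC comparison is worth doing.
    echo compare
}

# Sample values are assumptions chosen to mirror the tests in the comment:
# 33554432 bytes = 32 MB, 67108864 bytes = 64 MB.
crc_decision 0 67108864 67108864          # zero-length file
crc_decision 1048576 33554432 67108864    # 32 MB source blocks, 64 MB target
crc_decision 1048576 67108864 67108864    # same block size on both sides
```

The first two calls correspond to the zero-length and differing-blocksize tests from the comment; the third is the ordinary case where the check should still run.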

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
