[ https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412902#comment-16412902 ]
genericqa commented on HDFS-13056: ---------------------------------- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 5m 1s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 25m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 36s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 25s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 23m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 23m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 23m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 27s{color} | {color:green} root: The patch generated 0 new + 680 unchanged - 14 fixed = 680 total (was 694) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 7m 59s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 20s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 26s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 34s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 15s{color} | {color:green} hadoop-distcp in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}226m 42s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy | | | hadoop.hdfs.TestDFSClientRetries | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8620d2b | | JIRA Issue | HDFS-13056 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12916082/HDFS-13056.012.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc xml | | uname | Linux 49f88ed8cc51 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 28790b8 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/23659/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/23659/testReport/ | | Max. process+thread count | 4519 (vs. ulimit of 10000) | | modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-tools/hadoop-distcp U: . | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/23659/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Expose file-level composite CRCs in HDFS which are comparable across > different instances/layouts > ------------------------------------------------------------------------------------------------ > > Key: HDFS-13056 > URL: https://issues.apache.org/jira/browse/HDFS-13056 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, distcp, erasure-coding, federation, hdfs > Affects Versions: 3.0.0 > Reporter: Dennis Huo > Assignee: Dennis Huo > Priority: Major > Attachments: HDFS-13056-branch-2.8.001.patch, > HDFS-13056-branch-2.8.002.patch, HDFS-13056-branch-2.8.003.patch, > HDFS-13056-branch-2.8.004.patch, HDFS-13056-branch-2.8.005.patch, > HDFS-13056-branch-2.8.poc1.patch, HDFS-13056.001.patch, HDFS-13056.002.patch, > HDFS-13056.003.patch, HDFS-13056.003.patch, HDFS-13056.004.patch, > HDFS-13056.005.patch, HDFS-13056.006.patch, HDFS-13056.007.patch, > HDFS-13056.008.patch, HDFS-13056.009.patch, HDFS-13056.010.patch, > HDFS-13056.011.patch, HDFS-13056.012.patch, > Reference_only_zhen_PPOC_hadoop2.6.X.diff, hdfs-file-composite-crc32-v1.pdf, > hdfs-file-composite-crc32-v2.pdf, hdfs-file-composite-crc32-v3.pdf > > > FileChecksum was first introduced in > [https://issues-test.apache.org/jira/browse/HADOOP-3981] and ever since then > has remained defined as MD5-of-MD5-of-CRC, where per-512-byte chunk CRCs are > already stored as part of datanode metadata, and the MD5 approach is used to > compute an aggregate value in a distributed manner, with individual datanodes > computing the MD5-of-CRCs per-block in parallel, and the HDFS client > computing the second-level MD5. > > A shortcoming of this approach which is often brought up is the fact that > this FileChecksum is sensitive to the internal block-size and chunk-size > configuration, and thus different HDFS files with different block/chunk > settings cannot be compared. More commonly, one might have different HDFS > clusters which use different block sizes, in which case any data migration > won't be able to use the FileChecksum for distcp's rsync functionality or for > verifying end-to-end data integrity (on top of low-level data integrity > checks applied at data transfer time). > > This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 > during the addition of checksum support for striped erasure-coded files; > while there was some discussion of using CRC composability, it still > ultimately settled on hierarchical MD5 approach, which also adds the problem > that checksums of basic replicated files are not comparable to striped files. > > This feature proposes to add a "COMPOSITE-CRC" FileChecksum type which uses > CRC composition to remain completely chunk/block agnostic, and allows > comparison between striped vs replicated files, between different HDFS > instances, and possible even between HDFS and other external storage systems. > This feature can also be added in-place to be compatible with existing block > metadata, and doesn't need to change the normal path of chunk verification, > so is minimally invasive. This also means even large preexisting HDFS > deployments could adopt this feature to retroactively sync data. A detailed > design document can be found here: > https://storage.googleapis.com/dennishuo/hdfs-file-composite-crc32-v1.pdf -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org