[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15309065#comment-15309065 ] Rakesh R commented on HDFS-9833: Thanks a lot [~drankye] for the good support! > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, > HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch, HDFS-9833-08.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307519#comment-15307519 ] Kai Zheng commented on HDFS-9833: - The latest patch LGTM and +1. Will commit it tomorrow. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, > HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch, HDFS-9833-08.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307427#comment-15307427 ] Rakesh R commented on HDFS-9833: Test case failures {{TestRollingUpgrade.testRollback}} and {{TestEditLog.testBatchedSyncWithClosedLogs}} are not related to my patch, pls ignore it. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, > HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch, HDFS-9833-08.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307410#comment-15307410 ] Hadoop QA commented on HDFS-9833: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 58s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 54s {color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 56s {color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 104m 12s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.TestEditLog | | | hadoop.hdfs.TestRollingUpgrade | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12807038/HDFS-9833-08.patch | | JIRA Issue | HDFS-9833 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux 69937bcccfc9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 93d8a7f | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15612/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | unit test logs | https://builds.apache.org/job/PreCommit-HDFS-Build/15612/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results |
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307210#comment-15307210 ] Rakesh R commented on HDFS-9833: Thanks [~drankye] for the detailed reviews. Attached patch addressing these comments. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, > HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch, HDFS-9833-08.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307142#comment-15307142 ] Kai Zheng commented on HDFS-9833: - Thanks [~rakeshr] for the update. It looks much close now. I have two more comments: 1. Would it be good to use {{StripedReconstructionInfo}} in the {{StripedReconstructor}} family? I mean, we can convert {{BlockECReconstructionInfo}} to StripedReconstructionInfo to pass down, and we can use it in the base constructor and so on. {code} + public StripedBlockChecksumReconstructor(ErasureCodingWorker worker, + ExtendedBlock blockGroup, DatanodeInfo[] srcNodes, byte[] liveIndices, + int[] targetIndices, ErasureCodingPolicy ecPolicy, + DataOutputBuffer checksumWriter) throws IOException {code} 2. How about {{ECBlockInfo}} => {{LiveBlockInfo}}, and {{setChecksumProperties}} => {{setOrCheckChecksumProperties}}? +1 once addressed. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, > HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306849#comment-15306849 ] Hadoop QA commented on HDFS-9833: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 43s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 45s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 26s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 55s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s {color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 73m 44s {color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 103m 52s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12806989/HDFS-9833-07.patch | | JIRA Issue | HDFS-9833 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux b26d85edcff0 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 4e1f56e | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/15605/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15605/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Erasure coding: recomputing block checksum on the fly by
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306791#comment-15306791 ] Rakesh R commented on HDFS-9833: It seems HADOOP-13010 has changed {{StripedReconstructor.java}} file. I've rebased the patch on top of latest trunk code. Also, used {{Map}} instead of {{HashMap}}. [~drankye], kindly review latest patch when you get a chance. Thanks! > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, > HDFS-9833-05.patch, HDFS-9833-06.patch, HDFS-9833-07.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306265#comment-15306265 ] Rakesh R commented on HDFS-9833: Test case failures are unrelated to my patch, please ignore it. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, > HDFS-9833-05.patch, HDFS-9833-06.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300537#comment-15300537 ] Rakesh R commented on HDFS-9833: [~drankye], I have created the follow-on tasks HDFS-10460 and HDFS-10461. I will start working on these after the basic patch in this jira is committed. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, > HDFS-9833-05.patch, HDFS-9833-06.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300329#comment-15300329 ] Hadoop QA commented on HDFS-9833: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 2s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 1s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 4s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 26s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 59s {color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 21s {color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 111m 39s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeMXBean | | | hadoop.hdfs.TestAsyncDFSRename | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12806134/HDFS-9833-06.patch | | JIRA Issue | HDFS-9833 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux 8ccfe771cabb 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9a31e5d | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15560/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | unit test logs | https://builds.apache.org/job/PreCommit-HDFS-Build/15560/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results |
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300129#comment-15300129 ] Rakesh R commented on HDFS-9833: Attached new patch fixing checkstyle warnings. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, > HDFS-9833-05.patch, HDFS-9833-06.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300065#comment-15300065 ] Hadoop QA commented on HDFS-9833: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 2s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 5s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 29s {color} | {color:red} hadoop-hdfs-project: patch generated 3 new + 118 unchanged - 0 fixed = 121 total (was 118) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s {color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 34s {color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 83m 50s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.TestEditLog | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12806119/HDFS-9833-05.patch | | JIRA Issue | HDFS-9833 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux 6b2479e80ca6 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / dcbb700 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15559/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15559/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | unit test logs |
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1521#comment-1521 ] Rakesh R commented on HDFS-9833: Thanks a lot [~drankye] for reviewing the patch and offline discussions. I've uploaded new patch addressing the comments except 2nd point. Also, added new test case to verifiy checksum after node decommissioning(block locations will be duplicated after decommn operation). bq. HashMap here might be little heavy, an array should work instead. Checksum logic is using {{namenode.getBlockLocations(src, start, length)}} to get the block locations. This list is not guaranteeing any order and also list contains duplicated block info(index and its source node). Now, while computing the block checksum it needs to skip the block which is already considered previously. With {{HashMap}} all these cases will be handled internally(removes duplicate index and maintains ascending order), I feel this makes the logic simple. Also, this hashmap is used locally and contains only very few entries. With array, we need to add extra logic to skip the duplicate nodes and may need to add sorting logic. Whats your opinion to use existing hashmap? Below is sample block indices list after the decommissioning operation. {{'}} represents decommissioned node index. Here, this list contains duplicated blocks and not maintaining any order. {code} 0, 2, 3, 4, 5, 6, 7, 8, 1, 1' {code} > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, HDFS-9833-05.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299652#comment-15299652 ] Kai Zheng commented on HDFS-9833: - 1. The following two would be good to have the same names: {code} + static public byte[] convertBlockIndices(List blockIndices) { +@SuppressWarnings("unchecked") +byte[] blkIndices = new byte[blockIndices.size()]; +for (int i = 0; i < blockIndices.size(); i++) { + blkIndices[i] = (byte) blockIndices.get(i).intValue(); +} +return blkIndices; + } + + public static List convert(byte[] blockIndices) { +List results = new ArrayList<>(blockIndices.length); +for (byte bt : blockIndices) { + results.add(Integer.valueOf(bt)); +} +return results; + } + {code} 2. HashMap here might be little heavy, an array should work instead. {code} HashMapliveDns = new HashMap<>(datanodes.length); {code} 3. The logic below looks like to recompute only one block checksum a time. The reconstruction can be done just by a time if multiple blocks are in the question. Sure it can be updated in follow-on tasks. {code} + for (int idx = 0; idx < numDataUnits && idx < blkIndxLen; idx++) { +try { + ECBlockInfo ecBlkInfo = liveDns.get((byte) idx); + if (null == ecBlkInfo) { +// reconstruct block and calculate checksum for missing node +recalculateChecksum(idx); + } else { +try { + ExtendedBlock block = StripedBlockUtil.constructInternalBlock( + blockGroup, ecPolicy.getCellSize(), numDataUnits, idx); + checksumBlock(block, idx, ecBlkInfo.getToken(), + ecBlkInfo.getDn()); +} catch (IOException ioe) { + LOG.warn("Exception while reading checksum", ioe); + // reconstruct block and calculate checksum for the failed node + recalculateChecksum(idx); +} + } +} catch (IOException e) { + LOG.warn("Failed to get the checksum", e); +} {code} 4. Could we have some wrapper like *ReconstructionInfo* to contain the relevant parameters? I'm afraid we may need more in future ... {code} StripedReader(StripedReconstructor reconstructor, DataNode datanode, -Configuration conf, -BlockECReconstructionInfo reconstructionInfo) { + Configuration conf, ErasureCodingPolicy ecPolicy, + ExtendedBlock blockGroup, byte[] liveIndices, DatanodeInfo[] sources) { {code} 5. The following *TODO* should be resolved since you have added two tests in datanode failures. {code} // TODO: allow datanode failure, HDFS-9833 {code} > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298307#comment-15298307 ] Kai Zheng commented on HDFS-9833: - Thanks Rakesh for the update on this. I will take a careful review tomorrow. Sounds good to me to do the tasks split up and would you please go ahead. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298145#comment-15298145 ] Rakesh R commented on HDFS-9833: Hi, [~drankye], [~umamaheswararao] would be great to see feedback on the latest patch. I will create separate jira tasks if you agree on [tasks split up|https://issues.apache.org/jira/browse/HDFS-9833?focusedCommentId=15295644=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15295644] mentioned earlier in this jira and will try to push this basic patch in. Thanks! Note: I feel the checkstyle warning can be ignored, if needed will rename the args in next patch preparation time. Also, the test case failure is unrelated to my patch. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298102#comment-15298102 ] Hadoop QA commented on HDFS-9833: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 9s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 18s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 28s {color} | {color:red} hadoop-hdfs-project: patch generated 1 new + 104 unchanged - 0 fixed = 105 total (was 104) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s {color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 37s {color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 95m 55s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805847/HDFS-9833-04.patch | | JIRA Issue | HDFS-9833 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux af060e86d437 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / b4078bd | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15543/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15543/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | unit test
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298047#comment-15298047 ] Rakesh R commented on HDFS-9833: I'm attaching new patch. Here I corrected the reconstruction of missing blocks and its checksum calculation logic. Also, added one more unit test case. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297903#comment-15297903 ] Hadoop QA commented on HDFS-9833: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 46s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 11s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 27s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 32s {color} | {color:red} hadoop-hdfs-project: patch generated 8 new + 104 unchanged - 0 fixed = 112 total (was 104) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 3s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 58s {color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 58s {color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 92m 16s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-project/hadoop-hdfs | | | Unread field:BlockChecksumHelper.java:[line 346] | | | Should org.apache.hadoop.hdfs.server.datanode.BlockChecksumHelper$BlockGroupNonStripedChecksumComputer$ECBlockInfo be a _static_ inner class? At BlockChecksumHelper.java:inner class? At BlockChecksumHelper.java:[lines 353-363] | | Failed junit tests | hadoop.hdfs.TestAsyncDFSRename | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805825/HDFS-9833-03.patch | | JIRA Issue | HDFS-9833 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux 6c5e4653856e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297887#comment-15297887 ] Hadoop QA commented on HDFS-9833: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 49s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 39s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 42s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 31s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 34s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 30s {color} | {color:red} hadoop-hdfs-project: patch generated 10 new + 103 unchanged - 0 fixed = 113 total (was 103) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 1s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 0s {color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 8s {color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 97m 30s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-project/hadoop-hdfs | | | Unread field:BlockChecksumHelper.java:[line 347] | | | Should org.apache.hadoop.hdfs.server.datanode.BlockChecksumHelper$BlockGroupNonStripedChecksumComputer$ECBlockInfo be a _static_ inner class? At BlockChecksumHelper.java:inner class? At BlockChecksumHelper.java:[lines 354-364] | | Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer | | | hadoop.hdfs.TestAsyncDFSRename | | | hadoop.hdfs.TestMissingBlocksAlert | | | hadoop.hdfs.TestDFSUpgradeFromImage | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805822/HDFS-9833-03.patch | | JIRA Issue | HDFS-9833 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux 41f122c5b459 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295644#comment-15295644 ] Rakesh R commented on HDFS-9833: [~drankye], Attached patch is addressing basic checksum recalculation of a whole block reconstruction. Like we discussed earlier in this jira, could raise subsequent sub tasks for handling the following cases once the basic patch is in. # handling multiple block failures within a block group. # handling partial block reconstruction and recalculate checksum. I meant, getting checksum for a particular range less than file size cases. Note: Test case failure is unrelated to the patch and this failure will be addressed using HDFS-10434. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295519#comment-15295519 ] Hadoop QA commented on HDFS-9833: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 54s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 2s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 17s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 28s {color} | {color:red} hadoop-hdfs-project: patch generated 7 new + 104 unchanged - 0 fixed = 111 total (was 104) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 50s {color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 71m 55s {color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 97m 57s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805518/HDFS-9833-02.patch | | JIRA Issue | HDFS-9833 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux 8a8e27501bee 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / d8c1fd1 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15521/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/15521/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | unit test
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295497#comment-15295497 ] Rakesh R commented on HDFS-9833: Uploaded another patch fixing failures. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch, > HDFS-9833-02.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289479#comment-15289479 ] Hadoop QA commented on HDFS-9833: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 47s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 44s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 37s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 22s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 29s {color} | {color:red} hadoop-hdfs-project: patch generated 17 new + 104 unchanged - 0 fixed = 121 total (was 104) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 32s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 49s {color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 52s {color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 91m 21s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client | | | org.apache.hadoop.hdfs.protocolPB.PBHelperClient.convert(byte[]) invokes inefficient new Integer(int) constructor; use Integer.valueOf(int) instead At PBHelperClient.java:constructor; use Integer.valueOf(int) instead At PBHelperClient.java:[line 868] | | Failed junit tests | hadoop.hdfs.TestFileChecksum | | | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics | | | hadoop.hdfs.TestAsyncDFSRename | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12804697/HDFS-9833-01.patch | | JIRA Issue | HDFS-9833 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux 763ddb0c750f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15287275#comment-15287275 ] Kai Zheng commented on HDFS-9833: - The big patch looks pretty good. Thanks Rakesh! Some minor comments so far according to a quick look. * Unexpected change in PBHelperClient? {code} -case ENTERING_MAINTENANCE: - return DatanodeInfoProto.AdminState.ENTERING_MAINTENANCE; -case IN_MAINTENANCE: - return DatanodeInfoProto.AdminState.IN_MAINTENANCE; {code} * Good idea to have {{StripedBlockReconstructor}} and {{StripedBlockChecksumReconstructor}} by extending {{StripedReconstructor}}. For StripedBlockChecksumReconstructor, the name of {{md5Writer}} may be renamed to a general one like {{checksumWriter}}? And {{reconstructAndTransfer}} could be {{reconstruct}} or {{reconstructChecksum}} as no transferring will happen here. * I thought the original main comments in StripedReconstructor would be better to remain there because the rough idea still applies to the common base and can be shared by the both subclasses. Look forward to the formal patch! > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284221#comment-15284221 ] Kai Zheng commented on HDFS-9833: - Thanks [~rakeshr] for the big work and I will take some look giving my feedback. bq. How about optimizing the checksum recomputation logic to address multiple datanode failures and reconstructing it together through another sub-task? Sounds a good plan. Handling multiple block failures wouldn't impact big to the existing codes that can work for single block failure. This is similar to the ECWorker. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284207#comment-15284207 ] Rakesh R commented on HDFS-9833: The attached patch is addressing only one target datanode failure at a time and reconstruction it. I meant, when iterating over blockGroup, if it finds a missing index or an exception then will reconstruct this index data and re-calculate the block checksum for this block. How about optimizing the checksum recomputation logic to address multiple datanode failures and reconstructing it together through another sub-task? > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284155#comment-15284155 ] Rakesh R commented on HDFS-9833: [~drankye], [~umamaheswararao], I'm attaching a draft patch to show the proposed algo and the class responsibilities. Kindly go through the changes and would like to see your feedback. Thanks!. I will refine and add more unit test cases in subsequent patches. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > Attachments: HDFS-9833-00-draft.patch > > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275921#comment-15275921 ] Rakesh R commented on HDFS-9833: Thanks [~drankye] for the detailed explanation. I will consider this point and make the necessary changes. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275587#comment-15275587 ] Kai Zheng commented on HDFS-9833: - To clarify my above comments: from practical point of view, it should be good enough to consider just numDataUnits datanodes because it's not likely all the data block resident datanodes all fail, and if it does happen, continuing to try parity block resident datanode won't help. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268438#comment-15268438 ] Kai Zheng commented on HDFS-9833: - Thanks Rakesh for your understanding. It may help to explain about this in other words according to mine. You're right in client side we need to try each datanode in the group and let it do the block group checksum computing. It includes datanodes of both data blocks and and parity blocks because parity block datanodes can also do the same work. Anyway when a datanode in the group is requested to do the computing work, it will request/collect all the checksums for the blocks in the group to compute the block group level checksum to respond to the client call. When all the blocks are fine the existing block checksums are just requested remotely/locally and used, but in case some data block is erased, the similar reconstruction task will be executed on the requested datanode to recompute the block checksum on the fly. Anyway when it fails then it will return failure to the client instead of the normal block group checksum. When the client receives failure it means the requested datanode isn't able to do the work so it will retry with next datanode in the group. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268251#comment-15268251 ] Rakesh R commented on HDFS-9833: Following is the brief idea about the proposed approach. Kindly go through this and would be great to see the feedback on this. Thanks! In our existing striped checksum logic, client is connecting to the first datanode in the block locations and sending {{Op.BLOCK_GROUP_CHECKSUM}} command. He will iterate over {{ecPolicy.getNumDataUnits()}} datanodes and invokes {{Op.BLOCK_CHECKSUM}} command one by one. During these operations it can hit {{IOException}} and fail the checksum call. To begin with, I think will catch generic {{IOException}} while performing operation on a datanode. The block corresponding to the failed datanode will be chosen for reconstruction and then recompute checksum with the reconstructed block data. # Datanode side changes: If there is an IOException while performing {{Op.BLOCK_CHECKSUM}} command then it will consider this block for reconstruction and calculate its checksum. Again the reconstruction errors will fail the checksum call. # Client side changes: Presently {{FileChecksumHelper#checksumBlockGroup()}} function is throwing IOException back to the client if the first datanode has errors, instead will try connecting to {{#getNumParityUnits()}} number of datanodes before failing the checksum operation. Thanks [~umamaheswararao] for the offline discussions. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250095#comment-15250095 ] Rakesh R commented on HDFS-9833: Thank you [~drankye], I will soon come up with a proposal. > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9833) Erasure coding: recomputing block checksum on the fly by reconstructing the missed/corrupt block data
[ https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249965#comment-15249965 ] Kai Zheng commented on HDFS-9833: - Per off-line discussion with [~rakeshr], he'd like to help with this and so reassigned. Thanks Rakesh for the taking! > Erasure coding: recomputing block checksum on the fly by reconstructing the > missed/corrupt block data > - > > Key: HDFS-9833 > URL: https://issues.apache.org/jira/browse/HDFS-9833 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rakesh R > Labels: hdfs-ec-3.0-must-do > > As discussed in HDFS-8430 and HDFS-9694, to compute striped file checksum > even some of striped blocks are missed, we need to consider recomputing block > checksum on the fly for the missed/corrupt blocks. To recompute the block > checksum, the block data needs to be reconstructed by erasure decoding, and > the main needed codes for the block reconstruction could be borrowed from > HDFS-9719, the refactoring of the existing {{ErasureCodingWorker}}. In EC > worker, reconstructed blocks need to be written out to target datanodes, but > here in this case, the remote writing isn't necessary, as the reconstructed > block data is only used to recompute the checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332)