[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941643#comment-16941643 ]

Surendra Singh Lilhore edited comment on HDFS-14768 at 10/1/19 8:50 AM:

Thanks [~gjhkael] for the patch. It needs to be rebased; it does not apply on trunk.

One question: why does the loop run "replicationStreamsHardLimit + 3" times to make the DN busy? I think running it up to {{replicationStreamsHardLimit}} is enough.
{code:java}
for (int j = 0; j < replicationStreamsHardLimit + 3; j++) {
{code}
You can make the DN busy by calling {{incrementPendingReplicationWithoutTargets()}} directly:
{code}
for (int i = 0; i < replicationStreamsHardLimit; i++) {
  busyNode.incrementPendingReplicationWithoutTargets();
}
{code}

was (Author: surendrasingh):
Thanks [~gjhkael] for the patch. It needs to be rebased; it does not apply on trunk.

One question: why does the loop run "replicationStreamsHardLimit + 3" times to make the DN busy? I think running it up to {{replicationStreamsHardLimit}} is enough.
{code:java}
for (int j = 0; j < replicationStreamsHardLimit + 3; j++) {
{code}

> In some cases, erasure blocks are corruption when they are reconstruct.
>
> Key: HDFS-14768
> URL: https://issues.apache.org/jira/browse/HDFS-14768
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, erasure-coding, hdfs, namenode
> Affects Versions: 3.0.2
> Reporter: guojh
> Assignee: guojh
> Priority: Major
> Labels: patch
> Fix For: 3.3.0
>
> Attachments: 1568275810244.jpg, 1568276338275.jpg, 1568771471942.jpg,
> HDFS-14768.000.patch, HDFS-14768.001.patch, HDFS-14768.002.patch,
> HDFS-14768.jpg, guojh_UT_after_deomission.txt,
> guojh_UT_before_deomission.txt, zhaoyiming_UT_after_deomission.txt,
> zhaoyiming_UT_beofre_deomission.txt
>
>
> The policy is RS-6-3-1024K and the version is Hadoop 3.0.2.
> Suppose a file's block indices are [0,1,2,3,4,5,6,7,8]. We decommission the
> DataNodes holding indices [3,4] and increase pendingReplicationWithoutTargets
> on the DataNode holding index 6 until it is larger than
> replicationStreamsHardLimit (we set 14). Then, after the method
> chooseSourceDatanodes of BlockManager, liveBlockIndices is
> [0,1,2,3,4,5,7,8] and the block counters are Live: 7, Decommission: 2.
> In the method scheduleReconstruction of BlockManager, additionalReplRequired
> is 9 - 7 = 2. After the NameNode chooses two target DataNodes, it assigns an
> erasure coding reconstruction task to the target DataNode.
> When the DataNode gets the task, it builds targetIndices from
> liveBlockIndices and the target length. The code is below.
> {code:java}
> // code placeholder
> targetIndices = new short[targets.length];
>
> private void initTargetIndices() {
>   BitSet bitset = reconstructor.getLiveBitSet();
>   int m = 0;
>   hasValidTargets = false;
>   for (int i = 0; i < dataBlkNum + parityBlkNum; i++) {
>     if (!bitset.get(i)) {
>       if (reconstructor.getBlockLen(i) > 0) {
>         if (m < targets.length) {
>           targetIndices[m++] = (short)i;
>           hasValidTargets = true;
>         }
>       }
>     }
>   }
> }
> {code}
> targetIndices[0] = 6, and targetIndices[1] always stays 0, its initial value.
> The StripedReader always creates readers from the first 6 block indices,
> which are [0,1,2,3,4,5].
> Using indices [0,1,2,3,4,5] to build target indices [6,0] triggers the ISA-L
> bug: the data of block index 6 is corrupted (all data is zero).
> I wrote a unit test that reproduces this reliably.
> {code:java}
> // code placeholder
> private int replicationStreamsHardLimit =
>     DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_DEFAULT;
> numDNs = dataBlocks + parityBlocks + 10;
>
> @Test(timeout = 24)
> public void testFileDecommission() throws Exception {
>   LOG.info("Starting test testFileDecommission");
>   final Path ecFile = new Path(ecDir, "testFileDecommission");
>   int writeBytes = cellSize * dataBlocks;
>   writeStripedFile(dfs, ecFile, writeBytes);
>   Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
>   FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
>   final INodeFile fileNode = cluster.getNamesystem().getFSDirectory()
>       .getINode4Write(ecFile.toString()).asFile();
>   LocatedBlocks locatedBlocks =
>       StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
>   LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
>       .get(0);
>   DatanodeInfo[] dnLocs = lb.getLocations();
>   LocatedStripedBlock lastBlock =
>       (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock();
>   DatanodeInfo[] storageInfos = lastBlock.getLocations();
>   //
>   DatanodeDescriptor datanodeDescriptor =
>       cluster.getNameNode().getNamesystem()
>           .getBlockManager().getDatanodeManager().getDatanode(storageInf
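The index arithmetic from the issue description can be reproduced in isolation. The following is a minimal sketch, not Hadoop code: the class name `TargetIndexDemo` and its method signature are hypothetical, and the block-length check from the quoted loop is left out for brevity. It only shows why a two-element target array ends up as [6, 0] when a single index is missing from the live set.

```java
import java.util.Arrays;
import java.util.BitSet;

public class TargetIndexDemo {
  /**
   * Simplified copy of the loop quoted in the description: collect the
   * indices that are NOT in the live set, up to numTargets entries.
   * (Hypothetical helper; the block-length check is omitted.)
   */
  static short[] initTargetIndices(BitSet live, int totalBlocks, int numTargets) {
    short[] targetIndices = new short[numTargets]; // Java zero-fills new arrays
    int m = 0;
    for (int i = 0; i < totalBlocks; i++) {
      if (!live.get(i) && m < numTargets) {
        targetIndices[m++] = (short) i;
      }
    }
    // Slots beyond m were never written and still hold the default value 0.
    return targetIndices;
  }

  public static void main(String[] args) {
    // liveBlockIndices from the description: everything except index 6.
    BitSet live = new BitSet(9);
    for (int i : new int[] {0, 1, 2, 3, 4, 5, 7, 8}) {
      live.set(i);
    }
    // RS-6-3 has 9 internal blocks; the NameNode chose 2 targets.
    System.out.println(Arrays.toString(initTargetIndices(live, 9, 2))); // prints [6, 0]
  }
}
```

The second slot is indistinguishable from a real request to rebuild index 0, which is the mismatch the description points at.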
[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16931021#comment-16931021 ]

guojh edited comment on HDFS-14768 at 9/17/19 2:34 AM:

[~surendrasingh] I uploaded a new patch based on the latest code. Could you help review it?

was (Author: gjhkael):
[~surendrasingh] I uploaded a new patch based on the latest code.

> In some cases, erasure blocks are corruption when they are reconstruct.
>
> Key: HDFS-14768
> URL: https://issues.apache.org/jira/browse/HDFS-14768
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, erasure-coding, hdfs, namenode
> Affects Versions: 3.0.2
> Reporter: guojh
> Assignee: guojh
> Priority: Major
> Labels: patch
> Fix For: 3.3.0
>
> Attachments: 1568275810244.jpg, 1568276338275.jpg,
> HDFS-14768.000.patch, HDFS-14768.0001.patch, guojh_UT_after_deomission.txt,
> guojh_UT_before_deomission.txt, zhaoyiming_UT_after_deomission.txt,
> zhaoyiming_UT_beofre_deomission.txt
>
>
> The policy is RS-6-3-1024K and the version is Hadoop 3.0.2.
> Suppose a file's block indices are [0,1,2,3,4,5,6,7,8]. We decommission the
> DataNodes holding indices [3,4] and increase pendingReplicationWithoutTargets
> on the DataNode holding index 6 until it is larger than
> replicationStreamsHardLimit (we set 14). Then, after the method
> chooseSourceDatanodes of BlockManager, liveBlockIndices is
> [0,1,2,3,4,5,7,8] and the block counters are Live: 7, Decommission: 2.
> In the method scheduleReconstruction of BlockManager, additionalReplRequired
> is 9 - 7 = 2. After the NameNode chooses two target DataNodes, it assigns an
> erasure coding reconstruction task to the target DataNode.
> When the DataNode gets the task, it builds targetIndices from
> liveBlockIndices and the target length. The code is below.
> {code:java}
> // code placeholder
> targetIndices = new short[targets.length];
>
> private void initTargetIndices() {
>   BitSet bitset = reconstructor.getLiveBitSet();
>   int m = 0;
>   hasValidTargets = false;
>   for (int i = 0; i < dataBlkNum + parityBlkNum; i++) {
>     if (!bitset.get(i)) {
>       if (reconstructor.getBlockLen(i) > 0) {
>         if (m < targets.length) {
>           targetIndices[m++] = (short)i;
>           hasValidTargets = true;
>         }
>       }
>     }
>   }
> }
> {code}
> targetIndices[0] = 6, and targetIndices[1] always stays 0, its initial value.
> The StripedReader always creates readers from the first 6 block indices,
> which are [0,1,2,3,4,5].
> Using indices [0,1,2,3,4,5] to build target indices [6,0] triggers the ISA-L
> bug: the data of block index 6 is corrupted (all data is zero).
> I wrote a unit test that reproduces this reliably.
> {code:java}
> // code placeholder
> private int replicationStreamsHardLimit =
>     DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_DEFAULT;
> numDNs = dataBlocks + parityBlocks + 10;
>
> @Test(timeout = 24)
> public void testFileDecommission() throws Exception {
>   LOG.info("Starting test testFileDecommission");
>   final Path ecFile = new Path(ecDir, "testFileDecommission");
>   int writeBytes = cellSize * dataBlocks;
>   writeStripedFile(dfs, ecFile, writeBytes);
>   Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
>   FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
>   final INodeFile fileNode = cluster.getNamesystem().getFSDirectory()
>       .getINode4Write(ecFile.toString()).asFile();
>   LocatedBlocks locatedBlocks =
>       StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
>   LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
>       .get(0);
>   DatanodeInfo[] dnLocs = lb.getLocations();
>   LocatedStripedBlock lastBlock =
>       (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock();
>   DatanodeInfo[] storageInfos = lastBlock.getLocations();
>   //
>   DatanodeDescriptor datanodeDescriptor =
>       cluster.getNameNode().getNamesystem()
>           .getBlockManager().getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid());
>   BlockInfo firstBlock = fileNode.getBlocks()[0];
>   DatanodeStorageInfo[] dStorageInfos = bm.getStorages(firstBlock);
>   // the first heartbeat will consume 3 replica tasks
>   for (int i = 0; i <= replicationStreamsHardLimit + 3; i++) {
>     BlockManagerTestUtil.addBlockToBeReplicated(datanodeDescriptor, new Block(i),
>         new DatanodeStorageInfo[]{dStorageInfos[0]});
>   }
>   assertEquals(dataBlocks + parityBlocks, dnLocs.length);
>   int[] decommNodeIndex = {3, 4};
>   final List<DatanodeInfo> decommisionNodes = new ArrayList<>();
>   // add the node which will be decommissioning
>   decommisionNodes.add(dnLocs[decommNodeIndex[0]]);
>   decommisionNodes.add(dnLocs[decommNodeIndex[1]]);
>   decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED);
>   assertEquals(dec
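The tests in this thread make one DataNode "busy" by pushing its pending replication count up to replicationStreamsHardLimit. The sketch below models that threshold on its own. It is not Hadoop code: `FakeDatanode`, `isBusy` and `incrementsUntilBusy` are hypothetical names, and the `>=` comparison is an assumption about how the NameNode decides to skip a source node. Under that assumption, exactly replicationStreamsHardLimit increments are enough, and the extra "+ 3" only matters if queued work is drained (for example by a heartbeat) before the check runs.

```java
public class BusyNodeDemo {
  /** Hypothetical stand-in for a DataNode descriptor's pending-work counter. */
  static final class FakeDatanode {
    private int pendingReplicationWithoutTargets;

    void incrementPendingReplicationWithoutTargets() {
      pendingReplicationWithoutTargets++;
    }

    int getNumberOfBlocksToBeReplicated() {
      return pendingReplicationWithoutTargets;
    }
  }

  /** Assumed rule: a node is skipped as a source once it reaches the hard limit. */
  static boolean isBusy(FakeDatanode dn, int hardLimit) {
    return dn.getNumberOfBlocksToBeReplicated() >= hardLimit;
  }

  /** Counts how many increments are needed before the node is considered busy. */
  static int incrementsUntilBusy(int hardLimit) {
    FakeDatanode dn = new FakeDatanode();
    int calls = 0;
    while (!isBusy(dn, hardLimit)) {
      dn.incrementPendingReplicationWithoutTargets();
      calls++;
    }
    return calls;
  }

  public static void main(String[] args) {
    // The description sets replicationStreamsHardLimit to 14.
    System.out.println(incrementsUntilBusy(14)); // prints 14
  }
}
```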
[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930251#comment-16930251 ]

Zhao Yi Ming edited comment on HDFS-14768 at 9/16/19 6:19 AM:

[~gjhkael] Thanks for your reply! I am not clear about "*exist two index 6 block, NN may delete the correct one, if data block is missing, then the incorrect block will encode to generate the data block*". My understanding is that the index 6 internal block is on the busy DN, so the reconstruction work will recreate the index 6 internal block on another DN. If so, both index 6 internal blocks are correct. Did I miss anything? Could you tell me more about this? Thanks!

was (Author: zhaoyim):
[~gjhkael] Thanks for your reply! I am not clear about "*exist two index 6 block, NN may delete the correct one*". My understanding is that the index 6 internal block is on the busy DN, so the reconstruction work will recreate the index 6 internal block on another DN. If so, both index 6 internal blocks are correct. Did I miss anything? Could you tell me more about this? Thanks!
[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16930251#comment-16930251 ]

Zhao Yi Ming edited comment on HDFS-14768 at 9/16/19 6:18 AM:

[~gjhkael] Thanks for your reply! I am not clear about "*exist two index 6 block, NN may delete the correct one*". My understanding is that the index 6 internal block is on the busy DN, so the reconstruction work will recreate the index 6 internal block on another DN. If so, both index 6 internal blocks are correct. Did I miss anything? Could you tell me more about this? Thanks!

was (Author: zhaoyim):
[~gjhkael] Thanks for your reply! I am not clear about "*exist two index 6 block*". My understanding is that the index 6 internal block is on the busy DN, so the reconstruction work will recreate the index 6 internal block on another DN. If so, both index 6 internal blocks are correct. Did I miss anything? Could you tell me more about this? Thanks!
[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928348#comment-16928348 ]

guojh edited comment on HDFS-14768 at 9/12/19 8:19 AM:

@[~surendrasingh] I updated the UT in the Description to reproduce the issue, so please try it.

!1568275810244.jpg!

!1568276338275.jpg!

was (Author: gjhkael):
@[~surendrasingh] I updated the UT in the Description to reproduce the issue, so please try it.

!1568275810244.jpg!
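For the scenario in the issue description (live indices [0,1,2,3,4,5,7,8] and two chosen targets), the mismatch can be stated as a simple count: only one index is missing from the live set, but two targets were requested. The sketch below is an illustration only and is not the HDFS-14768 patch; the class `TargetCountCheck` and both of its methods are hypothetical names introduced here.

```java
import java.util.BitSet;

public class TargetCountCheck {
  /** Number of internal block indices in [0, totalBlocks) absent from the live set. */
  static int countMissing(BitSet live, int totalBlocks) {
    int missing = 0;
    for (int i = 0; i < totalBlocks; i++) {
      if (!live.get(i)) {
        missing++;
      }
    }
    return missing;
  }

  /** True when every requested target can be given a genuinely missing index. */
  static boolean targetsMatchMissing(BitSet live, int totalBlocks, int numTargets) {
    return numTargets <= countMissing(live, totalBlocks);
  }

  public static void main(String[] args) {
    BitSet live = new BitSet(9);
    for (int i : new int[] {0, 1, 2, 3, 4, 5, 7, 8}) {
      live.set(i);
    }
    // additionalReplRequired = 9 - 7 = 2, but only index 6 is missing.
    System.out.println(targetsMatchMissing(live, 9, 2)); // prints false
    System.out.println(targetsMatchMissing(live, 9, 1)); // prints true
  }
}
```

A check of this kind would flag the reported case before any decoding starts, because the second target has no missing index to rebuild.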
[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928172#comment-16928172 ]

Zhao Yi Ming edited comment on HDFS-14768 at 9/12/19 3:45 AM:

One more piece of information; I hope it helps. [~gjhkael]'s UT makes the decommission hang, but when I checked the local data folder, the reconstruction seems to have worked well. With my change to the UT, the reconstruction also worked well. I attached the folder file lists.

[^guojh_UT_before_deomission.txt]

[^guojh_UT_after_deomission.txt]

[^zhaoyiming_UT_beofre_deomission.txt]

[^zhaoyiming_UT_after_deomission.txt]

was (Author: zhaoyim):
One more piece of information; I hope it helps. [~gjhkael]'s UT makes the decommission hang, but when I checked the local data folder, the reconstruction seems to have worked well. With my change to the UT, the reconstruction also worked well. I attached the folder file lists.
[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928167#comment-16928167 ]

Zhao Yi Ming edited comment on HDFS-14768 at 9/12/19 3:34 AM:

[~gjhkael] Thanks for your update! One question: in your UT, if the decommisionNodes block is moved before getLastLocatedBlock(), will it affect the UT result? When I made this change, the UT ran normally.

Move the following code
{code:java}
// code placeholder
int[] decommNodeIndex = {3, 4};
final List<DatanodeInfo> decommisionNodes = new ArrayList<>();
// add the node which will be decommissioning
decommisionNodes.add(dnLocs[decommNodeIndex[0]]);
decommisionNodes.add(dnLocs[decommNodeIndex[1]]);
decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED);
assertEquals(decommisionNodes.size(), fsn.getNumDecomLiveDataNodes());
{code}
to
{code:java}
// code placeholder
LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
    .get(0);
DatanodeInfo[] dnLocs = lb.getLocations();
=> Here
LocatedStripedBlock lastBlock =
    (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock();
DatanodeInfo[] storageInfos = lastBlock.getLocations();
{code}

was (Author: zhaoyim):
[~gjhkael] Thanks for your update! One question: in your UT, if the decommisionNodes block is moved before getLastLocatedBlock(), will it affect the UT result? When I made this change, the UT ran normally.

Move the following code
{code:java}
// code placeholder
int[] decommNodeIndex = {3, 4};
final List<DatanodeInfo> decommisionNodes = new ArrayList<>();
// add the node which will be decommissioning
decommisionNodes.add(dnLocs[decommNodeIndex[0]]);
decommisionNodes.add(dnLocs[decommNodeIndex[1]]);
decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED);
assertEquals(decommisionNodes.size(), fsn.getNumDecomLiveDataNodes());
{code}
to
{code:java}
// code placeholder
LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
    .get(0);
DatanodeInfo[] dnLocs = lb.getLocations();
=> Here
LocatedStripedBlock lastBlock =
    (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock();
DatanodeInfo[] storageInfos = lastBlock.getLocations();
{code}
[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927610#comment-16927610 ]

Zhao Yi Ming edited comment on HDFS-14768 at 9/11/19 2:24 PM:

[~gjhkael] I tried the following code (which you said can reliably reproduce the problem), but I always find that the DN cannot be decommissioned successfully. The log says "DatanodeAdminManager - Node 127.0.0.1:49827 still has 1 blocks to replicate before it is a candidate to finish Decommission In Progress.". Is there anything I missed? Can you help? Thanks!
{code:java}
// code placeholder
public void testFileDecommission() throws Exception {
  LOG.info("Starting test testFileDecommission");
  final Path ecFile = new Path(ecDir, "testFileDecommission");
  int writeBytes = cellSize * dataBlocks;
  writeStripedFile(dfs, ecFile, writeBytes);
  Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
  FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
  LocatedBlocks locatedBlocks = StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
  LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0).get(0);
  DatanodeInfo[] dnLocs = lb.getLocations();
  LocatedStripedBlock lastBlock = (LocatedStripedBlock) locatedBlocks.getLastLocatedBlock();
  DatanodeInfo[] storageInfos = lastBlock.getLocations();
  //
  DatanodeDescriptor datanodeDescriptor = cluster.getNameNode().getNamesystem().getBlockManager()
      .getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid());
  for (int i = 0; i < 100; i++) {
    datanodeDescriptor.incrementPendingReplicationWithoutTargets();
  }
  assertEquals(dataBlocks + parityBlocks, dnLocs.length);
  int[] decommNodeIndex = { 3, 4 };
  final List decommisionNodes = new ArrayList();
  // add the node which will be decommissioning
  decommisionNodes.add(dnLocs[decommNodeIndex[0]]);
  decommisionNodes.add(dnLocs[decommNodeIndex[1]]);
  decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED);
  assertEquals(decommisionNodes.size(), fsn.getNumDecomLiveDataNodes());
  // assertNull(checkFile(dfs, ecFile, 9, decommisionNodes, numDNs));
  // Ensure decommissioned datanode is not automatically shutdown
  DFSClient client = getDfsClient(cluster.getNameNode(0), conf);
  assertEquals("All datanodes must be alive", numDNs,
      client.datanodeReport(DatanodeReportType.LIVE).length);
  FileChecksum fileChecksum2 = dfs.getFileChecksum(ecFile, writeBytes);
  Assert.assertTrue("Checksum mismatches!", fileChecksum1.equals(fileChecksum2));
  StripedFileTestUtil.checkData(dfs, ecFile, writeBytes, decommisionNodes, null, blockGroupSize);
}
{code}

was (Author: zhaoyim):
[~gjhkael] I tried the following code (which you said can reliably reproduce the problem), but I always find that the DN cannot be decommissioned successfully. The log is as follows. Is there anything I missed? Can you help? Thanks!

!image-2019-09-11-22-20-50-091.png!
{code:java}
// code placeholder
public void testFileDecommission() throws Exception {
  LOG.info("Starting test testFileDecommission");
  final Path ecFile = new Path(ecDir, "testFileDecommission");
  int writeBytes = cellSize * dataBlocks;
  writeStripedFile(dfs, ecFile, writeBytes);
  Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
  FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
  LocatedBlocks locatedBlocks = StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
  LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0).get(0);
  DatanodeInfo[] dnLocs = lb.getLocations();
  LocatedStripedBlock lastBlock = (LocatedStripedBlock) locatedBlocks.getLastLocatedBlock();
  DatanodeInfo[] storageInfos = lastBlock.getLocations();
  //
  DatanodeDescriptor datanodeDescriptor = cluster.getNameNode().getNamesystem().getBlockManager()
      .getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid());
  for (int i = 0; i < 100; i++) {
    datanodeDescriptor.incrementPendingReplicationWithoutTargets();
  }
  assertEquals(dataBlocks + parityBlocks, dnLocs.length);
  int[] decommNodeIndex = { 3, 4 };
[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927610#comment-16927610 ]

Zhao Yi Ming edited comment on HDFS-14768 at 9/11/19 2:21 PM:

[~gjhkael] I tried the following code (which you said can reliably reproduce the problem), but I always find that the DN cannot be decommissioned successfully. The log is as follows. Is there anything I missed? Can you help? Thanks!

!image-2019-09-11-22-20-50-091.png!
{code:java}
// code placeholder
public void testFileDecommission() throws Exception {
  LOG.info("Starting test testFileDecommission");
  final Path ecFile = new Path(ecDir, "testFileDecommission");
  int writeBytes = cellSize * dataBlocks;
  writeStripedFile(dfs, ecFile, writeBytes);
  Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
  FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
  LocatedBlocks locatedBlocks = StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
  LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0).get(0);
  DatanodeInfo[] dnLocs = lb.getLocations();
  LocatedStripedBlock lastBlock = (LocatedStripedBlock) locatedBlocks.getLastLocatedBlock();
  DatanodeInfo[] storageInfos = lastBlock.getLocations();
  //
  DatanodeDescriptor datanodeDescriptor = cluster.getNameNode().getNamesystem().getBlockManager()
      .getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid());
  for (int i = 0; i < 100; i++) {
    datanodeDescriptor.incrementPendingReplicationWithoutTargets();
  }
  assertEquals(dataBlocks + parityBlocks, dnLocs.length);
  int[] decommNodeIndex = { 3, 4 };
  final List decommisionNodes = new ArrayList();
  // add the node which will be decommissioning
  decommisionNodes.add(dnLocs[decommNodeIndex[0]]);
  decommisionNodes.add(dnLocs[decommNodeIndex[1]]);
  decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED);
  assertEquals(decommisionNodes.size(), fsn.getNumDecomLiveDataNodes());
  // assertNull(checkFile(dfs, ecFile, 9, decommisionNodes, numDNs));
  // Ensure decommissioned datanode is not automatically shutdown
  DFSClient client = getDfsClient(cluster.getNameNode(0), conf);
  assertEquals("All datanodes must be alive", numDNs,
      client.datanodeReport(DatanodeReportType.LIVE).length);
  FileChecksum fileChecksum2 = dfs.getFileChecksum(ecFile, writeBytes);
  Assert.assertTrue("Checksum mismatches!", fileChecksum1.equals(fileChecksum2));
  StripedFileTestUtil.checkData(dfs, ecFile, writeBytes, decommisionNodes, null, blockGroupSize);
}
{code}

was (Author: zhaoyim):
[~gjhkael] I tried the following code (which you said can reliably reproduce the problem), but I always find that the DN cannot be decommissioned successfully. The log is as follows. Is there anything I missed? Can you help? Thanks!

!image-2019-09-11-22-18-36-154.png!
{code:java}
// code placeholder
public void testFileDecommission() throws Exception {
  LOG.info("Starting test testFileDecommission");
  final Path ecFile = new Path(ecDir, "testFileDecommission");
  int writeBytes = cellSize * dataBlocks;
  writeStripedFile(dfs, ecFile, writeBytes);
  Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
  FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
  LocatedBlocks locatedBlocks = StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
  LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0).get(0);
  DatanodeInfo[] dnLocs = lb.getLocations();
  LocatedStripedBlock lastBlock = (LocatedStripedBlock) locatedBlocks.getLastLocatedBlock();
  DatanodeInfo[] storageInfos = lastBlock.getLocations();
  //
  DatanodeDescriptor datanodeDescriptor = cluster.getNameNode().getNamesystem().getBlockManager()
      .getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid());
  for (int i = 0; i < 100; i++) {
    datanodeDescriptor.incrementPendingReplicationWithoutTargets();
  }
  assertEquals(dataBlocks + parityBlocks, dnLocs.length);
  int[] decommNodeIndex = { 3, 4 };
  final List decommisionNodes = new ArrayList();
  // add the node whic
[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927253#comment-16927253 ]

Surendra Singh Lilhore edited comment on HDFS-14768 at 9/11/19 4:56 AM:

Thanks [~gjhkael].
{quote}you need run the code below, and you need check the block index 6 that recontruct on local path like
{quote}
The UT should check for the block corruption; we should not check it manually. If you are facing difficulties in writing the UT, I can help you.

Your fix idea LGTM, but some of the fix is already taken care of by HDFS-14699, so you need to wait for it and then rebase your patch.

was (Author: surendrasingh):
Thanks [~gjhkael].
{quote}you need run the code below, and you need check the block index 6 that recontruct on local path like
{quote}
UT should check the block corruption, we should not check it manually. If you are facing difficulties in writing UT, I can help you.

you fix idea LGTM, but you some fix is already taken care by HDFS-14699, so you need to wait for it and then need to rebase your patch.
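Checking for the corruption inside the UT, rather than by hand, could be sketched roughly as below. This is an assumption-laden illustration, not code from the patch: `BlockContentCheck`, `firstMismatch` and `assertBlockIntact` are hypothetical helpers, and the two byte arrays stand in for the expected content of an internal block and the bytes read back after reconstruction (how those bytes are obtained from the mini cluster is left out).

```java
final class BlockContentCheck {
  /** Returns the offset of the first differing byte, or -1 if both arrays are identical. */
  static int firstMismatch(byte[] expected, byte[] actual) {
    int common = Math.min(expected.length, actual.length);
    for (int i = 0; i < common; i++) {
      if (expected[i] != actual[i]) {
        return i;
      }
    }
    // Same prefix: either fully equal, or one array is longer than the other.
    return expected.length == actual.length ? -1 : common;
  }

  /** Fails with a descriptive error if the reconstructed block differs from the expected one. */
  static void assertBlockIntact(int blockIndex, byte[] expected, byte[] actual) {
    int at = firstMismatch(expected, actual);
    if (at >= 0) {
      throw new AssertionError("Block index " + blockIndex
          + " differs from the expected content at offset " + at);
    }
  }

  public static void main(String[] args) {
    byte[] expected = {7, 7, 7, 7};
    byte[] zeroed = new byte[4]; // what an all-zero reconstructed block would look like

    assertBlockIntact(6, expected, expected.clone()); // passes silently
    try {
      assertBlockIntact(6, expected, zeroed);
    } catch (AssertionError e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With a check of this kind the test fails on its own when block index 6 comes back zeroed, so nobody has to inspect the block files on disk.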
[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925298#comment-16925298 ]

guojh edited comment on HDFS-14768 at 9/9/19 2:16 AM:

[~surendrasingh] Thanks for your reply. You need to run the code below, and then check the reconstructed block index 6 on a local path like './hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/*/current/BP-xxx/current/finalized/*/*/'
{code:java}
// code placeholder
public void testFileDecommission() throws Exception {
  LOG.info("Starting test testFileDecommission");
  final Path ecFile = new Path(ecDir, "testFileDecommission");
  int writeBytes = cellSize * dataBlocks;
  writeStripedFile(dfs, ecFile, writeBytes);
  Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
  FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
  LocatedBlocks locatedBlocks =
      StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
  LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
      .get(0);
  DatanodeInfo[] dnLocs = lb.getLocations();
  LocatedStripedBlock lastBlock =
      (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock();
  DatanodeInfo[] storageInfos = lastBlock.getLocations();
  DatanodeDescriptor datanodeDescriptor =
      cluster.getNameNode().getNamesystem()
          .getBlockManager().getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid());
  for (int i = 0; i < 100; i++) {
    datanodeDescriptor.incrementPendingReplicationWithoutTargets();
  }
  assertEquals(dataBlocks + parityBlocks, dnLocs.length);
  int[] decommNodeIndex = {3, 4};
  final List decommisionNodes = new ArrayList
[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925298#comment-16925298 ]

guojh edited comment on HDFS-14768 at 9/9/19 2:14 AM:

[~surendrasingh] Thanks for your reply. You need to run the code below, and then check the reconstructed block index 6 on a local path like './hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/*/current/BP-xxx/current/finalized/*/*/'
{code:java}
// code placeholder
public void testFileDecommission() throws Exception {
  LOG.info("Starting test testFileDecommission");
  final Path ecFile = new Path(ecDir, "testFileDecommission");
  int writeBytes = cellSize * dataBlocks;
  writeStripedFile(dfs, ecFile, writeBytes);
  Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
  FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
  LocatedBlocks locatedBlocks =
      StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
  LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
      .get(0);
  DatanodeInfo[] dnLocs = lb.getLocations();
  LocatedStripedBlock lastBlock =
      (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock();
  DatanodeInfo[] storageInfos = lastBlock.getLocations();
  DatanodeDescriptor datanodeDescriptor =
      cluster.getNameNode().getNamesystem()
          .getBlockManager().getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid());
  for (int i = 0; i < 100; i++) {
    datanodeDescriptor.incrementPendingReplicationWithoutTargets();
  }
  assertEquals(dataBlocks + parityBlocks, dnLocs.length);
  int[] decommNodeIndex = {3, 4};
  final List decommisionNodes = new ArrayList
[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925298#comment-16925298 ]

guojh edited comment on HDFS-14768 at 9/9/19 2:12 AM:

[~surendrasingh] Thanks for your reply. You need to run the code below, and then check the reconstructed block index 6 on a local path like './hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/*/current/BP-xxx/current/finalized/*/*/'
{code:java}
// code placeholder
public void testFileDecommission() throws Exception {
  LOG.info("Starting test testFileDecommission");
  final Path ecFile = new Path(ecDir, "testFileDecommission");
  int writeBytes = cellSize * dataBlocks;
  writeStripedFile(dfs, ecFile, writeBytes);
  Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
  FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
  LocatedBlocks locatedBlocks =
      StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
  LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
      .get(0);
  DatanodeInfo[] dnLocs = lb.getLocations();
  LocatedStripedBlock lastBlock =
      (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock();
  DatanodeInfo[] storageInfos = lastBlock.getLocations();
  DatanodeDescriptor datanodeDescriptor =
      cluster.getNameNode().getNamesystem()
          .getBlockManager().getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid());
  for (int i = 0; i < 100; i++) {
    datanodeDescriptor.incrementPendingReplicationWithoutTargets();
  }
  assertEquals(dataBlocks + parityBlocks, dnLocs.length);
  int[] decommNodeIndex = {3, 4};
  final List decommisionNodes = new ArrayList();
  // add the node which will be decommissioning
  decommisionNodes.add(dnLocs[decommNodeIndex[0]]);
  decommisionNodes.add(dnLocs[decommNodeIndex[1]]);
  decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED);
  assertEquals(decommisionNodes.size(), fsn.getNumDecomLiveDataNodes());
  //assertNull(checkFile(dfs, ecFile, 9, decommisionNodes, numDNs));
  // Ensure decommissioned datanode is not automatically shutdown
  DFSClient client = getDfsClient(cluster.getNameNode(0), conf);
  assertEquals("All datanodes must be alive", numDNs,
      client.datanodeReport(DatanodeReportType.LIVE).length);
  FileChecksum fileChecksum2 = dfs.getFileChecksum(ecFile, writeBytes);
  Assert.assertTrue("Checksum mismatches!",
      fileChecksum1.equals(fileChecksum2));
  StripedFileTestUtil.checkData(dfs, ecFile, writeBytes, decommisionNodes,
      null, blockGroupSize);
}
{code}

was (Author: gjhkael):
[~surendrasingh] Thanks for your reply. You need to run the code below, and then check the reconstructed block index 6 on a local path like './hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/*/current/BP-xxx/current/finalized/*/*/'
{code:java}
// code placeholder
public void testFileDecommission() throws Exception {
  LOG.info("Starting test testFileDecommission");
  final Path ecFile = new Path(ecDir, "testFileDecommission");
  int writeBytes = cellSize * dataBlocks;
  writeStripedFile(dfs, ecFile, writeBytes);
  Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
  FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
  LocatedBlocks locatedBlocks =
      StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
  LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
      .get(0);
  DatanodeInfo[] dnLocs = lb.getLocations();
  LocatedStripedBlock lastBlock =
      (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock();
  DatanodeInfo[] storageInfos = lastBlock.getLocations();
  //
  DatanodeDescriptor datanodeDescriptor =
      cluster.getNameNode().getNamesystem()
          .getBlockManager().getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid());
  for (int i = 0; i < 100; i++) {
    datanodeDescriptor.incrementPendingReplicationWithoutTargets();
  }
  assertEquals(dataBlocks + parityBlocks, dnLocs.length);
  int[] decommNodeIndex = {3, 4};
  final List decommisionNodes = new ArrayList();
  // add the node which will be decommissioning
  decommisionNodes.add(dnLocs[decommNodeIndex[0]]);
  decommisionNodes.add(dnLocs[decommNodeIndex[1]]);
  decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED);
  assertEquals(decommisionNodes.size(), fsn.getNumDecomLiveDataNodes());
  //assertNull(checkFile(dfs, ecFile, 9, decommisionNodes, numDNs));
  // Ensure decommissioned datanode is not automatically shutdown
  DFSClient client = getDfsClient(cluster.getNameNode(0), conf);
  assertEquals("All datanodes must be alive", numDNs,
      client.datanodeReport(DatanodeReportType.LIVE).length);
  FileChecksum fileChecksum2 = dfs.getFileChecksum(ecFile, writeBytes);
  Assert.assertTrue("Checksum mismatches!",
      fileChecksum1.equals(fileChecksum2));
  StripedFileTestUtil.checkData(dfs, ecFile, writeBytes, decommisionNodes,
      null, blockGroupSize);
}
{code}
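The manual check of the finalized directory described in this comment could be automated with something like the sketch below. It is an illustration under assumptions, not part of the patch: `ZeroBlockScanner` is a hypothetical helper, it assumes block data files are named with the `blk_` prefix and metadata files end in `.meta`, and it assumes the test writes non-zero data, so that a non-empty block file consisting only of zero bytes is suspicious.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ZeroBlockScanner {
  /** Returns block data files under root that are non-empty and contain only zero bytes. */
  static List<Path> findAllZeroBlocks(Path root) {
    try (Stream<Path> walk = Files.walk(root)) {
      List<Path> candidates = walk
          .filter(Files::isRegularFile)
          .filter(p -> {
            String name = p.getFileName().toString();
            // Skip the checksum metadata files that sit next to each block file.
            return name.startsWith("blk_") && !name.endsWith(".meta");
          })
          .sorted()
          .collect(Collectors.toList());
      List<Path> zeroed = new ArrayList<>();
      for (Path p : candidates) {
        // Reading whole files is fine for the small blocks a unit test writes.
        byte[] data = Files.readAllBytes(p);
        if (data.length > 0 && isAllZero(data)) {
          zeroed.add(p);
        }
      }
      return zeroed;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static boolean isAllZero(byte[] data) {
    for (byte b : data) {
      if (b != 0) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Pass the datanode data directory of the test run as the first argument.
    Path root = Paths.get(args.length > 0 ? args[0] : ".");
    for (Path p : findAllZeroBlocks(root)) {
      System.out.println("all-zero block file: " + p);
    }
  }
}
```

Pointing it at the test data directory after the run lists any block file that was reconstructed as all zeros.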
[jira] [Comment Edited] (HDFS-14768) In some cases, erasure blocks are corruption when they are reconstruct.
[ https://issues.apache.org/jira/browse/HDFS-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925298#comment-16925298 ]

guojh edited comment on HDFS-14768 at 9/9/19 2:12 AM:

[~surendrasingh] Thanks for your reply. You need to run the code below, and then check the reconstructed block index 6 on a local path like './hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/*/current/BP-xxx/current/finalized/*/*/'
{code:java}
// code placeholder
{code}

was (Author: gjhkael):
[~surendrasingh] Thanks for your reply. You need to run the code below, and then check the reconstructed block index 6 on a local path like './hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/*/current/BP-xxx/current/finalized/*/*/'
{code:java}
// code placeholder
public void testFileDecommission() throws Exception {
  LOG.info("Starting test testFileDecommission");
  final Path ecFile = new Path(ecDir, "testFileDecommission");
  int writeBytes = cellSize * dataBlocks;
  writeStripedFile(dfs, ecFile, writeBytes);
  Assert.assertEquals(0, bm.numOfUnderReplicatedBlocks());
  FileChecksum fileChecksum1 = dfs.getFileChecksum(ecFile, writeBytes);
  LocatedBlocks locatedBlocks =
      StripedFileTestUtil.getLocatedBlocks(ecFile, dfs);
  LocatedBlock lb = dfs.getClient().getLocatedBlocks(ecFile.toString(), 0)
      .get(0);
  DatanodeInfo[] dnLocs = lb.getLocations();
  LocatedStripedBlock lastBlock =
      (LocatedStripedBlock)locatedBlocks.getLastLocatedBlock();
  DatanodeInfo[] storageInfos = lastBlock.getLocations();
  DatanodeDescriptor datanodeDescriptor =
      cluster.getNameNode().getNamesystem()
          .getBlockManager().getDatanodeManager().getDatanode(storageInfos[6].getDatanodeUuid());
  for (int i = 0; i < 100; i++) {
    datanodeDescriptor.incrementPendingReplicationWithoutTargets();
  }
  assertEquals(dataBlocks + parityBlocks, dnLocs.length);
  int[] decommNodeIndex = {3, 4};
  final List decommisionNodes = new ArrayList();
  // add the node which will be decommissioning
  decommisionNodes.add(dnLocs[decommNodeIndex[0]]);
  decommisionNodes.add(dnLocs[decommNodeIndex[1]]);
  decommissionNode(0, decommisionNodes, AdminStates.DECOMMISSIONED);
  assertEquals(decommisionNodes.size(), fsn.getNumDecomLiveDataNodes());
  //assertNull(checkFile(dfs, ecFile, 9, decommisionNodes, numDNs));
  // Ensure decommissioned datanode is not automatically shutdown
  DFSClient client = getDfsClient(cluster.getNameNode(0), conf);
  assertEquals("All datanodes must be alive", numDNs,
      client.datanodeReport(DatanodeReportType.LIVE).length);
  FileChecksum fileChecksum2 = dfs.getFileChecksum(ecFile, writeBytes);
  Assert.assertTrue("Checksum mismatches!",
      fileChecksum1.equals(fileChecksum2));
  StripedFileTestUtil.checkData(dfs, ecFile, writeBytes, decommisionNodes,
      null, blockGroupSize);
}
{code}