kuper created HDFS-17798:
----------------------------
             Summary: Corrupt replicas in a mini cluster cannot be
automatically re-replicated
Key: HDFS-17798
URL: https://issues.apache.org/jira/browse/HDFS-17798
Project: Hadoop HDFS
Issue Type: Bug
Components: block placement
Affects Versions: 3.3.6
Reporter: kuper
Assignee: kuper
* In a 3-datanode cluster holding a 3-replica block, if one replica on a node
becomes corrupted on disk (and the corruption did not occur during the write
process), the result is:
** The corrupted replica cannot be removed from the damaged node.
** Because a replica is missing, replication reconstruction tasks will
continuously attempt to re-replicate the block.
** However, during reconstruction, nodes that already host a replica of this
block are excluded, which means all 3 datanodes are excluded.
** No suitable target node can therefore be selected for replication,
eventually creating a {*}vicious cycle{*}.
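The exclusion behavior described above can be sketched with a toy simulation. This is not HDFS code; the class and method names are illustrative, and the target-selection rule is reduced to the single exclusion described in this report:

{code:java}
import java.util.*;

public class ExclusionCycleSketch {

  // Hypothetical stand-in for reconstruction target selection: any node
  // already hosting a replica of the block (even a corrupted one) is excluded.
  public static List<String> chooseTargets(Set<String> allNodes,
                                           Set<String> nodesWithReplica) {
    List<String> targets = new ArrayList<>();
    for (String node : allNodes) {
      if (!nodesWithReplica.contains(node)) {
        targets.add(node);
      }
    }
    return targets;
  }

  public static void main(String[] args) {
    Set<String> cluster = new HashSet<>(Arrays.asList("dn0", "dn1", "dn2"));
    // dn0 holds the corrupted replica; dn1 and dn2 hold healthy replicas.
    // All three nodes still count as replica holders, so all are excluded.
    Set<String> holders = new HashSet<>(cluster);
    List<String> targets = chooseTargets(cluster, holders);
    System.out.println("candidate targets: " + targets.size()); // prints "candidate targets: 0"
  }
}{code}
With zero candidate targets the reconstruction task can never place a new replica, and the corrupted replica on dn0 is never replaced, which is the cycle described above.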
*Reproduction*
* Execute the following steps, in order, in a 3-datanode cluster:
** Find a healthy block with three replicas and corrupt the replica file on one
of its datanodes.
** After the datanode's directory scan cycle runs, the damaged replica is
detected but the block cannot be reconstructed from the other replicas.
* Alternatively, add the following test,
testMiniClusterCannotReconstructionWhileReplicaAnomaly, to TestBlockManager:
{code:java}
@Test(timeout = 60000)
public void testMiniClusterCannotReconstructionWhileReplicaAnomaly()
    throws IOException, InterruptedException, TimeoutException {
  Configuration conf = new HdfsConfiguration();
  conf.setInt("dfs.datanode.directoryscan.interval",
      DN_DIRECTORYSCAN_INTERVAL);
  conf.setInt("dfs.namenode.replication.interval", 1);
  conf.setInt("dfs.heartbeat.interval", 1);
  String src = "/test-reconstruction";
  Path file = new Path(src);
  MiniDFSCluster cluster =
      new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
  try {
    cluster.waitActive();
    FSNamesystem fsn = cluster.getNamesystem();
    BlockManager bm = fsn.getBlockManager();
    FSDataOutputStream out = null;
    FileSystem fs = cluster.getFileSystem();
    try {
      out = fs.create(file);
      for (int i = 0; i < 1024; i++) {
        out.write(i);
      }
      out.hflush();
    } finally {
      IOUtils.closeStream(out);
    }
    FSDataInputStream in = null;
    ExtendedBlock oldBlock = null;
    try {
      in = fs.open(file);
      oldBlock = DFSTestUtil.getAllBlocks(in).get(0).getBlock();
    } finally {
      IOUtils.closeStream(in);
    }
    // Truncate both the block file and its meta file on the first datanode
    // to simulate on-disk corruption that happened after the write.
    DataNode dn = cluster.getDataNodes().get(0);
    String blockPath =
        dn.getFSDataset().getBlockLocalPathInfo(oldBlock).getBlockPath();
    String metaBlockPath =
        dn.getFSDataset().getBlockLocalPathInfo(oldBlock).getMetaPath();
    Files.write(Paths.get(blockPath), Collections.emptyList());
    Files.write(Paths.get(metaBlockPath), Collections.emptyList());
    cluster.restartDataNode(0, true);
    cluster.waitDatanodeConnectedToActive(dn, 60000);
    while (!dn.isDatanodeFullyStarted()) {
      Thread.sleep(1000);
    }
    // Wait one directory scan cycle so the scanner notices the corruption.
    Thread.sleep(DN_DIRECTORYSCAN_INTERVAL * 1000);
    cluster.triggerBlockReports();
    BlockInfo bi = bm.getStoredBlock(oldBlock.getLocalBlock());
    assertTrue(bm.isNeededReconstruction(bi,
        bm.countNodes(bi, cluster.getNamesystem().isInStartupSafeMode())));
    BlockReconstructionWork reconstructionWork = null;
    fsn.readLock();
    try {
      reconstructionWork = bm.scheduleReconstruction(bi, 3);
    } finally {
      fsn.readUnlock();
    }
    assertNotNull(reconstructionWork);
    // All 3 datanodes already hold a replica of the block, so all of them
    // end up in containingNodes and are excluded as targets.
    assertEquals(3, reconstructionWork.getContainingNodes().size());
  } finally {
    if (cluster != null) {
      cluster.shutdown();
    }
  }
} {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)