[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927311#comment-15927311 ]
Andrew Wang commented on HDFS-10530:
------------------------------------

Hi Manoj, thanks for working on this. Hoping you can help me understand the current behavior; a few review comments too.

In the unit test, it looks like we start a cluster with 6 racks, write a 6+3 file, add 3 more hosts, then wait for parity blocks to be written to these 3 new nodes.
* Are these necessarily the parity blocks, or could they be any of the blocks that are co-located on the first 6 racks?
* Also, does this happen via EC reconstruction, or do we simply copy the blocks over to the new racks? If it's the first, we should file a follow-on to do the second when possible.
* Is the BPP violated before entering the waitFor? If so, we should assert that. This may require pausing reconstruction work and resuming later.

A few other notes:
* "block recovery" refers specifically to the process of recovering from a DN failure while writing. I'm guessing this is actually more like "replication" or, since it's EC, "reconstruction" work?
* Do you think TestBPPRackFaultTolerant needs any additional unit tests along these lines?

{code}
cluster.startDataNodes(conf, 3, true, null, null,
    new String[]{"host3", "host4", "host5"}, null);
{code}

Looks like these have the same names as the initial DNs, as Takanobu noted. Might be nice to specify the racks too, to be explicit.

{code}
lb = DFSTestUtil.getAllBlocks(dfs, testFileUnsatisfied).get(0);
blockInfo = bm.getStoredBlock(lb.getBlock().getLocalBlock());
dumpStoragesFor(blockInfo);

// But, there will not be any block placement changes yet
assertFalse("Block group of testFileUnsatisfied should not be placement"
    + " policy satisfied", bm.isPlacementPolicySatisfied(blockInfo));
{code}

If we later enhance the NN to automatically fix up misplaced EC blocks, this assert will be flaky. Maybe add a comment? Or we could pause reconstruction work, add the DNs, and resume after the assert.

> BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
> -------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10530
>                 URL: https://issues.apache.org/jira/browse/HDFS-10530
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Rui Gao
>            Assignee: Manoj Govindassamy
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-10530.1.patch, HDFS-10530.2.patch, HDFS-10530.3.patch, HDFS-10530.4.patch
>
>
> This issue was found by [~tfukudom].
> Under the RS-DEFAULT-6-3-64k EC policy:
> 1. Create an EC file; the file was written to all 5 racks (2 DNs each) of the cluster.
> 2. Reconstruction work would be scheduled when a 6th rack is added.
> 3. However, adding a 7th or more racks does not trigger any reconstruction work.
> Based on the default EC block placement policy defined in BlockPlacementPolicyRackFaultTolerant.java, the EC file should be able to be distributed across 9 racks if possible.
> In *BlockManager#isPlacementPolicySatisfied(BlockInfo storedBlock)*, *numReplicas* of striped blocks should probably be *getRealTotalBlockNum()* instead of *getRealDataBlockNum()*.
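
To make the counting issue in that last point concrete, here is a small standalone sketch. It is not the Hadoop code itself: the isRackPlacementSatisfied helper is hypothetical and only mimics the rack-fault-tolerant check, on the assumption that placement counts as satisfied once the block group spans at least min(expected block count, total racks) distinct racks. It shows how a 6+3 block group confined to 6 racks passes the check when only the 6 data blocks are counted, but is correctly flagged as unsatisfied when all 9 blocks are counted.

{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class EcPlacementCheckSketch {

  // Hypothetical stand-in for the rack-fault-tolerant check: placement is
  // treated as satisfied once the block group spans at least
  // min(expectedBlockNum, totalRacksInCluster) distinct racks.
  static boolean isRackPlacementSatisfied(List<String> blockRacks,
      int expectedBlockNum, int totalRacksInCluster) {
    Set<String> distinctRacks = new HashSet<>(blockRacks);
    return distinctRacks.size() >= Math.min(expectedBlockNum, totalRacksInCluster);
  }

  public static void main(String[] args) {
    // A 6+3 block group written while only 6 racks existed: the 9 internal
    // blocks span just 6 distinct racks, with 3 racks holding two blocks each.
    List<String> blockRacks = Arrays.asList(
        "/r1", "/r2", "/r3", "/r4", "/r5", "/r6", "/r1", "/r2", "/r3");
    int totalRacksInCluster = 9; // three more racks were added later

    // Counting only the data blocks (getRealDataBlockNum() == 6), the check
    // passes, so no reconstruction work gets scheduled.
    System.out.println("with data block count (6):  "
        + isRackPlacementSatisfied(blockRacks, 6, totalRacksInCluster)); // true

    // Counting data + parity blocks (getRealTotalBlockNum() == 9), the check
    // correctly reports the placement as unsatisfied.
    System.out.println("with total block count (9): "
        + isRackPlacementSatisfied(blockRacks, 9, totalRacksInCluster)); // false
  }
}
{code}

Under those assumptions, the data-block count stops scheduling reconstruction once 6 racks are covered, which matches the behavior reported in steps 2-3 of the description above.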