Danny Becker created HDFS-17477:
-----------------------------------

             Summary: IncrementalBlockReport race condition additional edge cases
                 Key: HDFS-17477
                 URL: https://issues.apache.org/jira/browse/HDFS-17477
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: auto-failover, ha, namenode
    Affects Versions: 3.3.6, 3.3.4, 3.3.5
            Reporter: Danny Becker

HDFS-17453 fixes a race condition between IncrementalBlockReports (IBRs) and the Edit Log Tailer that can cause the Standby NameNode (SNN) to incorrectly mark blocks as corrupt when it transitions to Active. There are a few edge cases that HDFS-17453 does not cover. For example (b1gs1 denotes block b1 with generation stamp 1):

 1. SNN1 loads the edits for b1gs1 and b1gs2.
 2. DN1 reports b1gs1 to SNN1, so it gets queued for later processing.
 3. DN1 reports b1gs2 to SNN1, so it gets added to the blocks map.
 4. SNN1 transitions to Active (ANN1).
 5. ANN1 processes the pending DN message queue and marks DN1->b1gs1 as corrupt because it was still in the queue, even though DN1 has already reported the newer b1gs2.
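
The sequence above can be replayed with a small toy model. This is only a sketch under stated assumptions: the class IbrRaceSketch, its methods, and the "DN1/b1" key format are all hypothetical and are not taken from the NameNode's BlockManager. The model assumes that a standby queues any report whose generation stamp differs from the latest one loaded from the edit log, and that the queue is drained on transition to Active without checking what the same DataNode has reported since.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

/** Toy model of the scenario; hypothetical names, not the real HDFS classes. */
public class IbrRaceSketch {
    /** One queued datanode report. */
    static final class Report {
        final String dn;
        final String block;
        final long genStamp;
        Report(String dn, String block, long genStamp) {
            this.dn = dn;
            this.block = block;
            this.genStamp = genStamp;
        }
    }

    // Latest generation stamp per block, as loaded from the edit log.
    private final Map<String, Long> latestGenStamp = new HashMap<>();
    // Replicas accepted into the (toy) blocks map: "dn/block" -> generation stamp.
    private final Map<String, Long> storedReplicas = new HashMap<>();
    // Reports the standby could not apply immediately.
    private final Queue<Report> pending = new ArrayDeque<>();
    // Replicas marked corrupt after the transition to Active.
    private final Set<String> corrupt = new HashSet<>();

    void loadEdit(String block, long genStamp) {
        latestGenStamp.merge(block, genStamp, Long::max);
    }

    /** Assumption: a standby queues mismatching reports instead of judging them. */
    void receiveIbr(String dn, String block, long genStamp) {
        long known = latestGenStamp.getOrDefault(block, 0L);
        if (genStamp == known) {
            storedReplicas.put(dn + "/" + block, genStamp);
        } else {
            pending.add(new Report(dn, block, genStamp));
        }
    }

    /** Drains the queue naively, ignoring newer reports from the same datanode. */
    Set<String> transitionToActive() {
        Report r;
        while ((r = pending.poll()) != null) {
            long known = latestGenStamp.getOrDefault(r.block, 0L);
            if (r.genStamp != known) {
                corrupt.add(r.dn + "/" + r.block);
            }
        }
        return corrupt;
    }

    /** Replays steps 1 to 5 from the description. */
    static Set<String> runScenario() {
        IbrRaceSketch nn = new IbrRaceSketch();
        nn.loadEdit("b1", 1);          // step 1: edits for b1gs1 ...
        nn.loadEdit("b1", 2);          //         ... and b1gs2
        nn.receiveIbr("DN1", "b1", 1); // step 2: stale report is queued
        nn.receiveIbr("DN1", "b1", 2); // step 3: current report is stored
        return nn.transitionToActive(); // steps 4 and 5
    }

    public static void main(String[] args) {
        // prints: corrupt replicas after failover: [DN1/b1]
        System.out.println("corrupt replicas after failover: " + runScenario());
    }
}
```

In this model the replica on DN1 ends up marked corrupt although the toy blocks map already holds b1gs2 for it, which mirrors step 5. The sketch only illustrates the ordering problem and does not suggest how the actual patch should resolve it.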