Jian Zhang created HDFS-17198:
---------------------------------

Summary: RBF: fix bug of getRepresentativeQuorum
Key: HDFS-17198
URL: https://issues.apache.org/jira/browse/HDFS-17198
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Jian Zhang

h2. *Bug description*

In the original implementation, when the routers report the status of a namenode (nn) at different times, the resulting nn status is the one reported by the majority of routers. For example:

router1 -> nn0:active dateModified:1
router2 -> nn0:active dateModified:2
router3 -> nn0:active dateModified:3
router0 -> nn0:standby dateModified:4

Here the status of nn0 is active, because the majority of routers report that nn0 is active.

However, if the majority of routers report the nn status at the same time, for example:

(record1) router1 -> nn0:active dateModified:1
(record2) router2 -> nn0:active dateModified:1
(record3) router3 -> nn0:active dateModified:1
(record4) router0 -> nn0:standby dateModified:2

then the status of nn0 is standby, although we expect it to be active.

The bug occurs because the records above are put into a TreeSet in the method getRepresentativeQuorum. Since record1, record2, and record3 have the same dateModified, only one of them remains in the final TreeSet, so the method concludes that nn0 is standby because record4 is newer.

h2. *How to reproduce*

Run my unit test testRegistrationMajorityQuorumEqDateModified against the original code.
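
The TreeSet behavior described in the bug description can be illustrated with a small standalone Java sketch. It is not the Hadoop code: the Report class, the two comparators, and the helper names are hypothetical, and ordering by dateModified alone is an assumption about how the records are compared. The sketch only shows that a TreeSet treats elements that compare as equal as duplicates, so three reports with the same dateModified collapse into one. The tie-breaker on the router name is one possible way to keep them apart, not necessarily what the patch does.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class QuorumTreeSetDemo {

  /** Hypothetical stand-in for one router's report about a namenode. */
  static final class Report {
    final String router;
    final String state;
    final long dateModified;

    Report(String router, String state, long dateModified) {
      this.router = router;
      this.state = state;
      this.dateModified = dateModified;
    }
  }

  /** Assumed ordering: by dateModified only, so equal timestamps compare as equal. */
  static final Comparator<Report> BY_DATE =
      Comparator.comparingLong((Report r) -> r.dateModified);

  /** Illustrative alternative: break timestamp ties with the router name. */
  static final Comparator<Report> BY_DATE_THEN_ROUTER =
      BY_DATE.thenComparing((Report r) -> r.router);

  /** record1..record3 from the description: same state, same dateModified. */
  static List<Report> activeReports() {
    return Arrays.asList(
        new Report("router1", "active", 1L),
        new Report("router2", "active", 1L),
        new Report("router3", "active", 1L));
  }

  /** Returns how many reports survive in a TreeSet using the given ordering. */
  static int retained(List<Report> reports, Comparator<Report> order) {
    TreeSet<Report> set = new TreeSet<>(order);
    set.addAll(reports);
    return set.size();
  }

  public static void main(String[] args) {
    // Date-only ordering: the three active reports collapse into one entry.
    System.out.println(retained(activeReports(), BY_DATE));             // prints 1
    // With a tie-breaker, all three reports are kept.
    System.out.println(retained(activeReports(), BY_DATE_THEN_ROUTER)); // prints 3
  }
}
```

With the date-only ordering, the three active reports count as a single entry, which matches the situation where the single, newer standby record wins.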