[ 
https://issues.apache.org/jira/browse/HDFS-17198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766239#comment-17766239
 ] 

ASF GitHub Bot commented on HDFS-17198:
---------------------------------------

KeeProMise opened a new pull request, #6096:
URL: https://github.com/apache/hadoop/pull/6096

   
   ### Description of PR
   In the original implementation, when the routers report the NN status at 
different times, the NN status is the one reported by the majority of routers. 
For example:
   router1 -> nn0:active dateModified:1
   
   router2 -> nn0:active dateModified:2
   
   router3 -> nn0:active dateModified:3
   
   router0 -> nn0:standby dateModified:4
   
   The status of nn0 is then active, because the majority of routers report 
that nn0 is active.
   
   If the majority of routers report the NN status at the same time, for example:
   (record1) router1 -> nn0:active dateModified:1
   
   (record2) router2 -> nn0:active dateModified:1
   
   (record3) router3 -> nn0:active dateModified:1
   
   (record4) router0 -> nn0:standby dateModified:2
   
   Then the state of nn0 is standby, but we expect it to be active.
   
   The bug occurs because these records are put into a TreeSet in the method 
getRepresentativeQuorum. Since record1, record2, and record3 have the same 
dateModified, only one of them remains in the final TreeSet, so the method 
concludes that nn0 is standby because record4 is newer.
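
The collapse described above can be reproduced outside Hadoop with a plain `java.util.TreeSet`. The following is only a minimal sketch: the `Report` class, the comparator names, and the tie-breaker on the router name are hypothetical and are not taken from the router code or from this patch. It assumes the ordering compares `dateModified` only, as the description implies, and shows that a `TreeSet` then treats records with equal `dateModified` as duplicates.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class QuorumTreeSetSketch {

  /** Hypothetical stand-in for a namenode registration record. */
  static final class Report {
    final String router;
    final String state;
    final long dateModified;

    Report(String router, String state, long dateModified) {
      this.router = router;
      this.state = state;
      this.dateModified = dateModified;
    }
  }

  /** Orders by dateModified only (assumed to mirror the buggy ordering). */
  static final Comparator<Report> BY_DATE =
      Comparator.comparingLong((Report r) -> r.dateModified);

  /** Adds a tie-breaker so equal timestamps no longer compare as equal. */
  static final Comparator<Report> BY_DATE_THEN_ROUTER =
      BY_DATE.thenComparing((Report r) -> r.router);

  /** The four records from the failing example. */
  static List<Report> sampleReports() {
    List<Report> reports = new ArrayList<>();
    reports.add(new Report("router1", "active", 1L));
    reports.add(new Report("router2", "active", 1L));
    reports.add(new Report("router3", "active", 1L));
    reports.add(new Report("router0", "standby", 2L));
    return reports;
  }

  /** Number of records a TreeSet keeps under the given ordering. */
  static int retained(List<Report> reports, Comparator<Report> order) {
    TreeSet<Report> set = new TreeSet<>(order);
    set.addAll(reports); // elements that compare as 0 are silently dropped
    return set.size();
  }

  public static void main(String[] args) {
    List<Report> reports = sampleReports();
    System.out.println("by date only:     " + retained(reports, BY_DATE));
    System.out.println("with tie-breaker: " + retained(reports, BY_DATE_THEN_ROUTER));
  }
}
```

With the date-only ordering, two of the three active records are discarded before any majority can be counted. A tie-breaker is just one way to keep all four records; the approach actually taken by the patch may differ.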
   
   see: https://issues.apache.org/jira/browse/HDFS-17198
   
   ### How was this patch tested?
   Tested with the unit test testRegistrationMajorityQuorumEqDateModified.
   
   ### For code changes:
   
   - [x] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




> RBF: fix bug of getRepresentativeQuorum when records have same dateModified
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-17198
>                 URL: https://issues.apache.org/jira/browse/HDFS-17198
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Jian Zhang
>            Assignee: Jian Zhang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HDFS-17198.v001.patch
>
>
> h2. *Bug description*
> In the original implementation, when the routers report the NN status at 
> different times, the NN status is the one reported by the majority of 
> routers. For example:
> router1 -> nn0:active dateModified:1
> router2 -> nn0:active dateModified:2
> router3 -> nn0:active dateModified:3
> router0 -> nn0:standby dateModified:4
> The status of nn0 is then active, because the majority of routers report 
> that nn0 is active.
> If the majority of routers report the NN status at the same time, for example:
> (record1) router1 -> nn0:active dateModified:1
> (record2) router2 -> nn0:active dateModified:1
> (record3) router3 -> nn0:active dateModified:1
> (record4) router0 -> nn0:standby dateModified:2
> Then the state of nn0 is standby, but we expect it to be active.
> The bug occurs because these records are put into a TreeSet in the method 
> getRepresentativeQuorum. Since record1, record2, and record3 have the same 
> dateModified, only one of them remains in the final TreeSet, so the method 
> concludes that nn0 is standby because record4 is newer.
> h2. *How to reproduce*
> Run the unit test testRegistrationMajorityQuorumEqDateModified against the 
> original code.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
