[jira] [Commented] (SOLR-6511) Fencepost error in LeaderInitiatedRecoveryThread

Timothy Potter (JIRA) Fri, 12 Sep 2014 12:20:48 -0700

    [ https://issues.apache.org/jira/browse/SOLR-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131972#comment-14131972 ]


Timothy Potter commented on SOLR-6511:
--------------------------------------

bq. what ought to happen here is that replica2 sends a message back saying "no 
need, I'm the leader, I'll take it from here, thanks". But because of the 
fencepost error, the message to replica2 is never actually sent, and replica1 
then writes replica2's state as DOWN into the LIRT zk node

The more I think about this, the less I see how the fencepost error gets hit 
here. maxTries will be 120 if replica1 is setting replica2 to down.
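
To make the loop bound concrete, here is a minimal Java sketch (not the actual 
LeaderInitiatedRecoveryThread code) that counts how many times the loop body 
runs under each condition. It assumes tries starts at 0, which the issue text 
does not state; the class and method names are hypothetical.

```java
class FencepostDemo {
    // Counts loop-body executions for the strict (<) or inclusive (<=) bound.
    // Assumption: tries starts at 0, as in a typical retry counter.
    static int iterations(int maxTries, boolean inclusive) {
        int tries = 0;
        int runs = 0;
        boolean continueTrying = true;
        while (continueTrying
                && (inclusive ? ++tries <= maxTries : ++tries < maxTries)) {
            runs++; // stands in for "send the recovery request"
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println("maxTries=1,   <  : " + iterations(1, false));
        System.out.println("maxTries=1,   <= : " + iterations(1, true));
        System.out.println("maxTries=120, <  : " + iterations(120, false));
        System.out.println("maxTries=120, <= : " + iterations(120, true));
    }
}
```

Under that assumption, maxTries = 1 gives zero iterations with the strict 
comparison, while maxTries = 120 still gives 119, so the request would be 
attempted in the 120 case.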

So I think the real fix is to do what Alan suggests: have the new leader 
respond with "no need, I'm the leader, I'll take it from here, thanks".

The patch I posted earlier has some good improvements in it, but I think we 
need a unit test that proves the code works correctly for the scenario 
described above.

> Fencepost error in LeaderInitiatedRecoveryThread
> ------------------------------------------------
>
>                 Key: SOLR-6511
>                 URL: https://issues.apache.org/jira/browse/SOLR-6511
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Alan Woodward
>            Assignee: Timothy Potter
>         Attachments: SOLR-6511.patch
>
>
> At line 106:
> {code}
>     while (continueTrying && ++tries < maxTries) {
> {code}
> should be
> {code}
>     while (continueTrying && ++tries <= maxTries) {
> {code}
> This is only a problem when called from DistributedUpdateProcessor, as it can 
> have maxTries set to 1, which means the loop is never actually run.


