[ 
https://issues.apache.org/jira/browse/SOLR-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14998758#comment-14998758
 ] 

Mark Miller commented on SOLR-7989:
-----------------------------------

Why do we use the stale clusterstate to see if we are already active and 
prevent publishing active if we are not? Doesn't that just allow for unknown 
races? Shouldn't we just publish active regardless?

> Down replica elected leader, stays down after successful election
> -----------------------------------------------------------------
>
>                 Key: SOLR-7989
>                 URL: https://issues.apache.org/jira/browse/SOLR-7989
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Ishan Chattopadhyaya
>            Assignee: Noble Paul
>             Fix For: 5.4, Trunk
>
>         Attachments: DownLeaderTest.java, DownLeaderTest.java, 
> SOLR-7989.patch, SOLR-7989.patch, SOLR-7989.patch, SOLR-7989.patch, 
> SOLR-8233.patch
>
>
> It is possible that a down replica gets elected as a leader, and that it 
> stays down after the election.
> Here's how I hit upon this:
> * There are 3 replicas: leader, notleader0, notleader1
> * Introduced network partition to isolate notleader0, notleader1 from leader 
> (leader puts these two in LIR via zk).
> * Kill leader, remove partition. Now leader is dead, and both of notleader0 
> and notleader1 are down. There is no leader.
> * Remove LIR znodes in zk.
> * Wait a while, and there happens a (flawed?) leader election.
> * Finally, the state is such that one of notleader0 or notleader1 (which were 
> down before) become leader, but stays down.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to