[ 
https://issues.apache.org/jira/browse/SOLR-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343551#comment-14343551
 ] 

Shalin Shekhar Mangar edited comment on SOLR-7109 at 3/2/15 6:47 PM:
---------------------------------------------------------------------

Here's a patch which uses ZooKeeper 'multi' transactions to make sure that the 
LIR state can be set only when the requesting leader node is still alive. This 
ensures that no matter how long the node is stalled (network partition, long 
GC pause, whatever), the node setting the LIR state must still be the leader or 
else the LIR state cannot be set.

Initially I attempted to use the shard leader path as the 'exists' check in the 
'multi' command, but this doesn't work because the leader path is always created 
fresh, which means that its version is always 0 and the check always succeeds 
regardless of who the current leader is. This is why we must use the election's 
leader sequence path.
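
To make the difference between the two check paths concrete, here is a small 
toy model. It is not the patch and does not use the ZooKeeper client: 
LirMultiSketch, checkThenSet and all the paths are made up for illustration. 
It only mimics two ZooKeeper behaviours, that a freshly created znode starts at 
version 0 and that a 'multi' applies its write only if its check passes (in the 
real client these would be Op.check and Op.setData passed to 
ZooKeeper.multi). Unlike real setData, the toy write creates the target node 
if it is missing, to keep the sketch short.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory stand-in for versioned znodes; hypothetical, not Solr or ZooKeeper code. */
class LirMultiSketch {
    private final Map<String, Integer> versions = new HashMap<>();
    private final Map<String, String> data = new HashMap<>();

    /** A newly created node always starts at version 0. */
    void create(String path, String value) {
        versions.put(path, 0);
        data.put(path, value);
    }

    /** Models an ephemeral node disappearing when its owner's session ends. */
    void delete(String path) {
        versions.remove(path);
        data.remove(path);
    }

    String get(String path) {
        return data.get(path);
    }

    /**
     * Models multi(check(checkPath, expectedVersion), setData(targetPath, value)):
     * the write happens only if checkPath exists at the expected version.
     */
    boolean checkThenSet(String checkPath, int expectedVersion,
                         String targetPath, String value) {
        Integer actual = versions.get(checkPath);
        if (actual == null || actual != expectedVersion) {
            return false; // check failed, nothing is written
        }
        data.put(targetPath, value);
        // First write leaves the node at version 0, later writes bump it by one.
        versions.merge(targetPath, 0, (old, unused) -> old + 1);
        return true;
    }

    public static void main(String[] args) {
        // Illustrative paths only; the real Solr layout is longer.
        String leaderPath = "/leaders/shard1";
        String electionA = "/election/n_0000000001";
        String electionB = "/election/n_0000000002";
        String lirPath = "/lir/shard1/replica";

        LirMultiSketch zk = new LirMultiSketch();
        zk.create(electionA, "A");
        zk.create(leaderPath, "A");

        // A stalls, its session ends, and B becomes leader.
        zk.delete(leaderPath);
        zk.delete(electionA);
        zk.create(electionB, "B");
        zk.create(leaderPath, "B"); // recreated fresh, so version is 0 again

        // Stale leader A wakes up and checks the shard leader path.
        System.out.println(zk.checkThenSet(leaderPath, 0, lirPath, "down")); // true
        zk.delete(lirPath);

        // Stale leader A checks its own election node instead.
        System.out.println(zk.checkThenSet(electionA, 0, lirPath, "down")); // false
    }
}
```

The first call succeeds even though A is no longer the leader, because the 
recreated leader path looks identical to the one A remembers. The second call 
fails because A's election node is gone, so the stale write is rejected.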

This is just a first cut of this approach. I intend to refactor some of these 
LIR methods -- they have become too big. I will also write a test which 
exercises these new transactional semantics and reproduces the failure.

Edit - I also removed the replicaUrl parameter from 
ZkController.ensureReplicaInLeaderInitiatedRecovery because replicaProps was 
already being passed as a parameter and the replica url can be derived from it.


> Indexing threads stuck during network partition can put leader into down state
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-7109
>                 URL: https://issues.apache.org/jira/browse/SOLR-7109
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.10.3, 5.0
>            Reporter: Shalin Shekhar Mangar
>             Fix For: Trunk, 5.1
>
>         Attachments: SOLR-7109.patch
>
>
> I found this recently while running some Jepsen tests. Some threads get 
> stuck on ZooKeeper operations for a long time in the 
> ZkController.updateLeaderInitiatedRecoveryState method, and when they wake up 
> they go ahead and set the LIR state to down. But in the meantime, a new 
> leader has been elected, and sometimes you end up in a state where the leader 
> itself is put into recovery, causing the shard to reject all writes.


