[ https://issues.apache.org/jira/browse/SOLR-8069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14866031#comment-14866031 ]
Mark Miller commented on SOLR-8069:
-----------------------------------

bq. thus I prefer the simple logic of "do this action only if our zookeeper session state is exactly what it was when we decided to do it". Anyhow, this is probably beyond the scope of this JIRA.

I don't see an easy way to do that in this case. Almost all the solutions that fit with the code have the exact same holes / races.

I think the local leader check around getting the leader context is the strongest thing I can think of so far, other than adding further defensive checks. I don't know that much more is needed, though.

If the context returned is from the leader, great, its zkparentversion will match. If the context is somehow not the right one, it won't match. We get a context, and we act only if it is the context for the leader in ZK, rather than acting whenever the context has a node in line. I'd say that is a pretty strong improvement.

This should only work if the node is a valid leader by both its local state and ZooKeeper.

> Leader Initiated Recovery can put the replica with the latest data into LIR
> and a shard will have no leader even on restart.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-8069
>                 URL: https://issues.apache.org/jira/browse/SOLR-8069
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Mark Miller
>         Attachments: SOLR-8069.patch, SOLR-8069.patch
>
>
> I've seen this twice now. Need to work on a test.
> When some issues hit all the replicas at once, you can end up in a situation
> where the rightful leader was put or put itself into LIR. Even on restart,
> this rightful leader won't take leadership and you have to manually clear the
> LIR nodes.
> It seems that if all the replicas participate in election on startup, LIR
> should just be cleared.
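
For illustration only, here is a minimal in-memory sketch of the check described in the comment above: act only when the node is the leader by its local state and the version recorded in its leader context still matches the election parent. It is not the code from SOLR-8069.patch. `ElectionParent`, `LeaderContext`, and `GuardedAction` are hypothetical names, and a plain counter stands in for the ZooKeeper parent node version.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory stand-in for the election parent node; only its version matters here. */
final class ElectionParent {
    private final AtomicInteger version = new AtomicInteger(0);

    int version() {
        return version.get();
    }

    /** Simulates any change under the election node, e.g. another node taking leadership. */
    void bump() {
        version.incrementAndGet();
    }
}

/** Hypothetical snapshot of what a node recorded when it became leader. */
final class LeaderContext {
    final String nodeName;
    final int zkParentVersion; // version of the election parent seen at election time

    LeaderContext(String nodeName, int zkParentVersion) {
        this.nodeName = nodeName;
        this.zkParentVersion = zkParentVersion;
    }
}

final class GuardedAction {
    /**
     * Runs the action only if the node believes it is leader locally AND the
     * version stored in its context still matches the current parent version.
     * Returns true if the action ran.
     */
    static boolean runIfStillLeader(LeaderContext ctx, boolean locallyLeader,
                                    ElectionParent parent, Runnable action) {
        if (ctx == null || !locallyLeader) {
            return false; // local state says we are not the leader
        }
        if (ctx.zkParentVersion != parent.version()) {
            return false; // context is stale: something changed since election
        }
        // Note: in this sketch the check and the action are not atomic, so a
        // change can still slip in between them.
        action.run();
        return true;
    }
}
```

With a context created at version 0, the action runs; after `parent.bump()` the same context is rejected. Against a real ZooKeeper ensemble, the remaining window between check and action could be narrowed by putting a version check (`Op.check`) and the write in a single `multi` call, though whether that fits this code path is an assumption.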