[ https://issues.apache.org/jira/browse/SOLR-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848693#comment-13848693 ]
Mark Miller edited comment on SOLR-5552 at 12/15/13 9:50 PM:
-------------------------------------------------------------

I think what we want to do here is look at having the core actually accept http requests before it registers and enters leader election - any issues we find doing this should be issues anyway, as we already have this case on a ZooKeeper expiration and recovery.

was (Author: markrmil...@gmail.com):
I think we want to do here is look at having the core actually accept http requests before it registers and enters leader election - any issues we find there should be issues anyway, as we already have this case on a ZooKeeper expiration and recovery.

> Leader recovery process can select the wrong leader if all replicas for a shard are down and trying to recover
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-5552
>                 URL: https://issues.apache.org/jira/browse/SOLR-5552
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>            Reporter: Timothy Potter
>              Labels: leader, recovery
>         Attachments: SOLR-5552.patch
>
> One particular issue that leads to out-of-sync shards, related to SOLR-4260.
> Here's what I know so far, which admittedly isn't much:
> As cloud85 (replica before it crashed) is initializing, it enters the wait process in ShardLeaderElectionContext#waitForReplicasToComeUp; this is expected and a good thing.
> Some short amount of time in the future, cloud84 (leader before it crashed) begins initializing and gets to a point where it adds itself as a possible leader for the shard (by creating a znode under /collections/cloud/leaders_elect/shard1/election), which leads to cloud85 being able to return from waitForReplicasToComeUp and try to determine who should be the leader.
> cloud85 then tries to run the SyncStrategy, which can never work because in this scenario the Jetty HTTP listener is not active yet on either node, so all replication work that uses HTTP requests fails on both nodes ... PeerSync treats these failures as indicators that the other replicas in the shard are unavailable (or whatever) and assumes success. Here's the log message:
> 2013-12-11 11:43:25,936 [coreLoadExecutor-3-thread-1] WARN solr.update.PeerSync - PeerSync: core=cloud_shard1_replica1 url=http://cloud85:8985/solr couldn't connect to http://cloud84:8984/solr/cloud_shard1_replica2/, counting as success
> The Jetty HTTP listener doesn't start accepting connections until long after this process has completed and already selected the wrong leader.
> From what I can see, we seem to have a leader recovery process that is based partly on HTTP requests to the other nodes, but the HTTP listener on those nodes isn't active yet. We need a leader recovery process that doesn't rely on HTTP requests. Perhaps leader recovery for a shard w/o a current leader may need to work differently than leader election in a shard that has replicas that can respond to HTTP requests? All of what I'm seeing makes perfect sense for leader election when there are active replicas and the current leader fails.
> All this aside, I'm not asserting that this is the only cause for the out-of-sync issues reported in this ticket, but it definitely seems like it could happen in a real cluster.
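[Editor's note] To make the quoted failure mode concrete, here is a minimal sketch of the behavior Timothy describes: a sync pass where every HTTP request to a peer fails with a connection error and each failure is "counted as success". This is not Solr's actual PeerSync code; the class and method names (PeerSyncSketch, requestVersions) are illustrative stand-ins, and only the log wording is taken from the report above.

{code:java}
import java.io.IOException;
import java.util.List;

public class PeerSyncSketch {

    /** Simplified stand-in for the sync step; returns true if the core thinks it is in sync. */
    public boolean sync(String myUrl, List<String> replicaUrls) {
        for (String replica : replicaUrls) {
            try {
                requestVersions(replica); // in the real code this is an HTTP request to the peer
                // ... compare version lists, fetch missing updates, etc.
            } catch (IOException e) {
                // The problematic branch: a refused connection is treated like
                // "nothing newer on that peer", so a node whose peers have not
                // started their HTTP listeners yet still reports a successful sync.
                System.out.println("PeerSync: url=" + myUrl + " couldn't connect to "
                        + replica + ", counting as success");
            }
        }
        return true; // every peer either answered or was skipped, so the sync "succeeds"
    }

    // Stands in for the HTTP call; here it always fails, as when Jetty is not listening yet.
    private void requestVersions(String replicaUrl) throws IOException {
        throw new IOException("Connection refused: " + replicaUrl);
    }
}
{code}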
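[Editor's note] And a sketch of the startup ordering Mark's comment suggests: bring the HTTP listener up before the core registers in ZooKeeper and joins leader election, so that peers are reachable by the time PeerSync runs. The method names below are placeholders marking the ordering only, not Solr's real startup API.

{code:java}
public class NodeStartupSketch {

    public void start() throws Exception {
        startHttpListener();   // 1. accept HTTP requests first (Jetty connectors up)
        registerCores();       // 2. then register cores / create election znodes in ZooKeeper
        joinLeaderElection();  // 3. leader election and PeerSync can now reach the peers
    }

    private void startHttpListener() { /* start the servlet container's connectors */ }

    private void registerCores() { /* create znodes under /collections/cloud/leaders_elect/... */ }

    private void joinLeaderElection() { /* run ShardLeaderElectionContext / SyncStrategy */ }
}
{code}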