It's worth trying to take down your entire cluster and then bring it
back up one machine at a time. There _may_ be something like a 3 minute
wait before the replicas on that machine come up: the leader election
process has a 180 second delay before the replicas on that node take
over leadership, to give the last known good leader a chance to come
up.

Continue bringing up one node at a time, waiting patiently until all
the replicas on it are green and a leader has been elected for each
shard. Bringing up the rest of the Solr nodes should be quicker after
that.
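
As a rough illustration of the "wait until every shard has a leader"
check, here is a minimal Python sketch that inspects the JSON returned
by the Collections API CLUSTERSTATUS action. The function names, the
base URL and the collection name are assumptions for illustration only,
and the check assumes the usual response layout in which the leader
replica carries "leader":"true".

```python
import json
from urllib.request import urlopen


def leaderless_shards(cluster_status, collection):
    """Return the sorted names of shards that have no active leader replica.

    `cluster_status` is the parsed JSON of a CLUSTERSTATUS response.
    """
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    missing = []
    for shard_name, shard in shards.items():
        # A shard counts as healthy only if some replica is both the
        # leader and in the "active" state.
        has_leader = any(
            replica.get("leader") == "true" and replica.get("state") == "active"
            for replica in shard.get("replicas", {}).values()
        )
        if not has_leader:
            missing.append(shard_name)
    return sorted(missing)


def fetch_cluster_status(base_url):
    """Fetch CLUSTERSTATUS as parsed JSON.

    `base_url` is an assumed value such as "http://localhost:8983/solr".
    """
    url = base_url + "/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

After each node restart you could call
leaderless_shards(fetch_cluster_status(base_url), "colname") and hold
off on the next node until the shard you care about is no longer in
the returned list.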

Be sure to sequence things so you have known good Solr nodes come up
first for the shard that's wonky. By that I mean that the first node
you bring up for the leaderless shard should be the one with the best
chance of having a totally OK index.


Let's assume the above does bring up a leader for each shard. If
you still have a replica that refuses to come up, use the
DELETEREPLICA Collections API command to remove it. Just for insurance,
I'd take the Solr node down after the DELETEREPLICA and remove the
entire core directory for the replica that didn't come up. Then restart
the node and use the ADDREPLICA Collections API command to put it back.
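
To make the two calls concrete, here is a small Python sketch that only
builds the Collections API URLs for DELETEREPLICA and ADDREPLICA; it
does not send them. The helper names, base URL, replica name
(core_node3) and node name are hypothetical placeholders; take the real
values from your own cluster status.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL; parameters are sorted for stable output."""
    query = {"action": action, "wt": "json"}
    query.update(params)
    return base_url + "/admin/collections?" + urlencode(sorted(query.items()))


def delete_replica_url(base_url, collection, shard, replica):
    """URL for DELETEREPLICA; `replica` is the core node name, e.g. core_node3."""
    return collections_api_url(
        base_url, "DELETEREPLICA",
        collection=collection, shard=shard, replica=replica,
    )


def add_replica_url(base_url, collection, shard, node):
    """URL for ADDREPLICA; `node` is the Solr node name, e.g. host2:8983_solr."""
    return collections_api_url(
        base_url, "ADDREPLICA",
        collection=collection, shard=shard, node=node,
    )
```

Between the two requests is where the node shutdown and the removal of
the old core directory described above would happen.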

If none of that works, you could try hand-editing the state.json file
to _make_ one of the replicas the leader (I'd do this with the Solr
nodes down), but that's not for the faint of heart.

Best,
Erick

On Wed, Feb 1, 2017 at 1:57 PM, Jeff Wartes <jwar...@whitepages.com> wrote:
> Sounds similar to a thread last year:
> http://lucene.472066.n3.nabble.com/Node-not-recovering-leader-elections-not-occuring-tp4287819p4287866.html
>
>
>
> On 2/1/17, 7:49 AM, "tedsolr" <tsm...@sciquest.com> wrote:
>
>     I have version 5.2.1. Short of an upgrade, are there any remedies?
>
>
>     Erick Erickson wrote
>     > What version of Solr? Since 5.4 there's been a FORCELEADER Collections
>     > API call that might help.
>     >
>     > I'd run it with the newly added replicas offline. You only want it to
>     > have good replicas to choose from.
>     >
>     > Best,
>     > Erick
>     >
>     > On Wed, Feb 1, 2017 at 6:48 AM, tedsolr <tsmith@> wrote:
>     >> Update! I did find an error:
>     >>
>     >> 2017-02-01 09:23:22.673 ERROR org.apache.solr.common.SolrException
>     >> :org.apache.solr.common.SolrException: Error getting leader from zk for shard shard1
>     >> ....
>     >> Caused by: org.apache.solr.common.SolrException: Could not get leader props
>     >>         at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:1040)
>     >>         at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:1004)
>     >>         at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:960)
>     >>         ... 14 more
>     >> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/colname/leaders/shard1
>     >>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>     >>
>     >> When I view the cluster status I see that this shard does not have a
>     >> leader. So it appears I need to force the leader designation to the
>     >> "active" replica. How do I do that?
>     >>
>     >>
>     >>
>     >> --
>     >> View this message in context:
>     >> http://lucene.472066.n3.nabble.com/Collection-will-not-replicate-tp4318260p4318265.html
>     >> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
>
>
>
>     --
>     View this message in context:
>     http://lucene.472066.n3.nabble.com/Collection-will-not-replicate-tp4318260p4318283.html
>     Sent from the Solr - User mailing list archive at Nabble.com.
>
>
