SolrCloud is intended to work in the rolling restart case...

Index size, segment counts, and segment names can (and will)
be different on different replicas of the same shard without
anything being amiss. Hard commits happen at different
times across the replicas in a shard, so the merge logic
kicks in independently and may (in all probability,
eventually will) pick different segments to merge, with
varying numbers of deleted docs getting purged, etc.

The numFound reported by a q=*:*&distrib=false query against each
replica (or the numDocs shown for each core in the admin screen)
should be identical, though, if
1> you've issued a hard commit with openSearcher=true _or_
     a soft commit, and
2> you haven't been indexing, or haven't issued a commit
     as in <1>, since you started looking.
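
For instance, a quick SolrJ check (just a sketch against the 4.x-era
API; the host and core name below are placeholders, and you'd run it
once against each replica of the shard):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class NumFoundCheck {
  public static void main(String[] args) throws Exception {
    // Point straight at one replica's core, not the collection.
    HttpSolrServer replica =
        new HttpSolrServer("http://host1:8983/solr/collection1_shard1_replica1");
    SolrQuery q = new SolrQuery("*:*");
    q.set("distrib", "false"); // query only this core, no distributed fan-out
    q.setRows(0);              // we only care about numFound
    System.out.println("numFound: " + replica.query(q).getResults().getNumFound());
    replica.shutdown();
  }
}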

Best,
Erick

On Tue, Jan 13, 2015 at 4:20 AM, Zisis Tachtsidis <zist...@runbox.com> wrote:
> Daniel Collins wrote
>> Is it important where your leader is?  If you just want to minimize
>> leadership changes during rolling re-start, then you could restart in the
>> opposite order (S3, S2, S1).  That would give only 1 transition, but the
>> end result would be a leader on S2 instead of S1 (not sure whether that's
>> important to you or not).  I know it's not a "fix", but it might be a
>> workaround until the whole leadership-moving work is done?
>
> I think that rolling restarting the machines in the opposite order
> (S3, S2, S1) will result in S3 being the leader. It's a valid approach, but
> wouldn't I have to revert to the original order (S1, S2, S3) to achieve the
> same result in the next rolling restart? That adds operational cost and
> complexity that I want to avoid.
>
>
> Erick Erickson wrote
>>> Just skimming, but the problem here that I ran into was with the
>>> listeners. Each _Solr_ instance out there is listening to one of the
>>> ephemeral nodes (the "one in front"). So deleting a node does _not_
>>> change which ephemeral node the associated Solr instance is listening
>>> to.
>>>
>>> So, for instance, when you delete S2..n-000001 and re-add it, S2 is
>>> still looking at S1....n-000000 and will continue looking at
>>> S1...n-000000 until S1....n-000000 is deleted.
>>>
>>> Deleting S2..n-000001 will wake up S3 though, which should now be
>>> looking at S1....n-0000000. Now you have two Solr listeners looking at
>>> the same ephemeral node. The key is that deleting S2...n-000001 does
>>> _not_ wake up S2, just any solr instance that has a watch on the
>>> associated ephemeral node.
>
> Thanks for the info, Erick. I wasn't aware of this "linked-list" listener
> structure between the zk nodes. Based on what you've said, though, I've
> changed my implementation a bit, and it seems to be working at first
> glance. Of course it's not fully reliable yet, but it looks promising.
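>
> As I understand it now, the recipe is: sort the election queue by
> sequence number and watch only the node immediately in front of you.
> Roughly (a sketch, not the actual Solr code; zk, electionPath, and
> myNodeName are placeholders):
>
> import java.util.Collections;
> import java.util.List;
> import org.apache.zookeeper.WatchedEvent;
> import org.apache.zookeeper.Watcher;
> import org.apache.zookeeper.ZooKeeper;
>
> void watchPredecessor(final ZooKeeper zk, final String electionPath,
>                       String myNodeName) throws Exception {
>   List<String> children = zk.getChildren(electionPath, false);
>   Collections.sort(children); // real code sorts by the -n_ sequence suffix
>   int me = children.indexOf(myNodeName);
>   if (me == 0) {
>     // Lowest sequence number in the queue: this instance is the leader.
>   } else {
>     // Watch only the node directly in front. The watch fires when *that*
>     // node goes away; deleting any other node never wakes this instance.
>     String predecessor = electionPath + "/" + children.get(me - 1);
>     zk.exists(predecessor, new Watcher() {
>       public void process(WatchedEvent event) {
>         // Predecessor gone: re-examine the queue, possibly become leader.
>       }
>     });
>   }
> }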
>
> My original attempt
>> S1:-n_0000000000 (no code running here)
>> S2:-n_0000000004 (code deleting zknode -n_0000000001 and creating
>> -n_0000000004)
>> S3:-n_0000000003 (code deleting zknode -n_0000000002 and creating
>> -n_0000000003)
>
> has been changed to
> S1:-n_0000000000 (no code running here)
> S2:-n_0000000003 (code deleting zknode -n_0000000001 and creating
> -n_0000000003 using EPHEMERAL_SEQUENTIAL)
> S3:-n_0000000002 (no code running here)
>
> Once S1 is shut down, S3 becomes the leader, since it now watches S1,
> according to what you've said.
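>
> The node swap itself is just a delete plus a fresh EPHEMERAL_SEQUENTIAL
> create, run from S2's own ZooKeeper session (a sketch; the paths and the
> "S2-n_" prefix are placeholders for the real election-node names):
>
> import org.apache.zookeeper.CreateMode;
> import org.apache.zookeeper.ZooDefs;
> import org.apache.zookeeper.ZooKeeper;
>
> void requeue(ZooKeeper zk, String electionPath, String oldNodeName)
>     throws Exception {
>   // Must run in S2's own session so the new ephemeral node disappears
>   // if S2 does.
>   zk.delete(electionPath + "/" + oldNodeName, -1); // -1 = any version
>   // Rejoin at the back; ZK appends the next sequence number
>   // (-n_0000000003 in the layout above).
>   zk.create(electionPath + "/S2-n_", new byte[0],
>       ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
> }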
>
> The original reason I pursued this "minimize leadership changes" quest is
> that leadership changes _could_ lead to "data loss" in some scenarios. I'm
> not entirely sure about this, so correct me if I'm wrong, but here is my
> reasoning.
>
> If you have incoming indexing requests during a rolling restart, could
> there be a case, while the current leader is shutting down, where the
> leader-to-be node doesn't have time to sync with the leader that is going
> down? Everyone would then sync to the new leader and miss some updates.
> I've seen an installation where the replicas' index sizes differed, and
> the discrepancy got worse over time.