1. Erick, thanks. I agree that it is really serious, but I think the
3-minute wait in this case was not necessary.
In my case it was a deadlock, which smells like some kind of bug:
one replica is waiting for the other to come up before it takes leadership,
while the other is waiting for the election results.
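To spell out the deadlock shape as I read it from the stack traces quoted
below (a simplified illustration using the method names from the traces,
not actual code):

replica1: ShardLeaderElectionContext.waitForReplicasToComeUp  -> waits for replica2 before taking leadership
replica2: ZkController.waitForLeaderToSeeDownState            -> waits for a leader to appear in ZK

Neither condition can be satisfied, so both sides block until
leaderVoteWait expires.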
If I am able to reproduce it on 5.2.1, is it legitimate to file a JIRA
issue for it?

2. Regarding session timeouts, there's something about the configuration
that I don't understand.
If zkClientTimeout is set to 30 seconds, how come I see in the log that the
session expired after ~50 seconds?
Maybe I have a mismatch between the ZooKeeper and Solr configuration?
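For reference, this is where the client-side timeout lives in my solr.xml
(a sketch of the stock 4.10-style solrcloud section; the values mirror my
setup):

<solrcloud>
  <str name="host">${host:}</str>
  <int name="hostPort">${jetty.port:8983}</int>
  <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
</solrcloud>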

3. Returning to the question of the leaderVoteWait parameter, I have seen
in a few threads that it may be reduced to a minimum.
I'm not clear on its full meaning, but I understand that it is meant to
prevent loss of updates on cluster startup.
Can anyone confirm/clarify that?
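In case anyone wants to experiment, my understanding is that it can be
overridden in the same solrcloud section of solr.xml; the 10000 ms below is
only an illustrative value, not a recommendation:

<solrcloud>
  <!-- default is 180000 ms (3 minutes); lowering it trades startup time
       against the risk of electing an out-of-date leader -->
  <int name="leaderVoteWait">10000</int>
</solrcloud>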

Links for leaderVoteWait:

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/%3ccajt9wnhivirpn79kttcn8ekafevhhmqwkfl-+i16kbz0ogl...@mail.gmail.com%3E

http://qnalist.com/questions/4812859/waitforleadertoseedownstate-when-leader-is-down

Relevant part of my ZooKeeper conf:
tickTime=2000
initLimit=10
syncLimit=5
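
If I read the ZooKeeper admin docs correctly, the server clamps each
client's requested session timeout to [minSessionTimeout, maxSessionTimeout],
which default to 2*tickTime and 20*tickTime when not set explicitly (as in
my conf above):

minSessionTimeout = 2  * tickTime = 4000 ms
maxSessionTimeout = 20 * tickTime = 40000 ms

So my requested 30 seconds should be within bounds, and I suspect the ~50
seconds in the log is just how long the client actually went without
hearing from the server (e.g. across the network outage or a GC pause)
before it reported the timeout, rather than the negotiated timeout itself.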



On Tue, Aug 11, 2015 at 1:06 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Not that I know of. With ZK as the "one source of truth", dropping below
> quorum is Really Serious, so having to wait 3 minutes or so for action to
> be taken is the fallback.
>
> Best,
> Erick
>
> On Mon, Aug 10, 2015 at 1:34 PM, danny teichthal <dannyt...@gmail.com>
> wrote:
> > Erick, I assume you are referring to zkClientTimeout; it is set to 30
> > seconds. I also see these messages on the Solr side:
> >  "Client session timed out, have not heard from server in 48865ms for
> > sessionid 0x44efbb91b5f0001, closing socket connection and attempting
> > reconnect".
> > So, I'm not sure what the actual disconnection duration was, but it
> > could have been up to a minute.
> > We are working on finding the root cause of the network issues, but
> > assuming disconnections will always occur, are there any other options
> > to overcome these issues?
> >
> >
> >
> > On Mon, Aug 10, 2015 at 11:18 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> I didn't see the zk timeout you set (just skimmed). But if your
> >> Zookeeper was down _very_ temporarily, it may suffice to up the ZK
> >> timeout. The default in the 4.10 time-frame (if I remember correctly)
> >> was 15 seconds, which has proven to be too short in many circumstances.
> >>
> >> Of course if your ZK was down for minutes this wouldn't help.
> >>
> >> Best,
> >> Erick
> >>
> >> On Mon, Aug 10, 2015 at 1:06 PM, danny teichthal <dannyt...@gmail.com>
> >> wrote:
> >> > Hi Alexandre,
> >> > Thanks for your reply; I looked at the release notes.
> >> > There is one bug fix, SOLR-7503
> >> > <https://issues.apache.org/jira/browse/SOLR-7503>, register cores
> >> > asynchronously.
> >> > It may reduce the registration time since it is done in parallel, but
> >> > still, 3 minutes (leaderVoteWait) is a long time to recover from a few
> >> > seconds of disconnection.
> >> >
> >> > Apart from that one, I don't see any bug fix that addresses the same
> >> > problem.
> >> > I am able to reproduce it on 4.10.4 pretty easily; I will also try it
> >> > with 5.2.1 and see if it reproduces.
> >> >
> >> > Anyway, since migrating to 5.2.1 is not an option for me in the short
> >> > term, I'm left with the question of whether reducing leaderVoteWait
> >> > may help here, and what the consequences may be.
> >> > If I understand correctly, there might be a chance of losing updates
> >> > that were made on the leader.
> >> > From my side, losing availability for 3 minutes is a lot worse.
> >> >
> >> > I would really appreciate feedback on this.
> >> >
> >> >
> >> >
> >> >
> >> > On Mon, Aug 10, 2015 at 6:55 PM, Alexandre Rafalovitch <arafa...@gmail.com>
> >> > wrote:
> >> >
> >> >> Did you look at the release notes for Solr versions after your own?
> >> >>
> >> >> I am pretty sure some similar things were identified and/or resolved
> >> >> for 5.x. It may not help if you cannot migrate, but it would at least
> >> >> give confirmation and maybe a workaround for what you are facing.
> >> >>
> >> >> Regards,
> >> >>    Alex.
> >> >> ----
> >> >> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> >> >> http://www.solr-start.com/
> >> >>
> >> >>
> >> >> On 10 August 2015 at 11:37, danny teichthal <dannyt...@gmail.com>
> >> >> wrote:
> >> >> > Hi,
> >> >> > We are using SolrCloud with Solr 4.10.4.
> >> >> > In the past week we encountered a problem where all of our servers
> >> >> > disconnected from the ZooKeeper cluster.
> >> >> > This might be OK; the problem is that after reconnecting to
> >> >> > ZooKeeper, it looks like for every collection both replicas do not
> >> >> > have a leader and are stuck in some kind of deadlock for a few
> >> >> > minutes.
> >> >> >
> >> >> > From what we understand:
> >> >> > One of the replicas assumes it will be the leader and at some point
> >> >> > starts waiting on leaderVoteWait, which is 3 minutes by default.
> >> >> > The other replica is stuck on this part of code for a few minutes:
> >> >> >   at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:957)
> >> >> >   at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:921)
> >> >> >   at org.apache.solr.cloud.ZkController.waitForLeaderToSeeDownState(ZkController.java:1521)
> >> >> >   at org.apache.solr.cloud.ZkController.registerAllCoresAsDown(ZkController.java:392)
> >> >> >
> >> >> > Looks like replica 1 waits for a leader to be registered in
> >> >> > ZooKeeper, but replica 2 is waiting for replica 1
> >> >> > (org.apache.solr.cloud.ShardLeaderElectionContext.waitForReplicasToComeUp).
> >> >> >
> >> >> > We have 100 collections distributed across 3 pairs of Solr nodes.
> >> >> > Each collection has one shard with 2 replicas.
> >> >> > As I understand from the code and logs, all the collections are
> >> >> > registered synchronously, which means we have to wait 3 minutes *
> >> >> > (number of collections) for the whole cluster to come up. That
> >> >> > could be more than an hour!
> >> >> >
> >> >> >
> >> >> >
> >> >> > 1. We thought about lowering leaderVoteWait to solve the problem,
> >> >> > but we are not sure what the risk is.
> >> >> >
> >> >> > 2. The following thread is very similar to our case:
> >> >> > http://qnalist.com/questions/4812859/waitforleadertoseedownstate-when-leader-is-down
> >> >> > Does anybody know if it is indeed a bug and if there's a related
> >> >> > JIRA issue?
> >> >> >
> >> >> > 3. I see this in the logs before the reconnection: "Client session
> >> >> > timed out, have not heard from server in 48865ms for sessionid
> >> >> > 0x44efbb91b5f0001, closing socket connection and attempting
> >> >> > reconnect". Does it mean there was a disconnection of over 50
> >> >> > seconds between Solr and ZooKeeper?
> >> >> >
> >> >> >
> >> >> > Thanks in advance for your kind answer
> >> >>
> >>
>
