On 15 October 2015 at 23:54, Matteo Grolla <matteo.gro...@gmail.com> wrote:

> I don't think so.
> The default behaviour at 4), to my knowledge, is to wait 3 minutes
> (leaderVoteWait) for all replicas to come up, to avoid electing a leader
> with stale data.
> So the observed behaviour is unexpected to me.

If I read your sequence of events properly, I see that at point 4 there is
no replica recovering.
You have only 8984 in the cluster, and in the history of the cluster 8983
has been a leader (during 8984's dead period).
Let's go directly to the code and clear up the doubts!

The 3 minute wait happens here:

if (!weAreReplacement) {
  allReplicasInLine = waitForReplicasToComeUp(leaderVoteWait);
}
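
For context, waitForReplicasToComeUp basically keeps comparing the replicas
known for the shard against the live nodes in ZooKeeper, and returns once they
are all up or once leaderVoteWait (3 minutes by default) has elapsed. A minimal
sketch of that idea, NOT the actual Solr source (the sets and the Supplier are
stand-ins for the real cluster-state reads):

import java.util.Set;
import java.util.function.Supplier;

// Minimal sketch of the "wait for all replicas before electing a leader" idea.
// NOT the real Solr implementation; it only illustrates the timeout loop.
public class WaitForReplicasSketch {

  // Returns true if every expected replica showed up before the timeout.
  static boolean waitForReplicasToComeUp(Set<String> expectedReplicas,
                                         Supplier<Set<String>> liveNodes,
                                         long leaderVoteWaitMs)
      throws InterruptedException {
    long deadline = System.nanoTime() + leaderVoteWaitMs * 1_000_000L;
    while (System.nanoTime() < deadline) {
      if (liveNodes.get().containsAll(expectedReplicas)) {
        return true;   // everyone is up, safe to run a normal election
      }
      Thread.sleep(500); // poll the (refreshed) live-node list again
    }
    return false;        // timed out: proceed with whoever is in line
  }
}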


In our case weAreReplacement should be true, because 8983 has been a
leader.
So we don't wait; we check whether we should be the leader, and we shouldn't,
for this reason:

// maybe active but if the previous leader marked us as down and
// we haven't recovered, then can't be leader
final Replica.State lirState =
    zkController.getLeaderInitiatedRecoveryState(collection, shardId,
        core.getCoreDescriptor().getCloudDescriptor().getCoreNodeName());
if (lirState == Replica.State.DOWN || lirState == Replica.State.RECOVERING) {
  log.warn("Although my last published state is Active, the previous leader marked me " + core.getName()
      + " as " + lirState.toString()
      + " and I haven't recovered yet, so I shouldn't be the leader.");
  return false;
}
log.info("My last published State was Active, it's okay to be the leader.");
return true;
}
log.info("My last published State was "
    + core.getCoreDescriptor().getCloudDescriptor().getLastPublished()
    + ", I won't be the leader.");
// TODO: and if no one is a good candidate?
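
Putting the two checks together, the decision the candidate makes can be
summarised roughly like this. This is only a paraphrase of the logic quoted
above, not the actual method; the two states are passed in as plain arguments
(assuming solr-solrj on the classpath for Replica.State):

import org.apache.solr.common.cloud.Replica;

// Simplified paraphrase of the leader-candidacy decision discussed above.
// NOT the actual Solr method; it only mirrors the two quoted checks.
public class ShouldIBeLeaderSketch {

  static boolean shouldIBeLeader(Replica.State lastPublishedState,
                                 Replica.State lirState) {
    if (lastPublishedState != Replica.State.ACTIVE) {
      // "My last published State was <state>, I won't be the leader."
      return false;
    }
    if (lirState == Replica.State.DOWN || lirState == Replica.State.RECOVERING) {
      // The previous leader put us into leader-initiated recovery and we
      // haven't recovered yet, so we must not become the leader.
      return false;
    }
    // Last published state was ACTIVE and no LIR flag blocks us.
    return true;
  }
}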


The second possible wait happens when registering the core:

// in this case, we want to wait for the leader as long as the leader might
// wait for a vote, at least - but also long enough that a large cluster has
// time to get its act together
String leaderUrl = getLeader(cloudDesc, leaderVoteWait + 600000);
I scrolled through the code a bit, and I think that because 8984 is the only
live node, this second wait will not happen: no leader is there yet and an
election should already have happened before this point.
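
As a rough illustration, that registration step just keeps retrying the leader
lookup until a leader shows up in ZooKeeper or the combined timeout runs out:
leaderVoteWait plus an extra 600000 ms, i.e. about 13 minutes with the 3 minute
default. A hypothetical sketch, where lookupLeader is a stand-in for the real
ZooKeeper read:

import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch of the "wait for the leader while registering" retry.
// NOT the actual ZkController code; lookupLeader stands in for the real
// ZooKeeper lookup and is empty while no leader is registered for the shard.
public class RegisterWaitSketch {

  static String getLeaderUrl(Supplier<Optional<String>> lookupLeader,
                             long leaderVoteWaitMs) throws InterruptedException {
    long timeoutMs = leaderVoteWaitMs + 600_000L;  // leaderVoteWait + 10 minutes
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (System.nanoTime() < deadline) {
      Optional<String> leader = lookupLeader.get();
      if (leader.isPresent()) {
        return leader.get();  // election finished, register against this leader
      }
      Thread.sleep(250);      // keep retrying until a leader appears
    }
    throw new IllegalStateException("No leader found within " + timeoutMs + " ms");
  }
}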

I have no time to debug this right now (it would be interesting if you could),
but I would bet that a single live node, which came up while no one was
recovering, will become the leader.
Actually, waiting 3 minutes for the old replica to come back (which might
never happen) is a little counterintuitive, because new replicas could join in
the meantime, and it is not really reasonable to keep the cluster without a
leader for 3 minutes ...
I would suggest some debugging to get a better idea of the internals; I would
be really interested in a deeper insight!


> I created a cluster of 2 nodes by copying the server dir to node1 and node2
> and using those as the solr home for the nodes.
> I created the collection with
> bin/solr create -c test
> so it's using the built-in schemaless configuration.
>
> There's nothing custom; it should all be pretty standard.
>

Adding some logging should help; maybe you are hitting some additional
waiting time that was added after 4.10.
We should start investigating once we have more insight from the logs.

Cheers



>
> 2015-10-15 17:42 GMT+02:00 Alessandro Benedetti <benedetti.ale...@gmail.com>:
>
> > Hi Matteo,
> >
> > On 15 October 2015 at 16:16, Matteo Grolla <matteo.gro...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >       I'm doing this test:
> > > collection test is replicated on two solr nodes running on 8983 and 8984,
> > > using an external zk
> > >
> > > 1) turn OFF solr 8984
> > > 2) add and commit a doc x on solr 8983
> > > 3) turn OFF solr 8983
> > > 4) turn ON solr 8984
> > >
> > At this point 8984 will be elected leader; being the only element in the
> > cluster, it cannot recover from anyone, so it will not replicate doc x.
> >
> > > 5) shortly after (leader still not elected), turn ON solr 8983
> > >
> > I assume that even if you are not able to see it, the leader election had
> > actually already started, without taking 8983 into consideration.
> >
> > > 6) 8984 is elected as leader
> > >
> > As expected
> >
> > > 7) doc x is present on 8983 but not on 8984 (checked by issuing a query)
> > >
> > This is expected as well.
> >
> > It is a real edge case, but at the current status I expect the behaviour
> > you are observing to be the expected one.
> > Probably the leader election should become smarter; for example, any time a
> > node comes back to the cluster it could be checked, and if it should be the
> > leader, a new election triggered.
> > Just thinking out loud :)
> >
> >
> >
> > >
> > > attached are the logs of both solr
> > >
> > > BTW I'm using java 1.8.045 on OS X Yosemite, and solr 5.2.1 seems much
> > > slower to start up than solr 4.10.3; it seems to be waiting on something
> > >
> >
> > I can not see any attached file. Do you have any suggester in place?
> > Anyway it is weird, as I assume you kept the solrconfig.xml the same.
> > Can you list the components you are currently using?
> >
> > Cheers
> >
> >
> >
> > --
> > --------------------------
> >
> > Benedetti Alessandro
> > Visiting card - http://about.me/alessandro_benedetti
> > Blog - http://alexbenedetti.blogspot.co.uk
> >
> > "Tyger, tyger burning bright
> > In the forests of the night,
> > What immortal hand or eye
> > Could frame thy fearful symmetry?"
> >
> > William Blake - Songs of Experience -1794 England
> >
>



-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
