On Tue, Jan 8, 2019 at 7:02 PM Jamo Luhrsen <jluhr...@gmail.com> wrote:

>
>
> On 1/8/19 3:52 PM, Tom Pantelis wrote:
> >
> >
> > On Tue, Jan 8, 2019 at 5:40 PM Jamo Luhrsen <jluhr...@gmail.com> wrote:
> >
> >     I am debugging a 3node (cluster) netvirt CSIT job, and one of my controllers has a bunch of messages like this:
> >     2019-01-03T03:04:38,335 | INFO  | opendaylight-cluster-data-shard-dispatcher-21 | Shard | 229 - org.opendaylight.controller.sal-clustering-commons - 1.8.2 | member-3-shard-default-config (Follower): The log is not empty but the prevLogIndex 19047 was not found in it - lastIndex: 17875, snapshotIndex: -1
> >
> >     2019-01-03T03:04:38,335 | INFO  | opendaylight-cluster-data-shard-dispatcher-21 | Shard | 229 - org.opendaylight.controller.sal-clustering-commons - 1.8.2 | member-3-shard-default-config (Follower): Follower is out-of-sync so sending negative reply: AppendEntriesReply [term=23, success=false, followerId=member-3-shard-default-config, logLastIndex=17875, logLastTerm=4, forceInstallSnapshot=false, payloadVersion=9, raftVersion=3]
> >
> >     log:
> >     https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-3node-0cmb-1ctl-2cmp-openstack-queens-upstream-stateful-fluorine/168/odl_3/odl3_karaf.log.gz
> >
> >     The job has lots of CSIT failures, so I know something is broken somewhere. I don't know whether the above has anything to do with it. Maybe it's even expected in this scenario.
> >
> >
> > Those messages can happen if a follower gets behind the leader, especially if it gets isolated: AppendEntries messages from the leader get backed up, and the follower then receives a burst of them when it reconnects to the leader. It looks like it eventually caught up to the leader, although, from the output, the distro you're testing with is missing https://git.opendaylight.org/gerrit/#/c/78929/, which should speed up the sync process in that case and alleviate most of those messages.
>
> Thanks, Tom, that makes sense: this is on the one controller that came up last, after bouncing them all.
>

Yeah, it eventually synced, but it went down an inefficient path that resulted in 1176 messages from the leader to sync it. Actually, the patch I mentioned above was for a different case and wouldn't help here. I have an idea where we can optimize it to eliminate those extra messages and make the sync faster.
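
To make the failure mode concrete, here is a minimal, illustrative sketch of the follower-side consistency check behind those log messages. To be clear, this is not the actual Shard code from sal-clustering-commons; the class and method names (FollowerSketch, handleAppendEntries) are made up, and it only shows the generic Raft rule: if the leader's prevLogIndex is not found in the follower's log, the follower replies success=false along with its own logLastIndex.

    // Illustrative sketch only, not the actual OpenDaylight Shard code.
    final class FollowerSketch {

        static final class AppendEntriesReply {
            final long term;
            final boolean success;
            final long logLastIndex;  // tells the leader how far behind we are
            final long logLastTerm;

            AppendEntriesReply(final long term, final boolean success,
                    final long logLastIndex, final long logLastTerm) {
                this.term = term;
                this.success = success;
                this.logLastIndex = logLastIndex;
                this.logLastTerm = logLastTerm;
            }
        }

        private final long currentTerm = 23;   // values taken from the log excerpt above
        private final long lastIndex = 17875;  // index of the last entry in our log
        private final long lastTerm = 4;

        // prevLogIndex identifies the entry immediately preceding the new
        // entries the leader is sending (19047 in the log excerpt above).
        AppendEntriesReply handleAppendEntries(final long prevLogIndex, final long prevLogTerm) {
            if (prevLogIndex > lastIndex) {
                // "The log is not empty but the prevLogIndex was not found
                // in it": we are out-of-sync, so send a negative reply
                // carrying our logLastIndex. The leader must back up and
                // re-send earlier entries, and each backup step costs a
                // round trip, which is where the extra messages come from.
                return new AppendEntriesReply(currentTerm, false, lastIndex, lastTerm);
            }
            // term and prevLogTerm checks plus the actual append are elided here
            return new AppendEntriesReply(currentTerm, true, lastIndex, lastTerm);
        }
    }

One common way Raft implementations cut down those round trips is for the leader to use the logLastIndex carried in the negative reply to move its nextIndex for the follower back in a single jump, rather than probing backwards an entry at a time.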


>
> JamO
>
> >
> >     Thanks,
> >     JamO
>
_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev
