On Tue, Jan 8, 2019 at 7:02 PM Jamo Luhrsen <jluhr...@gmail.com> wrote:
> On 1/8/19 3:52 PM, Tom Pantelis wrote:
> > On Tue, Jan 8, 2019 at 5:40 PM Jamo Luhrsen <jluhr...@gmail.com> wrote:
> >
> >     I am debugging a 3-node (cluster) netvirt CSIT job, and one of my
> >     controllers has a bunch of messages like this:
> >
> >     2019-01-03T03:04:38,335 | INFO | opendaylight-cluster-data-shard-dispatcher-21 | Shard | 229 -
> >     org.opendaylight.controller.sal-clustering-commons - 1.8.2 | member-3-shard-default-config (Follower):
> >     The log is not empty but the prevLogIndex 19047 was not found in it - lastIndex: 17875, snapshotIndex: -1
> >
> >     2019-01-03T03:04:38,335 | INFO | opendaylight-cluster-data-shard-dispatcher-21 | Shard | 229 -
> >     org.opendaylight.controller.sal-clustering-commons - 1.8.2 | member-3-shard-default-config (Follower):
> >     Follower is out-of-sync so sending negative reply: AppendEntriesReply [term=23, success=false,
> >     followerId=member-3-shard-default-config, logLastIndex=17875, logLastTerm=4,
> >     forceInstallSnapshot=false, payloadVersion=9, raftVersion=3]
> >
> >     log:
> >     https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-3node-0cmb-1ctl-2cmp-openstack-queens-upstream-stateful-fluorine/168/odl_3/odl3_karaf.log.gz
> >
> >     The job has lots of CSIT failures, so I know something is broken
> >     somewhere. I don't know whether the above has anything to do with it.
> >     Maybe it's even expected in this scenario.
> >
> >     Thanks,
> >     JamO
> >
> > Those messages can happen if a follower gets behind the leader, especially
> > if it gets isolated so that AppendEntries messages from the leader get
> > backed up, and the follower then receives a burst of messages when it
> > reconnects to the leader. It looks like it eventually caught up to the
> > leader, although, from the output, the distro you're testing with is
> > missing https://git.opendaylight.org/gerrit/#/c/78929/ which should speed
> > up the sync process in that case and alleviate most of those messages.
>
> Thanks Tom, that makes sense, as this is on one controller that came up
> last after bouncing them all. Yeah, it eventually synced, but it went down
> an inefficient path that resulted in 1176 messages from the leader to sync
> it.

Actually, that patch I mentioned above was for a different case and wouldn't
help here. I have an idea where we can optimize it to eliminate those extra
messages and make the sync faster.

> JamO
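For context, the two log lines discussed above come from Raft's log-consistency check: a follower rejects AppendEntries whose prevLogIndex lies beyond its own lastIndex, and the leader must back up and resend earlier entries. Below is a minimal sketch of that check with hypothetical class and method names (this is illustrative only, not the actual sal-akka-raft code), seeded with the values from the log snippet (logLastIndex=17875, logLastTerm=4):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only -- names are hypothetical, not the actual
// OpenDaylight sal-akka-raft classes. Shows the Raft consistency check
// behind the "prevLogIndex ... was not found" / negative-reply messages.
public class FollowerConsistencyCheck {
    // Term of the entry at each index; list position i is log index i.
    private final List<Long> termByIndex = new ArrayList<>();

    void append(long term) {
        termByIndex.add(term);
    }

    long lastIndex() {
        return termByIndex.size() - 1;
    }

    // true:  follower can accept the appended entries.
    // false: follower sends AppendEntriesReply with success=false.
    boolean matches(long prevLogIndex, long prevLogTerm) {
        if (prevLogIndex < 0) {
            return true; // leader is sending from the start of the log
        }
        if (prevLogIndex > lastIndex()) {
            return false; // "prevLogIndex was not found" -- follower is behind
        }
        // Entry exists; its term must also match the leader's prevLogTerm.
        return termByIndex.get((int) prevLogIndex) == prevLogTerm;
    }

    public static void main(String[] args) {
        FollowerConsistencyCheck follower = new FollowerConsistencyCheck();
        for (int i = 0; i <= 17875; i++) {
            follower.append(4); // mirrors logLastIndex=17875, logLastTerm=4
        }
        // Leader sends entries with prevLogIndex=19047: not in the log.
        System.out.println(follower.matches(19047, 4)); // prints false
        // Once the leader backs up to the follower's lastIndex, it matches.
        System.out.println(follower.matches(17875, 4)); // prints true
    }
}
```

Each rejected round trip is one extra leader-to-follower message, which is why a follower that is ~1200 entries behind can take on the order of a thousand messages to resync when the leader backs up one step at a time.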
_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev