On Tue, Mar 3, 2020 at 2:37 PM Ondrej Zajicek wrote:
> On Mon, Mar 02, 2020 at 05:35:01PM +, Neil Jerram wrote:
> > On Sat, Feb 29, 2020 at 1:05 PM Neil Jerram wrote:
> >
> > > I am struggling to understand and ideally eliminate an unwanted flap
> > > (i.e. delete and re-add) of an IPv6 route on node M, when a neighbouring
> > > node R restarts, and R is configured to advertise that IPv6 route
> > > statically.
> [...]
> > >
> >
> > FYI I have just tried the same test again with DBG(...) statements
> > compiled in, in the hope that that might reveal strange timing in the
> > static protocol processing. Unfortunately I don't think it revealed
> > anything suspicious. There is a line with
>
> Hi
>
> I would guess that the issue is more in node M than in node R. The fact
> that it happens only for IPv6 can be explained by implementation of
> Kernel protocol, which uses replace operation for IPv4 (since 2.0.5), but
> only remove/add for IPv6. Technically, the kernel protocol should detect
> that the new route is the same as old one and avoid pushing it to the
> kernel, but perhaps for some reason it pushes it anyway and causes the
> flap.
>
> To debug this issue, it would be useful to enable
> 'debug { events, routes }' on BGP and Kernel on node M. One could see if
> there are any withdraws, or just updates.
>
Thanks Ondrej. Here is a case where node R was killed (-9) at 15:50:03 and
restarted at 15:50:08. Node M saw this route flap at 15:50:11:
[2020-03-04T15:50:11.257106] Deleted fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122 via 2001:20::2 dev eth0 proto bird metric 1024 pref medium
[2020-03-04T15:50:11.257459] fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122 via 2001:20::2 dev eth0 proto bird metric 1024 pref medium
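For anyone who wants to spot this kind of flap automatically across many test runs, here is a minimal Python sketch. It assumes each route event sits on a single line (with any mail wrapping undone) in the "[timestamp] [Deleted] prefix ..." format shown above. The function name find_flaps and the one-second window are hypothetical choices for illustration, not part of BIRD or any other tool.

```python
import re
from datetime import datetime, timedelta

# Matches "[<timestamp>] [Deleted ]<prefix> ..." as in the two lines above.
EVENT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<deleted>Deleted\s+)?(?P<prefix>\S+)"
)


def find_flaps(lines, window=timedelta(seconds=1)):
    """Return (prefix, deleted_at, readded_at) for every delete that is
    followed by an add of the same prefix within `window`."""
    pending = {}  # prefix -> timestamp of the latest unmatched delete
    flaps = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if not match:
            continue  # not a route event line
        ts = datetime.fromisoformat(match.group("ts"))
        prefix = match.group("prefix")
        if match.group("deleted"):
            pending[prefix] = ts
        elif prefix in pending:
            deleted_at = pending.pop(prefix)
            if ts - deleted_at <= window:
                flaps.append((prefix, deleted_at, ts))
    return flaps
```

Fed the two lines above, this should report a single flap for fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122, with the delete and re-add well under a millisecond apart.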
Here is the node M log with "debug all", for both v4 and v6, from when R
restarted until the route flap:
2020-03-04T15:50:09.246741514Z KRT: Scanning routing table
2020-03-04T15:50:09.246805309Z BGP: connect_timeout
2020-03-04T15:50:09.246820405Z BGP: Closing connection
2020-03-04T15:50:09.246830788Z BGP: Connecting
2020-03-04T15:50:09.246843075Z bird: Mesh_172_17_0_2: Connecting to 172.17.0.2 from local address 172.17.0.3
2020-03-04T15:50:09.246852834Z KRT: Got 2001:20::/64, type=1, oif=36, table=254, prid=2, proto=kernel1
2020-03-04T15:50:09.246861952Z KRT: Got fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122, type=1, oif=36, table=254, prid=12, proto=kernel1
2020-03-04T15:50:09.246871442Z KRT: Got fd00:10:244:0:586d:4461:e980:a280/128, type=1, oif=6, table=254, prid=3, proto=kernel1
2020-03-04T15:50:09.246879832Z Running filter `calico_kernel_programming'...done (2)
2020-03-04T15:50:09.246889555Z KRT: Got fd00:10:244:0:586d:4461:e980:a280/122, type=6, oif=1, table=254, prid=12, proto=kernel1
2020-03-04T15:50:09.246898969Z KRT: Got fd00:10:244:0:58fd:b191:5c13:9cc0/122, type=1, oif=36, table=254, prid=12, proto=kernel1
2020-03-04T15:50:09.246908792Z Running filter `calico_kernel_programming'...done (2)
2020-03-04T15:50:09.24691778Z KRT: Got fe80::/64, type=1, oif=36, table=254, prid=2, proto=kernel1
2020-03-04T15:50:09.246927197Z KRT: Ignoring route - strange class/scope
2020-03-04T15:50:09.246935966Z KRT: Got fe80::/64, type=1, oif=6, table=254, prid=2, proto=kernel1
2020-03-04T15:50:09.246944298Z KRT: Ignoring route - strange class/scope
2020-03-04T15:50:09.247009224Z KRT: Got ::1/128, type=2, oif=1, table=255, prid=2, proto=(none)
2020-03-04T15:50:09.247065042Z KRT: Ignoring route - unknown table 255
2020-03-04T15:50:09.247088778Z KRT: Got 2001:20::/128, type=4, oif=36, table=255, prid=2, proto=(none)
2020-03-04T15:50:09.247115256Z KRT: Ignoring route - unknown table 255
2020-03-04T15:50:09.247131555Z KRT: Got 2001:20::1/128, type=2, oif=36, table=255, prid=2, proto=(none)
2020-03-04T15:50:09.247145697Z KRT: Ignoring route - unknown table 255
2020-03-04T15:50:09.247159089Z KRT: Got fd00:10:96::ac35/128, type=2, oif=3, table=255, prid=2, proto=(none)
2020-03-04T15:50:09.24717877Z KRT: Ignoring route - unknown table 255
2020-03-04T15:50:09.247192816Z KRT: Got fe80::/128, type=4, oif=36, table=255, prid=2, proto=(none)
2020-03-04T15:50:09.247206392Z KRT: Ignoring route - unknown table 255
2020-03-04T15:50:09.247219343Z KRT: Got fe80::/128, type=4, oif=6, table=255, prid=2, proto=(none)
2020-03-04T15:50:09.247262729Z KRT: Ignoring route - unknown table 255
2020-03-04T15:50:09.247279256Z KRT: Got fe80::42:acff:fe11:3/128, type=2, oif=36, table=255, prid=2, proto=(none)
2020-03-04T15:50:09.247294287Z KRT: Ignoring route - unknown table 255
2020-03-04T15:50:09.247307585Z KRT: Got fe80::ecee:eeff:feee:/128, type=2, oif=6, table=255, prid=2, proto=(none)
2020-03-04T15:50:09.247322176Z KRT: Ignoring route - unknown table 255
2020-03-04T15:50:09.247335799Z KRT: Got ff00::/8, type=1, oif=36, table=255, prid=3, proto=(none)
2020-03-04T15:50:09.247349484Z KRT: Ignoring route - unknown table 255
2020-03-04T15:50:09.24736411Z KRT: Got ff00::/8, type=1, oif=2, table=255, prid=3, proto=(none)
2020-03-04T15:50:09.2473782