I am trying to understand, and ideally eliminate, an unwanted flap (i.e. a
delete immediately followed by a re-add) of an IPv6 route on node M when a
neighbouring node R restarts.  R is configured to advertise that IPv6 route
statically.

Here is the config on R to advertise the route:

protocol static {
   # IP blocks for this host.
   route fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122 blackhole;
}

BIRD6 on node R is killed (with -9) at 12:22:08, and restarts (with the -R
flag) at 12:22:14.

The flap (in the kernel routing table) is detected by running "ip -ts
monitor route" on the monitor node M.  It reports this at 12:22:17:

[2020-02-29T12:22:17.954263] Deleted fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122 via 2001:20::2 dev eth0 proto bird metric 1024 pref medium
[2020-02-29T12:22:17.954470] fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122 via 2001:20::2 dev eth0 proto bird metric 1024 pref medium
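
For anyone wanting to reproduce the check, the following is a minimal sketch of how such a flap could be picked out of the monitor output automatically.  It assumes each event arrives on a single line in the "[timestamp] [Deleted ]prefix ..." form shown above, and that the prefix is the first token after the optional "Deleted" (so it would not handle blackhole or unreachable routes).  The function name find_flaps and the one-second window are illustrative choices only.

```python
import re
from datetime import datetime, timedelta

# Matches "[<timestamp>] [Deleted ]<prefix> ..." as printed by "ip -ts monitor route".
# Assumption: the prefix is the first token after the optional "Deleted".
EVENT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<deleted>Deleted\s+)?(?P<prefix>\S+)"
)


def find_flaps(lines, window=timedelta(seconds=1)):
    """Return (prefix, deleted_at, readded_at) for every delete that is
    followed by an add of the same prefix within `window`."""
    pending = {}  # prefix -> time of the most recent unmatched delete
    flaps = []
    for line in lines:
        m = EVENT_RE.match(line.strip())
        if not m:
            continue  # not a route event line
        ts = datetime.fromisoformat(m.group("ts"))
        prefix = m.group("prefix")
        if m.group("deleted"):
            pending[prefix] = ts
        elif prefix in pending:
            deleted_at = pending.pop(prefix)
            if ts - deleted_at <= window:
                flaps.append((prefix, deleted_at, ts))
    return flaps


# The two events reported on M, each joined onto one line.
sample = [
    "[2020-02-29T12:22:17.954263] Deleted fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122 "
    "via 2001:20::2 dev eth0 proto bird metric 1024 pref medium",
    "[2020-02-29T12:22:17.954470] fd00:10:244:0:1cc0:b1ac:ad47:e7c0/122 "
    "via 2001:20::2 dev eth0 proto bird metric 1024 pref medium",
]

for prefix, deleted_at, readded_at in find_flaps(sample):
    gap_us = (readded_at - deleted_at) // timedelta(microseconds=1)
    print(f"{prefix}: deleted and re-added within {gap_us} microseconds")
```

In practice the lines could be read from the output of "ip -ts monitor route" instead of the hard-coded sample list.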

Here is the BIRD6 log on R, from when it restarted, until it reached a
steady state.  R's peering to M is "Mesh_2001_20__1".

2020-02-29T12:22:14.214961381Z bird: device1: Initializing
2020-02-29T12:22:14.215018144Z bird: direct1: Initializing
2020-02-29T12:22:14.215035765Z bird: Mesh_2001_20__8: Initializing
2020-02-29T12:22:14.215047665Z bird: Mesh_2001_20__1: Initializing
2020-02-29T12:22:14.215057649Z bird: Mesh_2001_20__3: Initializing
2020-02-29T12:22:14.215144477Z bird: device1: Starting
2020-02-29T12:22:14.215369797Z bird: device1: Connected to table master
2020-02-29T12:22:14.215402544Z bird: device1: State changed to feed
2020-02-29T12:22:14.215431056Z bird: direct1: Starting
2020-02-29T12:22:14.215439899Z bird: direct1: Connected to table master
2020-02-29T12:22:14.215447877Z bird: direct1: State changed to feed
2020-02-29T12:22:14.215456544Z bird: Mesh_2001_20__8: Starting
2020-02-29T12:22:14.215523493Z bird: Mesh_2001_20__3: Starting
2020-02-29T12:22:14.215716006Z bird: Graceful restart started
2020-02-29T12:22:14.215742864Z bird: Started
2020-02-29T12:22:14.215757668Z bird: direct1: State changed to up
2020-02-29T12:22:14.215770554Z bird: device1: State changed to up
2020-02-29T12:22:14.948167365Z bird: Mesh_2001_20__3: Connected to table master
2020-02-29T12:22:14.948234648Z bird: Mesh_2001_20__3: State changed to wait
2020-02-29T12:22:15.951638881Z bird: Mesh_2001_20__8: Connected to table master
2020-02-29T12:22:15.951703218Z bird: Mesh_2001_20__8: State changed to wait
2020-02-29T12:22:16.953066987Z bird: Mesh_2001_20__1: Connected to table master
2020-02-29T12:22:16.953132676Z bird: Mesh_2001_20__1: State changed to wait
2020-02-29T12:22:16.953154658Z bird: Graceful restart done
2020-02-29T12:22:16.95316785Z bird: Mesh_2001_20__8: State changed to feed
2020-02-29T12:22:16.953180658Z bird: Mesh_2001_20__1: State changed to feed
2020-02-29T12:22:16.953194942Z bird: Mesh_2001_20__3: State changed to feed
2020-02-29T12:22:17.953827137Z bird: Mesh_2001_20__8: State changed to up
2020-02-29T12:22:17.953880171Z bird: Mesh_2001_20__1: State changed to up
2020-02-29T12:22:17.953894204Z bird: Mesh_2001_20__3: State changed to up

(There could be some errors here, because I've manually separated logs from
both BIRD and BIRD6 that were going into the same file, and they both log
with prefix "bird:".)

And here is the BIRD6 log on M, from when R was killed until M reached a
steady state following R's restart.  M's peering to R is "Mesh_2001_20__2".

2020-02-29T12:22:08.943087384Z bird: Mesh_2001_20__2: State changed to start
2020-02-29T12:22:16.952861163Z bird: Mesh_2001_20__2: State changed to feed
2020-02-29T12:22:16.952950901Z bird: Mesh_2001_20__2: State changed to up
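
To line the two logs up against the kernel route events, a small script along the following lines may help.  It is only a sketch and rests on two assumptions that the logs do not confirm: that the clocks on R and M are synchronized, and that the "ip -ts monitor" timestamps are in UTC like the BIRD log timestamps.  The helper names to_epoch and offsets_from and the event labels are made up for illustration; the timestamps are copied from the logs and monitor output in this mail.

```python
from datetime import datetime, timezone
from decimal import Decimal


def to_epoch(ts):
    """Convert an ISO timestamp with up to 9 fractional digits (optional
    trailing 'Z') to exact Decimal seconds since the epoch.
    Assumption: timestamps without a zone are UTC."""
    whole, _, frac = ts.rstrip("Z").partition(".")
    base = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    return Decimal(int(base.timestamp())) + Decimal("0." + (frac or "0"))


def offsets_from(events, reference_label):
    """Return (label, offset_in_seconds) pairs relative to the reference
    event, sorted by time."""
    times = {label: to_epoch(ts) for label, ts in events}
    ref = times[reference_label]
    return sorted(((label, t - ref) for label, t in times.items()),
                  key=lambda pair: pair[1])


# Timestamps copied from the logs and the monitor output in this mail.
EVENTS = [
    ("M: Mesh_2001_20__2 up",       "2020-02-29T12:22:16.952950901Z"),
    ("R: Mesh_2001_20__1 wait",     "2020-02-29T12:22:16.953132676Z"),
    ("R: Graceful restart done",    "2020-02-29T12:22:16.953154658Z"),
    ("R: Mesh_2001_20__1 up",       "2020-02-29T12:22:17.953880171Z"),
    ("M: kernel route deleted",     "2020-02-29T12:22:17.954263"),
    ("M: kernel route re-added",    "2020-02-29T12:22:17.954470"),
]

for label, offset in offsets_from(EVENTS, "M: kernel route deleted"):
    print(f"{offset * 1000:+12.3f} ms  {label}")
```

If the clocks do agree, this places the kernel delete on M about 0.4 ms after R logs Mesh_2001_20__1 changing to "up", and about one second after M logs its own session to R as "up".  That is only a timing correlation and does not by itself show what triggers the delete.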

Can anyone help to explain why I get the IPv6 route flap on node M, and
whether there is a way of eliminating it?  The peering config on R is:

# Template for all BGP clients
template bgp bgp_template {
  debug { states };
  description "Connection to BGP peer";
  local as 64512;
  multihop;
  gateway recursive;
  import all;
  export filter calico_export_to_bgp_peers;
  source address 2001:20::2;
  add paths on;
  graceful restart;
  connect delay time 2;
  connect retry time 5;
  error wait time 5,30;
}
protocol bgp Mesh_2001_20__1 from bgp_template {
  neighbor 2001:20::1 as 64512;
}

The config on M is the same, but with ::1 and ::2 swapped, and __2 instead
of __1.

By the way, my setup also has exactly parallel IPv4 config and routes, and
I reliably do _not_ see a similar flap for the corresponding IPv4 route.

Many thanks,
    Neil
