This happens because the bridge interface comes up with carrier
(specifically, with the +LOWER_UP interface flag) and begins setting its
addresses/routes, but then briefly loses carrier.  I'm not sure why the
bridge comes up with +LOWER_UP when ens3 appears *not* to be up yet:

Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Flags change: +UP 
+LOWER_UP +RUNNING
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Link UP
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Gained carrier
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Setting addresses
...
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Remembering updated 
address: 192.168.122.105/24 (valid forever)
...
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Flags change: -LOWER_UP
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Lost carrier
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Removing address 
192.168.122.105
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: State is configuring, 
dropping config
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: ens3: Joined netdev
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: ens3: Bringing link up
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Flags change: -RUNNING
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Remembering updated 
address: 192.168.122.105/24 (valid forever)
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Addresses set
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Configuring route: dst: 
n/a, src: n/a, gw: 192.168.122.1, prefsrc: n/a, scope: global, table: main, 
proto: static, type: unicast
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Setting routes
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Forgetting address: 
192.168.122.105/24 (valid forever)
...
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: ens3: Flags change: +UP 
+LOWER_UP +RUNNING
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: ens3: Link UP
...
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: ens3: Gained carrier
...
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Flags change: +LOWER_UP 
+RUNNING
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Gained carrier
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Setting addresses
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Could not set route: 
Nexthop has invalid gateway. Network is unreachable
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Failed
...
Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Remembering updated 
address: 192.168.122.105/24 (valid forever)


Note that on carrier loss, networkd begins removing the address, but doesn't 
actually complete address removal until after carrier is back up, which then 
causes setting the network route to fail, right before the address is added 
again.
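
To check other logs for the same pattern, here is a minimal Python sketch.
It only assumes the message strings visible in the excerpts above
("Setting addresses", "Lost carrier", "Routes set") and unwrapped journal
lines; the function name and the regex are mine, not anything from
networkd itself.

```python
import re

# Matches the "<iface>: <message>" tail of a systemd-networkd journal line.
# Assumes one log entry per line (i.e. not re-wrapped by a mail client).
LINE_RE = re.compile(r"systemd-networkd\[\d+\]: (?P<iface>[^:\s]+): (?P<msg>.*)$")


def carrier_lost_during_config(lines, iface):
    """Return True if `iface` logs "Lost carrier" after "Setting addresses"
    but before the matching "Routes set"."""
    configuring = False
    for line in lines:
        m = LINE_RE.search(line)
        if not m or m.group("iface") != iface:
            continue
        msg = m.group("msg").strip()
        if msg == "Setting addresses":
            configuring = True
        elif msg == "Routes set":
            configuring = False
        elif msg == "Lost carrier" and configuring:
            return True
    return False


if __name__ == "__main__":
    sample = [
        "Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Gained carrier",
        "Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Setting addresses",
        "Apr 27 10:46:56 lp1860926-f systemd-networkd[643]: br0: Lost carrier",
    ]
    print(carrier_lost_during_config(sample, "br0"))  # True
```

This is only a heuristic for spotting the flap in a debug log; it says
nothing about why the carrier dropped.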


With upstream systemd, this doesn't happen, because the bridge interface
comes up without carrier (again, specifically, without the +LOWER_UP
interface flag):

Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Flags change: +UP
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Link UP
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Remembering updated 
address: 192.168.122.231/24 (valid forever)
...
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Addresses set
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Configuring route: dst: 
n/a, src: n/a, gw: 192.168.122.1, prefsrc: n/a, scope: global, table: main, 
proto: static, type: unicast
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Setting routes
...
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: ens3: Joined netdev
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: ens3: Bringing link up
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Received remembered 
route: dst: n/a, src: n/a, gw: 192.168.122.1, prefsrc: n/a, scope: global, 
table: main, proto: static, type: unicast
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Routes set
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: State changed: 
configuring -> configured
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: ens3: Flags change: +UP 
+LOWER_UP +RUNNING
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: ens3: Link UP
...
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: ens3: Gained carrier
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: ens3: State changed: 
configuring -> configured
...
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Flags change: +LOWER_UP 
+RUNNING
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Gained carrier
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: State changed: 
configured -> configuring
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Setting addresses
...
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Remembering updated 
address: 192.168.122.231/24 (valid forever)
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Addresses set
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Configuring route: dst: 
n/a, src: n/a, gw: 192.168.122.1, prefsrc: n/a, scope: global, table: main, 
proto: static, type: unicast
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Setting routes
Apr 27 10:23:05 lp1860926-u systemd-networkd[600]: br0: Routes set


Note that in both situations, the bridge interface is configured twice.  This
is because netplan adds ConfigureWithoutCarrier=yes to the bridge
configuration, which causes networkd to start configuring the bridge before
it has carrier.  However, netplan doesn't also set IgnoreCarrierLoss=yes, so
networkd removes the bridge configuration when it detects carrier loss.
Because that loss happens in the middle of the initial configuration, the
removal races with the attempt to set the bridge up again.

If the networkd configuration is changed to remove
ConfigureWithoutCarrier=yes (or set it to no/false), or if the networkd
configuration is changed to add IgnoreCarrierLoss=yes (and leave
ConfigureWithoutCarrier=yes), then this bug *appears* fixed, at least in
my testing.  However, I'm still concerned about this race and I haven't
had time to fully analyze the situation of carrier drop during
configuration; I feel like this is still a potential issue regardless of
the ConfigureWithoutCarrier and/or IgnoreCarrierLoss settings.
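
For anyone who wants to try the second workaround, a drop-in along these
lines should do it.  This is a sketch, not what netplan generates: the
drop-in path is my assumption (it presumes netplan wrote
10-netplan-br0.network for the bridge), so adjust it to match whatever is
in /run/systemd/network on your system.

```ini
# Assumed path: /etc/systemd/network/10-netplan-br0.network.d/override.conf
# Keep the bridge's addresses/routes in place when carrier drops,
# instead of tearing them down and reconfiguring.
[Network]
ConfigureWithoutCarrier=yes
IgnoreCarrierLoss=yes
```

Restart systemd-networkd (or reboot) afterwards so the setting takes
effect.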

For now, I'll build a test systemd package that defaults IgnoreCarrierLoss=
to the value of ConfigureWithoutCarrier= (and open an upstream bug) to see
if that also fixes this for others affected.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1860926

Title:
  Ubuntu 20.04  Systemd fails to configure bridged network

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1860926/+subscriptions
