** Changed in: linux (Ubuntu)
       Status: Confirmed => In Progress

** Changed in: linux (Ubuntu)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2034099

Title:
  Bridge MTU not applied at first boot, only after netplan
  apply/networkctl reload

Status in linux package in Ubuntu:
  In Progress

Bug description:
  If you set a Bridge's MTU to a specific value using netplan/networkd
  which is different to the MTU of the interface you add, it does not
  appear to be applied correctly during boot.

  However the MTU switches to the correct value when you re-apply the
  configuration with "netplan apply" (or "systemctl restart
  systemd-networkd"; "networkctl reload").

  
  Additionally:
  - It does apply correctly when you create the bridge for the first time
    (after boot); I did not debug why.
  - When deploying the same configuration with Juju & MAAS it works after
    first boot (likely because Juju modifies the netplan config and runs
    "netplan apply") but still fails on reboot.

  == Cause ==

  Linux bridges, by default, "auto-tune" the bridge MTU to MIN(mtu) of
  all member ports. This is updated each time an interface is added or
  removed (except with vlan aware mode, which uses MAX(mtu) instead).
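
  As a rough illustration of the auto-tuning rule described above, here
  is a toy model in shell. It is not kernel code; the function name
  bridge_auto_mtu and its "min"/"max" mode argument are made up for this
  sketch.

```shell
#!/bin/sh
# Toy model of bridge MTU auto-tuning; NOT the kernel implementation.
# Usage: bridge_auto_mtu MODE MTU...
#   MODE=min  default bridge: smallest member port MTU wins
#   MODE=max  vlan-aware bridge: largest member port MTU wins
bridge_auto_mtu() {
    mode=$1
    shift
    result=$1
    for mtu in "$@"; do
        case $mode in
            min) if [ "$mtu" -lt "$result" ]; then result=$mtu; fi ;;
            max) if [ "$mtu" -gt "$result" ]; then result=$mtu; fi ;;
        esac
    done
    echo "$result"
}
```

  With a single member port at MTU 9000, "bridge_auto_mtu min 9000"
  prints 9000, whatever MTU the bridge itself was created with.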

  If the user changes the MTU of the bridge explicitly, it disables this
  behaviour since 804b854d374e ("net: bridge: disable bridge MTU auto
  tuning if it was set manually") which landed in v4.17.

  This is the expected behaviour. Unfortunately, that code has a bug: it
  only works if the bridge MTU is changed after the bridge was created.
  If the bridge MTU is set during bridge creation by passing IFLA_MTU
  (as systemd-networkd and network-manager do), auto-tuning is not
  disabled. Subsequently, when an interface is added to the bridge, the
  bridge MTU is auto-tuned to match.

  So while systemd-networkd is setting this MTU initially, it is almost
  immediately changed by the auto-tuning when the member port is added.
  If you reload, it notices the MTU doesn't match and changes it again.
  This change then triggers the relevant code to disable auto-tuning -
  so it won't change again after that.
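
  The two code paths described above can also be compared by hand with
  iproute2. This is a minimal sketch and not part of the original
  report: the interface names (brtest0, brtest1, dmtest0, dmtest1) and
  the DRY_RUN switch are hypothetical, and a dummy interface stands in
  for the physical member port.

```shell
#!/bin/sh
# Sketch: bridge MTU set at creation vs. set after creation.
# All interface names here are hypothetical test names.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Case A: MTU passed in the request that creates the bridge (IFLA_MTU).
mtu_at_creation() {
    run ip link add name brtest0 mtu 1500 type bridge
    run ip link add name dmtest0 type dummy
    run ip link set dev dmtest0 mtu 9000
    run ip link set dev dmtest0 master brtest0
    run ip -o link show dev brtest0
}

# Case B: bridge created first, MTU changed afterwards (as ifupdown does).
mtu_after_creation() {
    run ip link add name brtest1 type bridge
    run ip link set dev brtest1 mtu 1500
    run ip link add name dmtest1 type dummy
    run ip link set dev dmtest1 mtu 9000
    run ip link set dev dmtest1 master brtest1
    run ip -o link show dev brtest1
}

# Remove the test interfaces again.
cleanup() {
    for dev in brtest0 brtest1 dmtest0 dmtest1; do
        run ip link del dev "$dev"
    done
}
```

  Run "mtu_at_creation; mtu_after_creation; cleanup" as root on a
  scratch machine. Going by the analysis above, an affected kernel would
  be expected to show brtest0 auto-tuned to 9000 while brtest1 stays at
  1500, but this has not been verified here.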

  To verify this, you can examine the state of BROPT_MTU_SET_BY_USER
  with this drgn script:
https://gist.github.com/lathiat/7a3cace35bd28413822c362f76ad2f1a

  You can also capture the MTU being set at creation with this bpftrace
  script:
https://gist.github.com/lathiat/1624723ceef8d17239ae450f03c8eb3b

  It can also be helpful to set the following in a drop-in override using
  "systemctl edit systemd-networkd":
  [Service]
  Environment=SYSTEMD_LOG_LEVEL=debug
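
  If an interactive editor is inconvenient, the same drop-in can be
  written from a script. This is a sketch: the function name is made up,
  and the default path is the override.conf location that "systemctl
  edit" normally uses for this unit.

```shell
#!/bin/sh
# Sketch: write the debug-logging drop-in non-interactively.
# Pass a directory as $1 to try it out somewhere harmless; the default
# is the drop-in directory used by "systemctl edit systemd-networkd".
write_networkd_debug_dropin() {
    dir=${1:-/etc/systemd/system/systemd-networkd.service.d}
    mkdir -p "$dir" || return 1
    cat > "$dir/override.conf" <<'EOF'
[Service]
Environment=SYSTEMD_LOG_LEVEL=debug
EOF
}
```

  After writing the file as root, run "systemctl daemon-reload" and then
  reboot rather than restarting systemd-networkd, since a restart
  re-applies the configuration and hides the boot-time behaviour. The
  debug output can then be read with "journalctl -b -u systemd-networkd".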

  == Use Case ==

  The specific use case that found this bug was wanting to have a VLAN
  with MTU 9000 and the default untagged VLAN interface with MTU 1500.
  For a VLAN sub-interface (e.g. eth0.42) to have MTU 9000, the parent
  interface (eth0) must also have MTU 9000, even if that's not what you
  actually wanted.

  One way to work around this is to create eth0 and eth0.42 with MTU
  9000 but then create br0 with MTU 1500 containing eth0. Because we are
  intentionally setting the MTU of br0 to a value other than the member
  interfaces the bug is triggered.
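
  A netplan configuration for this use case might look like the sketch
  below. It is an assumption based on the description above, not the
  reporter's actual file: the function name and the VLAN ID 42 (taken
  from the eth0.42 example) are illustrative only.

```shell
#!/bin/sh
# Sketch: write a netplan file for the VLAN-9000 / bridge-1500 use case.
# The layout is illustrative; adjust names and IDs to your setup.
# Usage: write_usecase_netplan /path/to/file.yaml
write_usecase_netplan() {
    out=${1:?usage: write_usecase_netplan FILE}
    cat > "$out" <<'EOF'
network:
  version: 2
  ethernets:
    eth0:
      mtu: 9000
  vlans:
    eth0.42:
      id: 42
      link: eth0
      mtu: 9000
  bridges:
    br0:
      interfaces:
      - eth0
      mtu: 1500
EOF
}
```

  Here br0 is deliberately given a lower MTU than its only member port,
  which is the condition that exposes the bug.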

  
  However, this also happens in other cases, and the bug has often gone
  unnoticed for one of the following reasons:

  (a) If the same MTU is desired on the bridge and member ports, setting
  the MTU of the member ports to match results in auto-tuning still
  happening, but to the desired value.

  (b) When member ports are created by other tools (e.g. virtualisation
  software such as LXD), they often clone the MTU of the bridge to the
  interface before adding it, specifically to avoid this behaviour. Not
  all software does this, however, in which case the bridge MTU gets
  reduced (e.g. to 1500) when a higher value is desired.

  (c) ifupdown specifically sets the MTU after creation (using "ip link
  set X mtu N"), so it does not experience this bug, unlike
  networkd/NetworkManager, which set the MTU during creation with
  IFLA_MTU.

  
  == Test Case ==

  # Notes:
  # - This requires a multi-core VM to reproduce reliably. Single-core VMs
  #   frequently fail to reproduce it.
  # - Requires a spare interface separate from the primary interface; in the
  #   example we use eth2.
  # - You can take the netplan-generated configs from /run/systemd/network
  #   and use them directly in /etc/systemd/network and apply with
  #   "networkctl reload". You get the same results.

  # 1. Create simple netplan configuration

  # Ensure that no eth2 configuration exists in the other files; if it
  # does, remove it.
  grep eth2 /etc/netplan/ -Ri

  cat > /etc/netplan/60-br0.yaml <<EOF
  network:
    bridges:
      br0:
        addresses:
        - 172.16.1.1/24
        interfaces:
        - eth2
        mtu: 1500
    ethernets:
      eth2:
        mtu: 9000
    version: 2
  EOF

  # 2. Apply configuration the first time
  # For some racy reason, this works when the configuration is first
  # applied, but not during boot

  netplan apply

  grep . /sys/class/net/{br0,eth2}/mtu

  # Result: Correct - br0 has MTU 1500
  # /sys/class/net/br0/mtu:1500
  # /sys/class/net/eth2/mtu:9000

  
  # 3. Reboot
  # Configuration is applied incorrectly on first boot

  reboot

  grep . /sys/class/net/{br0,eth2}/mtu

  # Result: Incorrect - br0 has MTU 9000 instead of 1500
  # /sys/class/net/br0/mtu:9000
  # /sys/class/net/eth2/mtu:9000

  # 4. Netplan apply after boot
  # Configuration is fixed when re-applying the config

  netplan apply

  grep . /sys/class/net/{br0,eth2}/mtu

  # Result: Correct - br0 has MTU 1500
  # /sys/class/net/br0/mtu:1500
  # /sys/class/net/eth2/mtu:9000
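
  To make the comparison in steps 2 to 4 less error-prone, the sysfs
  values can be checked against the configured ones with a small helper.
  This is a sketch: check_mtu and the SYSFS_NET override are made-up
  names, the latter existing only so the function can be tried against a
  fake directory tree.

```shell
#!/bin/sh
# Sketch: compare an interface's live MTU with the configured value.
# SYSFS_NET is a hypothetical override; it defaults to /sys/class/net.
# Usage: check_mtu IFACE EXPECTED_MTU
check_mtu() {
    iface=$1
    expected=$2
    file="${SYSFS_NET:-/sys/class/net}/$iface/mtu"
    if [ ! -r "$file" ]; then
        echo "$iface: cannot read $file" >&2
        return 2
    fi
    actual=$(cat "$file")
    if [ "$actual" = "$expected" ]; then
        echo "$iface: OK (mtu $actual)"
    else
        echo "$iface: MISMATCH (mtu $actual, expected $expected)"
        return 1
    fi
}
```

  For the configuration in step 1, run "check_mtu br0 1500" and
  "check_mtu eth2 9000" after each step; a non-zero exit status from the
  first call after a reboot indicates the bug.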

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2034099/+subscriptions

