Attached is a debdiff for systemd on Bionic which fixes this bug.

** Description changed:

- Two servers today that updated systemd to "systemd 237-3ubuntu10.54" 
- https://ubuntu.com/security/notices/USN-5583-1
+ [Impact]
  
- could not resolve dns anymore.
- no dns servers, normally set through dhcp.
+ A widespread outage was caused on Azure instances earlier today, when
+ systemd 237-3ubuntu10.54 was published to the bionic-security pocket.
+ Instances could no longer resolve DNS queries, breaking networking.
  
- Ubuntu 18.04
+ For affected users, the following workarounds are available. Use whichever
+ is most convenient:
+ - Reboot your instances
+ - or -
+ - Issue "udevadm trigger -cadd -yeth0 && systemctl restart systemd-networkd"
+   as root
  
- Temp fix.
-  1. Edit /etc/systemd/resolved.conf
-  1. Add/Uncomment # FallbackDNS=168.63.129.16
-  1. Restart systemd-resolved sudo systemctl restart systemd-resolved.service
-  1. Confirm dns working with systemd-resolve google.com
+ The trigger was found to be open-vm-tools issuing "udevadm trigger".
+ Azure has a specific netplan setup that uses the `driver` match to set
+ up networking. If a udevadm trigger is executed, the KV pair that
+ contains this info is lost. Next time netplan is executed, the server
+ loses its DNS information.
+ 
+ This is the same as bug 1902960 experienced on Focal two years ago.
+ 
+ The root cause was found to be a bug in systemd: if we receive a
+ "Remove" action from a change uevent, we still need to run
+ net_setup_link(), but we need to skip device rename and keep the old name.
+ 
+ [Testcase]
+ 
+ Start an instance up on Azure, any type. Simply issue udevadm trigger
+ and restart systemd-networkd:
+ 
+ $ ping google.com
+ PING google.com (172.253.62.102) 56(84) bytes of data.
+ 64 bytes from bc-in-f102.1e100.net (172.253.62.102): icmp_seq=1 ttl=56 time=1.85 ms
+ $ sudo udevadm trigger && sudo systemctl restart systemd-networkd
+ $ ping google.com
+ ping: google.com: Temporary failure in name resolution
+ 
+ To fix a broken instance, you can run:
+ 
+ $ sudo udevadm trigger -cadd -yeth0 && sudo systemctl restart systemd-networkd
+ 
+ and then install the test packages, which are available in the
+ following PPA:
+ 
+ https://launchpad.net/~mruffell/+archive/ubuntu/sf343528-test
+ 
+ If you install them, the issue should no longer occur.
+ 
+ [Where problems could occur]
+ 
+ If a regression were to occur, it would affect systemd-udevd processing
+ 'change' events from network devices, which could lead to network
+ outages. Since this would happen when systemd-networkd is restarted on
+ postinstall, a regression would cause widespread outages, because this SRU
+ is targeted at the security pocket, from which unattended-upgrades
+ installs automatically.
+ 
+ Side effects could include incorrect udevd device properties.
+ 
+ It is very important that this SRU is well tested before release.
+ 
+ [Other info]
+ 
+ This was fixed in systemd 247 with the following commit:
+ 
+ commit e0e789c1e97e2cdf1cafe0c6b7d7e43fa054f151
+ Author: Yu Watanabe <watanabe.yu+git...@gmail.com>
+ Date: Mon, 14 Sep 2020 15:21:04 +0900
+ Subject: udev: re-assign ID_NET_DRIVER=, ID_NET_LINK_FILE=, ID_NET_NAME= properties on non-'add' uevent
+ Link: https://github.com/systemd/systemd/commit/e0e789c1e97e2cdf1cafe0c6b7d7e43fa054f151
+ 
+ This was backported to Focal's systemd 245.4-4ubuntu3.4 in bug 1902960
+ two years ago. Focal required a heavy backport, which was performed by
+ Dan Streetman. Focal's backport can be found in
+ d/p/lp1902960-udev-re-assign-ID_NET_DRIVER-ID_NET_LINK_FILE-ID_NET.patch,
+ or in the pastebin below:
+ 
+ https://paste.ubuntu.com/p/K5k7bGt3Wx/
+ 
+ The changes between the Focal backport and the Bionic backport are:
+ 
+ - We use udev_device_get_action() instead of device_get_action()
+ - device_action_from_string() is used to get to enum DeviceAction
+ - We return 0 from the "if (a == DEVICE_ACTION_MOVE) " hunk instead of
+   "goto no_rename"
+ - log_device_* has been changed to log_*.
+ 
+ See attached debdiff for Bionic backport.

** Summary changed:

- Update to systemd 237-3ubuntu10.54 broke dns
+ systemd-udevd: Run net_setup_link on 'change' uevents to prevent DNS outages on Azure

** Patch added: "Debdiff for systemd on Bionic"
   
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1988119/+attachment/5612617/+files/lp1988119_bionic.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1988119

Title:
  systemd-udevd: Run net_setup_link on 'change' uevents to prevent DNS
  outages on Azure

Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Bionic:
  In Progress

Bug description:
  [Impact]

  A widespread outage was caused on Azure instances earlier today, when
  systemd 237-3ubuntu10.54 was published to the bionic-security pocket.
  Instances could no longer resolve DNS queries, breaking networking.

  For affected users, the following workarounds are available. Use whichever
  is most convenient:
  - Reboot your instances
  - or -
  - Issue "udevadm trigger -cadd -yeth0 && systemctl restart systemd-networkd"
    as root

  The trigger was found to be open-vm-tools issuing "udevadm trigger".
  Azure has a specific netplan setup that uses the `driver` match to set
  up networking. If a udevadm trigger is executed, the KV pair that
  contains this info is lost. Next time netplan is executed, the server
  loses its DNS information.
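
  One way to check whether an interface has lost this information is to
  look for the property in the udev database. The sketch below assumes
  that the lost KV pair is the ID_NET_DRIVER property named in the
  upstream commit referenced under [Other info]. has_net_driver is a
  hypothetical helper name, and eth0 is only an example interface.

```shell
#!/bin/sh
# Hypothetical helper: reads udev database text on stdin and succeeds
# only if it contains a non-empty ID_NET_DRIVER property. It accepts
# both the default "E: KEY=value" format of "udevadm info" and the bare
# "KEY=value" format of "udevadm info --query=property".
has_net_driver() {
    grep -Eq '^(E: )?ID_NET_DRIVER=.+'
}

# Example on a live system (interface name is an assumption):
#   udevadm info /sys/class/net/eth0 | has_net_driver \
#       && echo "ID_NET_DRIVER present" \
#       || echo "ID_NET_DRIVER missing"
```

  If the property is reported missing after a "udevadm trigger", the
  instance is likely in the state described above.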

  This is the same as bug 1902960 experienced on Focal two years ago.

  The root cause was found to be a bug in systemd: if we receive a
  "Remove" action from a change uevent, we still need to run
  net_setup_link(), but we need to skip device rename and keep the old name.

  [Testcase]

  Start an instance up on Azure, any type. Simply issue udevadm trigger
  and restart systemd-networkd:

  $ ping google.com
  PING google.com (172.253.62.102) 56(84) bytes of data.
  64 bytes from bc-in-f102.1e100.net (172.253.62.102): icmp_seq=1 ttl=56 time=1.85 ms
  $ sudo udevadm trigger && sudo systemctl restart systemd-networkd
  $ ping google.com
  ping: google.com: Temporary failure in name resolution

  To fix a broken instance, you can run:

  $ sudo udevadm trigger -cadd -yeth0 && sudo systemctl restart systemd-networkd

  and then install the test packages, which are available in the
  following PPA:

  https://launchpad.net/~mruffell/+archive/ubuntu/sf343528-test

  If you install them, the issue should no longer occur.
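
  Installing the test packages could look roughly like the sketch below.
  The PPA short name is derived from the URL above. The function name
  install_test_packages is hypothetical, and the choice of the systemd
  and udev binary packages is an assumption about what needs upgrading.

```shell
#!/bin/sh
# PPA short name derived from the Launchpad URL above.
PPA="ppa:mruffell/sf343528-test"

# Hypothetical wrapper; nothing runs until the function is called.
install_test_packages() {
    sudo add-apt-repository -y "$PPA" &&
    sudo apt-get update &&
    # Assumption: upgrading systemd and udev pulls in the other binary
    # packages built from the same source package.
    sudo apt-get install -y systemd udev &&
    # Show which version is now installed and where it came from.
    apt-cache policy systemd
}
```

  After calling install_test_packages, rerunning the testcase above is a
  reasonable way to confirm that name resolution survives a "udevadm
  trigger".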

  [Where problems could occur]

  If a regression were to occur, it would affect systemd-udevd
  processing 'change' events from network devices, which could lead to
  network outages. Since this would happen when systemd-networkd is
  restarted on postinstall, a regression would cause widespread outages,
  because this SRU is targeted at the security pocket, from which
  unattended-upgrades installs automatically.

  Side effects could include incorrect udevd device properties.

  It is very important that this SRU is well tested before release.

  [Other info]

  This was fixed in systemd 247 with the following commit:

  commit e0e789c1e97e2cdf1cafe0c6b7d7e43fa054f151
  Author: Yu Watanabe <watanabe.yu+git...@gmail.com>
  Date: Mon, 14 Sep 2020 15:21:04 +0900
  Subject: udev: re-assign ID_NET_DRIVER=, ID_NET_LINK_FILE=, ID_NET_NAME= properties on non-'add' uevent
  Link: https://github.com/systemd/systemd/commit/e0e789c1e97e2cdf1cafe0c6b7d7e43fa054f151

  This was backported to Focal's systemd 245.4-4ubuntu3.4 in bug 1902960
  two years ago. Focal required a heavy backport, which was performed by
  Dan Streetman. Focal's backport can be found in
  d/p/lp1902960-udev-re-assign-ID_NET_DRIVER-ID_NET_LINK_FILE-ID_NET.patch,
  or in the pastebin below:

  https://paste.ubuntu.com/p/K5k7bGt3Wx/

  The changes between the Focal backport and the Bionic backport are:

  - We use udev_device_get_action() instead of device_get_action()
  - device_action_from_string() is used to get to enum DeviceAction
  - We return 0 from the "if (a == DEVICE_ACTION_MOVE) " hunk instead of
    "goto no_rename"
  - log_device_* has been changed to log_*.

  See attached debdiff for Bionic backport.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1988119/+subscriptions


-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
