Hi Ray/Julian,

> * NOTE: The final comment on the upstream GNOME bug claims that the fix
> is incomplete. However, it is possible that the running NetworkManager was
> not restarted (see Regression Potential notes above), which is why
> nm-dhcp-helper is falling back to Event.

This is not the case. Even today, one of the machines is showing the message from the wrapper which indicates that, according to the journal, the DHCP lease was not correctly applied:

➜ sjors@cuba ~ cat /tmp/nm-helper-retries.log
Tue Nov 14 07:23:07 CET 2017: needed 5 attempts to update NetworkManager (RENEW).
Tue Nov 14 09:17:45 CET 2017: needed 5 attempts to update NetworkManager (RENEW).
Tue Nov 14 10:06:58 CET 2017: needed 4 attempts to update NetworkManager (RENEW).

This is even though the machine was rebooted yesterday, so the daemon was restarted:

➜ sjors@cuba ~ uptime
 23:12:16 up 1 day, 14:04, 3 users, load average: 0,59, 0,49, 0,47

And the machine is using the patched version of network-manager:

➜ sjors@cuba ~ apt-cache policy network-manager
network-manager:
  Installed: 1.2.6-0ubuntu0.16.04.1screenpoint1
  Candidate: 1.2.6-0ubuntu0.16.04.1screenpoint1
  Version table:
 *** 1.2.6-0ubuntu0.16.04.1screenpoint1 100
        100 /var/lib/dpkg/status
     1.2.6-0ubuntu0.16.04.1 500
        500 http://nl.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages

I have not investigated why this happens, though, since the wrapper script is an acceptable work-around. I will report back on whether we still see this problem with the updated Xenial packages -- there is always a chance I made an error somewhere.

-- 
You received this bug notification because you are a member of Desktop Packages, which is subscribed to network-manager in Ubuntu.
https://bugs.launchpad.net/bugs/1696415

Title:
  NetworkManager does not update IPv4 address lifetime even though DHCP lease was successfully renewed

Status in NetworkManager:
  Confirmed
Status in network-manager package in Ubuntu:
  Fix Released
Status in network-manager source package in Xenial:
  In Progress

Bug description:

SRU REQUEST: Debdiff (nm-dhcp-helper.debdiff) attached.

Fixed in current Ubuntu zesty and newer: Bionic uses NM 1.8.x. This bug was fixed upstream in 1.4.

[Impact]

 * nm-dhcp-helper sometimes fails to notify NetworkManager of a DHCP lease renewal due to a DBus race condition.

 * Upstream NetworkManager 1.4 fixes the race condition by changing nm-dhcp-helper's DBus notification from signal "Event" to method "Notify".

 * Original bug submitter backported NM 1.4's nm-dhcp-helper notification fix to NM 1.2. This SRU request applies that backported patch to Xenial's NM 1.2.x.

[Test Case]

 * Not reliably reproducible. Out of hundreds of machines, only a dozen or so fail to notify NetworkManager of a DHCP lease renewal about 30-50% of the time. (i.e. It's always the same handful of machines that fail.)

 * All such machines with the patched packages have been fine for weeks, over many dozens of lease renewals.

[Regression Potential]

 * The patch changes both nm-dhcp-helper and NetworkManager itself. As soon as the new packages are unpacked, the new nm-dhcp-helper will be used on DHCP lease renewals, with the new Notify mechanism. Since the running, old NetworkManager is still expecting Event notifications, the patched nm-dhcp-helper has fallback capability to Event.

 * Once NetworkManager is restarted and is running the patched version, it will have the new Notify support.

[Other Info]

 * Upstream bug w/ patch: https://bugzilla.gnome.org/show_bug.cgi?id=784636

 * RHEL bug with links to the 1.4 commits from which the patch was derived: https://bugzilla.redhat.com/show_bug.cgi?id=1373276

 * NOTE: The final comment on the upstream GNOME bug claims that the fix is incomplete. However, it is possible that the running NetworkManager was not restarted (see Regression Potential notes above), which is why nm-dhcp-helper is falling back to Event. The remainder of the log messages in that final comment are from a custom wrapper the submitter was running around nm-dhcp-helper.
I have deployed the exact same patch (without said wrapper) to real-world systems and tested extensively, and see nothing but successful DHCP lease renewal notifications using D-Bus Notify, not D-Bus Event.

----

I've found an issue on some of our Xenial office machines, causing NetworkManager to drop its IP address lease in some cases when it shouldn't. I'm not sure if the actual bug is in NetworkManager or perhaps dbus or dhclient, but I'll do my best to help to figure out where it is.

What appears to happen:

* NetworkManager is informed of a new IPv4 lease.
* During the lease, dhclient keeps it fresh by renewing it using DHCPREQUESTs regularly.
* In spite of this, NetworkManager drops the IP address from the interface when the last reported lease time expires.

This happens on various machines, once every few days. We are using a failover DHCP configuration using two machines (192.168.0.3 'bonaire' and 192.168.0.4 'curacao'). The machine where I've done the debugging is called 'pampus' (192.168.0.166).

As you can see in the logs, at 01:21:06 NetworkManager reports a new lease with lease time 7200.

jun 07 01:21:06 pampus dhclient[1532]: DHCPREQUEST of 192.168.0.166 on eth0 to 192.168.0.4 port 67 (xid=0x3295b440)
jun 07 01:21:06 pampus dhclient[1532]: DHCPACK of 192.168.0.166 from 192.168.0.4
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] address 192.168.0.166
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] plen 24 (255.255.255.0)
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] gateway 192.168.0.5
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] server identifier 192.168.0.4
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] lease time 7200
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] nameserver '192.168.0.3'
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] nameserver '192.168.0.4'
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9530] domain name 'office.screenpointmed.com'
jun 07 01:21:06 pampus NetworkManager[1161]: <info> [1496791266.9531] dhcp4 (eth0): state changed bound -> bound

After this, dhclient is supposed to keep the lease fresh, which it does. E.g. at 03:13:19 you can see a DHCPREQUEST and DHCPACK; I've seen this DHCPACK in a tcpdump and it contains a new lease time of 7200 seconds.

jun 07 03:13:19 pampus dhclient[1532]: DHCPREQUEST of 192.168.0.166 on eth0 to 192.168.0.4 port 67 (xid=0x3295b440)
jun 07 03:13:19 pampus dhclient[1532]: DHCPACK of 192.168.0.166 from 192.168.0.4
jun 07 03:13:19 pampus dhclient[1532]: bound to 192.168.0.166 -- renewal in 2708 seconds.

However, at 03:21:07 (exactly 2 hours and 1 second after the last lease reported by NetworkManager) Avahi and NTP report that the IP address is gone:

jun 07 03:21:07 pampus avahi-daemon[1167]: Withdrawing address record for 192.168.0.166 on eth0.
jun 07 03:21:07 pampus avahi-daemon[1167]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.0.166.
jun 07 03:21:07 pampus avahi-daemon[1167]: Interface eth0.IPv4 no longer relevant for mDNS.
jun 07 03:21:08 pampus ntpd[18832]: Deleting interface #3 eth0, 192.168.0.166#123, interface stats: received=2512, sent=2549, dropped=0, active_time=111819 secs

So I suspect NetworkManager dropped the IP address from the interface, because it wasn't informed by dhclient that the lease was renewed. The logs don't explicitly say this, so I may have to turn on more verbose debugging logs in NetworkManager or dhclient to verify this.

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: network-manager 1.2.6-0ubuntu0.16.04.1
ProcVersionSignature: Ubuntu 4.4.0-66.87-generic 4.4.44
Uname: Linux 4.4.0-66-generic x86_64
NonfreeKernelModules: nvidia_uvm nvidia_drm nvidia_modeset nvidia
ApportVersion: 2.20.1-0ubuntu2.6
Architecture: amd64
Date: Wed Jun 7 14:48:59 2017
IfupdownConfig:
 # interfaces(5) file used by ifup(8) and ifdown(8)
 auto lo
 iface lo inet loopback
InstallationDate: Installed on 2016-11-04 (214 days ago)
InstallationMedia: Ubuntu 14.04.5 LTS "Trusty Tahr" - Release amd64 (20160803)
IpRoute:
 default via 192.168.0.5 dev eth0 proto static metric 100
 192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.166
 192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.166 metric 100
IwConfig:
 lo no wireless extensions.
 eth1 no wireless extensions.
 eth0 no wireless extensions.
NetworkManager.state:
 [main]
 NetworkingEnabled=true
 WirelessEnabled=true
 WWANEnabled=true
 WimaxEnabled=true
RfKill:

SourcePackage: network-manager
UpgradeStatus: No upgrade log present (probably fresh install)
nmcli-con:
 NAME                UUID                                  TYPE            TIMESTAMP   TIMESTAMP-REAL                AUTOCONNECT  AUTOCONNECT-PRIORITY  READONLY  DBUS-PATH                                   ACTIVE  DEVICE  STATE      ACTIVE-PATH
 Wired connection 1  37da1802-e1ce-3326-a6d0-f855cc32806d  802-3-ethernet  1496839466  wo 07 jun 2017 14:44:26 CEST  yes          4294966297            no        /org/freedesktop/NetworkManager/Settings/0  yes     eth0    activated  /org/freedesktop/NetworkManager/ActiveConnection/0
 Wired connection 2  a040d7fe-3c52-39ba-82b8-50fad0b602c1  802-3-ethernet  1496399665  vr 02 jun 2017 12:34:25 CEST  yes          4294966297            no        /org/freedesktop/NetworkManager/Settings/1  no      --      --         --
nmcli-dev:
 DEVICE  TYPE      STATE        DBUS-PATH                                  CONNECTION          CON-UUID                              CON-PATH
 eth0    ethernet  connected    /org/freedesktop/NetworkManager/Devices/0  Wired connection 1  37da1802-e1ce-3326-a6d0-f855cc32806d  /org/freedesktop/NetworkManager/ActiveConnection/0
 eth1    ethernet  unavailable  /org/freedesktop/NetworkManager/Devices/2  --                  --                                    --
 lo      loopback  unmanaged    /org/freedesktop/NetworkManager/Devices/1  --                  --                                    --
nmcli-nm:
 RUNNING  VERSION  STATE      STARTUP  CONNECTIVITY  NETWORKING  WIFI-HW  WIFI     WWAN-HW  WWAN
 running  1.2.6    connected  started  full          enabled     enabled  enabled  enabled  enabled

To manage notifications about this bug go to:
https://bugs.launchpad.net/network-manager/+bug/1696415/+subscriptions
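
For reference, the lease timeline in the original report (lease logged by NetworkManager at 01:21:06, renewal seen by dhclient at 03:13:19, address withdrawn at 03:21:07) can be checked with a few lines of Python. This is only a sketch. The timestamps and the 7200-second lease time are copied from the journal excerpts, the year 2017 is taken from the report's Date field, and the helper name `lease_expiry` is made up for illustration and is not part of NetworkManager or dhclient.

```python
from datetime import datetime, timedelta

LEASE_SECONDS = 7200  # "lease time 7200" from the NetworkManager log


def lease_expiry(acked_at: datetime, lease_seconds: int = LEASE_SECONDS) -> datetime:
    """Return the moment a lease acknowledged at `acked_at` runs out."""
    return acked_at + timedelta(seconds=lease_seconds)


# Timestamps from the journal; the year is assumed from the report's Date field.
last_seen_by_nm = datetime(2017, 6, 7, 1, 21, 6)       # last lease NetworkManager logged
renewed_by_dhclient = datetime(2017, 6, 7, 3, 13, 19)  # DHCPACK logged by dhclient only
address_withdrawn = datetime(2017, 6, 7, 3, 21, 7)     # avahi withdraws the address

stale_expiry = lease_expiry(last_seen_by_nm)      # what NetworkManager still believed
real_expiry = lease_expiry(renewed_by_dhclient)   # what the renewed lease allows

print("expiry known to NetworkManager:", stale_expiry)
print("expiry after the renewal:      ", real_expiry)
print("seconds past the stale expiry: ", (address_withdrawn - stale_expiry).total_seconds())
print("seconds of valid lease left:   ", (real_expiry - address_withdrawn).total_seconds())
```

The address disappears one second after the expiry of the last lease NetworkManager knew about, while the renewed lease still had well over an hour left. That is consistent with the renewal never reaching NetworkManager.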