[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer
** Description changed:

  [impact]

  systemd-networkd double-free causes crash under some circumstances, such
  as adding/removing ip rules

  [test case]

- see original description
+ Use networkd-dispatcher events to add and remove IP rules. The example
+ scripts below are contrived (and by themselves likely to break access to
+ a machine) but would be adequate to trigger the bug. Put scripts like
+ these in place, reboot or run "netplan apply", and then leave the
+ machine running for a few DHCP renewal cycles.
+ 
+ === /etc/networkd-dispatcher/configured.d/test.sh ===
+ #!/bin/bash
+ 
+ /sbin/ip rule add iif lo lookup 99
+ /sbin/ip rule add to 10.0.0.0/8 iif lo lookup main
+ === END ===
+ === /etc/networkd-dispatcher/configuring.d/test.sh ===
+ #!/bin/bash
+ 
+ # Tear down existing ip rules so they aren't duplicated
+ OLDIFS="${IFS}"
+ IFS="
+ "
+ for rule in `ip rule show|grep "iif lo" | cut -d: -f2-`; do
+   IFS="${OLDIFS}"
+   ip rule delete ${rule}
+ done
+ IFS="${OLDIFS}"
+ === END ===

  [regression potential]

  this strdup's strings during addition of routing policy rules, so any
  regression would likely occur when adding/modifying/removing ip rules,
  possibly including networkd segfault or failure to add/remove/modify ip
  rules.

  [scope]

  this is needed for bionic.

  this is fixed by upstream commit
  eeab051b28ba6e1b4a56d369d4c6bf7cfa71947c which is included starting in
  v240, so this is already included in Focal and later.

  I did not research what original commit introduced the problem, but the
  reporter indicates this did not happen for Xenial so it's unlikely this
  is a problem in Xenial or earlier.

  [original description]

  This is a serious regression with systemd-networkd that I ran in to
  while setting up a NAT router in AWS. The AWS AMI
  ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20200131 with
  systemd-237-3ubuntu10.33 does NOT have the problem, but the next most
  recent AWS AMI
  ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20200311 with
  systemd-including 237-3ubuntu10.39 does.

  Also, a system booted from the (good) 20200131 AMI starts showing the
  problem after updating only systemd (to 237-3ubuntu10.41) and its direct
  dependencies (e.g. 'apt-get install systemd'). So I'm fairly confident
  that a change to the systemd package between 237-3ubuntu10.33 and
  237-3ubuntu10.39 introduced the problem and it is still present.

  On the NAT router I use three interfaces and have separate routing
  tables for admin and forwarded traffic. Things come up fine initially
  but every 30-60 minutes (DHCP lease renewal time?) one or more
  interfaces is reconfigured and most of the time systemd-networkd will
  crash and need to be restarted. Eventually the system becomes
  unreachable when the default crash loop backoff logic prevents the
  network service from being restarted at all. The log excerpt attached
  illustrates the crash loop.

  Also including the netplan and networkd config files below.

  # grep . /etc/netplan/*
  /etc/netplan/50-cloud-init.yaml:# This file is generated from information provided by the datasource. Changes
  /etc/netplan/50-cloud-init.yaml:# to it will not persist across an instance reboot. To disable cloud-init's
  /etc/netplan/50-cloud-init.yaml:# network configuration capabilities, write a file
  /etc/netplan/50-cloud-init.yaml:# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
  /etc/netplan/50-cloud-init.yaml:# network: {config: disabled}
  /etc/netplan/50-cloud-init.yaml:network:
  /etc/netplan/50-cloud-init.yaml:version: 2
  /etc/netplan/50-cloud-init.yaml:ethernets:
  /etc/netplan/50-cloud-init.yaml:ens5:
  /etc/netplan/50-cloud-init.yaml:dhcp4: true
  /etc/netplan/50-cloud-init.yaml:match:
  /etc/netplan/50-cloud-init.yaml:macaddress: xx:xx:xx:xx:xx:xx
  /etc/netplan/50-cloud-init.yaml:set-name: ens5
  /etc/netplan/99_config.yaml:network:
  /etc/netplan/99_config.yaml: version: 2
  /etc/netplan/99_config.yaml: renderer: networkd
  /etc/netplan/99_config.yaml: ethernets:
  /etc/netplan/99_config.yaml:ens6:
  /etc/netplan/99_config.yaml: match:
  /etc/netplan/99_config.yaml:macaddress: yy:yy:yy:yy:yy:yy
  /etc/netplan/99_config.yaml: dhcp4: true
  /etc/netplan/99_config.yaml: dhcp4-overrides:
  /etc/netplan/99_config.yaml:use-routes: false
  /etc/netplan/99_config.yaml:ens7:
  /etc/netplan/99_config.yaml: match:
  /etc/netplan/99_config.yaml:macaddress: zz:zz:zz:zz:zz:zz
  /etc/netplan/99_config.yaml: mtu: 1500
  /etc/netplan/99_config.yaml: dhcp4: true
  /etc/netplan/99_config.yaml: dhcp4-overrides:
  /etc/netplan/99_config.yaml:use-mtu: false
  /etc/netplan/99_config.yaml:use-routes: false

  # grep . /etc/networkd-dispatcher/*/*
  /etc/networkd-dispatcher/configured.d/nat:#!/bin/bash
  /etc/netw
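The tear-down loop in the configuring.d test script changes IFS so that each rule arrives as a single line. A minimal sketch of the same idea with a `while read` loop follows. The function names `select_lo_rules` and `delete_lo_rules` are hypothetical, and the sketch assumes `ip rule show` prints one rule per line as a priority, a colon, and then the selector, which is the layout the original `cut -d: -f2-` also relies on.

```shell
#!/bin/bash
# Print the selector part of every rule that mentions "iif lo".
# Reads `ip rule show` output on stdin; leading and trailing
# whitespace around the selector is stripped.
select_lo_rules() {
    grep 'iif lo' | cut -d: -f2- | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

# Hypothetical replacement for the for-loop in configuring.d/test.sh.
# Requires root, and like the original it can cut off access to a machine.
delete_lo_rules() {
    ip rule show | select_lo_rules | while IFS= read -r rule; do
        # Word splitting of $rule is intentional: each word is one argument.
        ip rule delete $rule
    done
}
```

Calling `delete_lo_rules` from the configuring.d script would stand in for the `for` loop. It has not been tested against the crash in this report.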
[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer
The scripts for configured.d and configuring.d to add and remove IP
rules (included above) are likely the culprit. @ddstreet would you like
me to write that up more compactly?

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1881972

Title:
  systemd-networkd crashes with invalid pointer

Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Bionic:
  Incomplete
[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer
No crashes on my test machine for 12 days. Push it!

Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Bionic:
  In Progress

Bug description:
  [impact]

  systemd-networkd double-free causes crash under some circumstances,
  such as adding/removing ip rules

  [test case]

  see original description

  [regression potential]

  this strdup's strings during addition of routing policy rules, so any
  regression would likely occur when adding/modifying/removing ip rules,
  possibly including networkd segfault or failure to add/remove/modify
  ip rules.

  [scope]

  this is needed for bionic.

  this is fixed by upstream commit
  eeab051b28ba6e1b4a56d369d4c6bf7cfa71947c which is included starting in
  v240, so this is already included in Focal and later.

  I did not research what original commit introduced the problem, but
  the reporter indicates this did not happen for Xenial so it's unlikely
  this is a problem in Xenial or earlier.

  [original description]

  This is a serious regression with systemd-networkd that I ran in to
  while setting up a NAT router in AWS. The AWS AMI
  ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20200131 with
  systemd-237-3ubuntu10.33 does NOT have the problem, but the next most
  recent AWS AMI
  ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20200311 with
  systemd-including 237-3ubuntu10.39 does.

  Also, a system booted from the (good) 20200131 AMI starts showing the
  problem after updating only systemd (to 237-3ubuntu10.41) and its
  direct dependencies (e.g. 'apt-get install systemd'). So I'm fairly
  confident that a change to the systemd package between
  237-3ubuntu10.33 and 237-3ubuntu10.39 introduced the problem and it is
  still present.

  On the NAT router I use three interfaces and have separate routing
  tables for admin and forwarded traffic. Things come up fine initially
  but every 30-60 minutes (DHCP lease renewal time?) one or more
  interfaces is reconfigured and most of the time systemd-networkd will
  crash and need to be restarted. Eventually the system becomes
  unreachable when the default crash loop backoff logic prevents the
  network service from being restarted at all. The log excerpt attached
  illustrates the crash loop.

  Also including the netplan and networkd config files below.

  # grep . /etc/netplan/*
  /etc/netplan/50-cloud-init.yaml:# This file is generated from information provided by the datasource. Changes
  /etc/netplan/50-cloud-init.yaml:# to it will not persist across an instance reboot. To disable cloud-init's
  /etc/netplan/50-cloud-init.yaml:# network configuration capabilities, write a file
  /etc/netplan/50-cloud-init.yaml:# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
  /etc/netplan/50-cloud-init.yaml:# network: {config: disabled}
  /etc/netplan/50-cloud-init.yaml:network:
  /etc/netplan/50-cloud-init.yaml:version: 2
  /etc/netplan/50-cloud-init.yaml:ethernets:
  /etc/netplan/50-cloud-init.yaml:ens5:
  /etc/netplan/50-cloud-init.yaml:dhcp4: true
  /etc/netplan/50-cloud-init.yaml:match:
  /etc/netplan/50-cloud-init.yaml:macaddress: xx:xx:xx:xx:xx:xx
  /etc/netplan/50-cloud-init.yaml:set-name: ens5
  /etc/netplan/99_config.yaml:network:
  /etc/netplan/99_config.yaml: version: 2
  /etc/netplan/99_config.yaml: renderer: networkd
  /etc/netplan/99_config.yaml: ethernets:
  /etc/netplan/99_config.yaml:ens6:
  /etc/netplan/99_config.yaml: match:
  /etc/netplan/99_config.yaml:macaddress: yy:yy:yy:yy:yy:yy
  /etc/netplan/99_config.yaml: dhcp4: true
  /etc/netplan/99_config.yaml: dhcp4-overrides:
  /etc/netplan/99_config.yaml:use-routes: false
  /etc/netplan/99_config.yaml:ens7:
  /etc/netplan/99_config.yaml: match:
  /etc/netplan/99_config.yaml:macaddress: zz:zz:zz:zz:zz:zz
  /etc/netplan/99_config.yaml: mtu: 1500
  /etc/netplan/99_config.yaml: dhcp4: true
  /etc/netplan/99_config.yaml: dhcp4-overrides:
  /etc/netplan/99_config.yaml:use-mtu: false
  /etc/netplan/99_config.yaml:use-routes: false

  # grep . /etc/networkd-dispatcher/*/*
  /etc/networkd-dispatcher/configured.d/nat:#!/bin/bash
  /etc/networkd-dispatcher/configured.d/nat:# Do additional configuration for the inside and outside interfaces
  /etc/networkd-dispatcher/configured.d/nat:# route table used for forwarded/routed/natted traffic
  /etc/networkd-dispatcher/configured.d/nat:FWD_TABLE=99
  /etc/networkd-dispatcher/configured.d/nat:if [ "${IFACE}" = "ens6" ]; then
  /etc/networkd-dispatcher/configured.d/nat: # delete link-local route for inside in default table
  /etc/networkd-dispatche
[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer
So far so good running the latest package for 10 hours. I'll let it run
another day or two but previously I would have seen the issue by now.

Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Bionic:
  In Progress
[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer
I was not able to reproduce the original issue on
237-3ubuntu10.42~202007071725~ubuntu18.04.1 after letting it run for 12+
hours.

I have now installed the newer
237-3ubuntu10.42~202007081907~ubuntu18.04.1 from the same PPA. I no
longer see a SEGV when the service first starts at boot, thanks! I will
let it run a few hours again to confirm that the original issue has been
addressed.

Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Bionic:
  In Progress
[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer
Here's one of the new coredumps I'm getting at boot now. Note that I
don't have debugging symbols installed for the PPA version of systemd.

# coredumpctl gdb 714
           PID: 714 (systemd-network)
           UID: 100 (systemd-network)
           GID: 102 (systemd-network)
        Signal: 11 (SEGV)
     Timestamp: Wed 2020-07-08 17:17:01 UTC (8min ago)
  Command Line: /lib/systemd/systemd-networkd
    Executable: /lib/systemd/systemd-networkd
 Control Group: /system.slice/systemd-networkd.service
          Unit: systemd-networkd.service
         Slice: system.slice
       Boot ID: df33bbaec4134b45aaabe8b3fca7dade
    Machine ID: ec267b3475883f9edb99f554607bb456
      Hostname: ip-10-0-4-251
       Storage: /var/lib/systemd/coredump/core.systemd-network.100.df33bbaec4134b45aaabe8b3fca7dade.714.159422862100.lz4
       Message: Process 714 (systemd-network) of user 100 dumped core.

                Stack trace of thread 714:
                #0  0x5627a7f3425c n/a (systemd-networkd)
                #1  0x5627a7fb5760 n/a (systemd-networkd)
                #2  0x5627a7f26526 sd_netlink_process (systemd-networkd)
                #3  0x5627a7f267c3 n/a (systemd-networkd)
                #4  0x5627a7f2b6be n/a (systemd-networkd)
                #5  0x5627a7f2b93a sd_event_dispatch (systemd-networkd)
                #6  0x5627a7f2bac9 sd_event_run (systemd-networkd)
                #7  0x5627a7f2bd0b sd_event_loop (systemd-networkd)
                #8  0x5627a7eff3d6 n/a (systemd-networkd)
                #9  0x7f6d90500b97 __libc_start_main (libc.so.6)
                #10 0x5627a7effaba n/a (systemd-networkd)

** Attachment added: "core.systemd-network.100.df33bbaec4134b45aaabe8b3fca7dade.714.159422862100.lz4"
   https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1881972/+attachment/5390826/+files/core.systemd-network.100.df33bbaec4134b45aaabe8b3fca7dade.714.159422862100.lz4

Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Bionic:
  In Progress
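Without debugging symbols, most frames in a trace like this one show as n/a. A small sketch that keeps only the frames that did resolve to a symbol is below. It assumes each frame is laid out as frame number, address, symbol, and module in parentheses, as in the coredumpctl output in this message. `resolved_frames` is a hypothetical helper name.

```shell
#!/bin/bash
# Read a coredumpctl stack trace on stdin and print only the frames
# whose symbol column is not "n/a", as "#N SYMBOL (MODULE)".
resolved_frames() {
    awk '$1 ~ /^#[0-9]+$/ && $3 != "n/a" { print $1, $3, $4 }'
}

# Example (not run here):
#   coredumpctl info 714 | resolved_frames
```

Comparing this short list across several dumps is a quick way to see whether two crashes pass through the same named entry points, though it says nothing about the unnamed frames.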
[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer
I added the ppa and did a dist-upgrade then rebooted. systemd-networkd
now consistently crashes once at boot (looks like a different crash
though). But then everything appears to work after networkd restarts
once. I will let it run and see if the invalid pointer crash happens.

# journalctl -l -u systemd-networkd -b 0
-- Logs begin at Wed 2020-06-03 15:00:29 UTC, end at Wed 2020-07-08 17:20:23 UTC. --
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: Starting Network Service...
Jul 08 17:17:01 ip-10-0-4-251 systemd-networkd[714]: Enumeration completed
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: Started Network Service.
Jul 08 17:17:01 ip-10-0-4-251 systemd-networkd[714]: ens5: Link UP
Jul 08 17:17:01 ip-10-0-4-251 systemd-networkd[714]: ens5: Gained carrier
Jul 08 17:17:01 ip-10-0-4-251 systemd-networkd[714]: ens5: Link DOWN
Jul 08 17:17:01 ip-10-0-4-251 systemd-networkd[714]: ens5: Lost carrier
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: systemd-networkd.service: Main process exited, code=dumped, status=11/SEGV
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: systemd-networkd.service: Failed with result 'core-dump'.
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: systemd-networkd.service: Service has no hold-off time, scheduling restart.
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: systemd-networkd.service: Scheduled restart job, restart counter is at 1.
Jul 08 17:17:01 ip-10-0-4-251 systemd[1]: Stopped Network Service.

Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Bionic:
  In Progress
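To judge whether a crash like the one in this journal excerpt recurs while a candidate package is left running, the journal can be scanned for the "Main process exited, code=dumped" line. A small sketch, assuming journal text is supplied on stdin. `count_networkd_dumps` is a hypothetical helper name.

```shell
#!/bin/bash
# Count systemd-networkd core dumps in journal text read from stdin.
# Matches the "Main process exited, code=dumped" line that systemd
# logs for the unit. Prints 0 when nothing matches.
count_networkd_dumps() {
    grep -c 'systemd-networkd.service: Main process exited, code=dumped' || true
}

# Example (not run here):
#   journalctl -u systemd-networkd -b 0 | count_networkd_dumps
```

A count that stays flat over several DHCP renewal cycles would be consistent with the fix holding, but it is only a rough indicator.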
[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer
I've gathered several core dumps now, but the stack traces are identical. The logged reason for dumping core is "free(): invalid pointer".

I wonder if there is a race condition between whatever networkd itself does when it reconfigures the interface (which it seems to do more aggressively for DHCP renewals than it used to) and what my scripts in /etc/networkd-dispatcher/configured.d and /etc/networkd-dispatcher/configuring.d are doing. Assuming the "routing_policy_rule_free" function is equivalent to "ip rule delete ...", there could be a conflict with my "configuring.d" script in particular. The "configured.d" script adds some extra routing policy rules, and the "configuring.d" script deletes them so they aren't duplicated every time the network is configured.

** Changed in: systemd (Ubuntu)
   Status: Incomplete => New

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1881972

Title:
  systemd-networkd crashes with invalid pointer

Status in systemd package in Ubuntu:
  New
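One way to test whether the delete-then-re-add cycle in the dispatcher scripts matters would be to add each rule only when it is missing, so that no "configuring.d" teardown is needed. What follows is a minimal sketch, not a fix for the crash itself. The names `rule_present` and `add_rule_once` are hypothetical, and the sketch assumes the rule text is passed in the same form that `ip rule show` prints it (for example "from all iif ens6 lookup 99").

```shell
#!/bin/bash
# Hypothetical helpers, not taken from the scripts in this bug report.

# Succeed if the rule text in $1 appears in `ip rule show` output on stdin.
rule_present() {
    local want="$1"
    # Drop the leading "priority:" field and any trailing whitespace,
    # then require an exact whole-line match.
    sed -e 's/^[0-9]*:[[:space:]]*//' -e 's/[[:space:]]*$//' \
        | grep -Fxq -- "$want"
}

# Add a rule only when it is not already installed. Needs root.
# Assumption: $1 is written the way `ip rule show` displays it.
add_rule_once() {
    local rule="$1"
    if ! ip rule show | rule_present "$rule"; then
        # Word splitting of $rule is intentional here.
        ip rule add $rule
    fi
}
```

In a "configured.d" script this could be called as `add_rule_once "from all iif ens6 lookup 99"`. Whether avoiding the repeated deletes also avoids the crash is untested.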
[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer
# coredumpctl gdb 28819
           PID: 28819 (systemd-network)
           UID: 100 (systemd-network)
           GID: 102 (systemd-network)
        Signal: 6 (ABRT)
     Timestamp: Tue 2020-06-16 19:36:22 UTC (16min ago)
  Command Line: /lib/systemd/systemd-networkd
    Executable: /lib/systemd/systemd-networkd
 Control Group: /system.slice/systemd-networkd.service
          Unit: systemd-networkd.service
         Slice: system.slice
       Boot ID: 578e8b2c2e1a43afbd27211be1a4f531
    Machine ID: ec267b3475883f9edb99f554607bb456
      Hostname: ip-10-0-4-251
       Storage: /var/lib/systemd/coredump/core.systemd-network.100.578e8b2c2e1a43afbd27211be1a4f531.28819.159233618200.lz4
       Message: Process 28819 (systemd-network) of user 100 dumped core.

                Stack trace of thread 28819:
                #0  0x7f740d023e97 raise (libc.so.6)
                #1  0x7f740d025801 abort (libc.so.6)
                #2  0x7f740d06e897 n/a (libc.so.6)
                #3  0x7f740d07590a n/a (libc.so.6)
                #4  0x7f740d07ce1c cfree (libc.so.6)
                #5  0x55fa5c16276b routing_policy_rule_free (systemd-networkd)
                #6  0x55fa5c1f69e2 manager_rtnl_process_rule (systemd-networkd)
                #7  0x55fa5c1731d6 process_match (systemd-networkd)
                #8  0x55fa5c173413 io_callback (systemd-networkd)
                #9  0x55fa5c178350 source_dispatch (systemd-networkd)
                #10 0x55fa5c1785ea sd_event_dispatch (systemd-networkd)
                #11 0x55fa5c178779 sd_event_run (systemd-networkd)
                #12 0x55fa5c1789bb sd_event_loop (systemd-networkd)
                #13 0x55fa5c1413a6 main (systemd-networkd)
                #14 0x7f740d006b97 __libc_start_main (libc.so.6)
                #15 0x55fa5c141a8a _start (systemd-networkd)

** Attachment added: "core dump file"
   https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1881972/+attachment/5384495/+files/core.systemd-network.100.578e8b2c2e1a43afbd27211be1a4f531.28819.159233618200.lz4

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1881972

Title:
  systemd-networkd crashes with invalid pointer

Status in systemd package in Ubuntu:
  Incomplete
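When comparing several core dumps, it can help to pull just the frame names out of the stack trace rather than reading each dump by eye. The sketch below assumes frames have the "#N 0xADDR symbol (module)" shape shown in the coredumpctl output in this message. The function names `frames` and `first_networkd_frame` are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers for summarising coredumpctl stack traces.

# Print the symbol name of every "#N 0xADDR symbol (module)" frame on stdin.
frames() {
    awk '$1 ~ /^#[0-9]+$/ && $2 ~ /^0x/ { print $3 }'
}

# Print the innermost frame that belongs to systemd-networkd itself,
# skipping the libc frames above it.
first_networkd_frame() {
    awk '$1 ~ /^#[0-9]+$/ && $4 == "(systemd-networkd)" { print $3; exit }'
}
```

For example, `coredumpctl info 28819 | first_networkd_frame` could be run once per dump and the results compared. The PID here is the one from the trace in this message.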
[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer
Also of note: on systems with systemd-237-3ubuntu10.33 or older I don't see the "ens5: Configured" log messages at all after initial configuration, even though I'm sure DHCP renewals are happening.

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1881972

Title:
  systemd-networkd crashes with invalid pointer

Status in systemd package in Ubuntu:
  New
[Touch-packages] [Bug 1881972] Re: systemd-networkd crashes with invalid pointer
The system I pulled the networkd log from had not yet become unreachable, but here's a snippet from one that did:

May 27 07:38:18 ip-10-0-4-228 systemd[1]: systemd-networkd.service: Failed with result 'core-dump'.
May 27 07:38:18 ip-10-0-4-228 systemd[1]: Failed to start Network Service.
May 27 07:38:18 ip-10-0-4-228 systemd[1]: Dependency failed for Wait for Network to be Configured.
May 27 07:38:18 ip-10-0-4-228 systemd[1]: systemd-networkd-wait-online.service: Job systemd-networkd-wait-online.service/start failed with result 'dependency'
May 27 07:38:18 ip-10-0-4-228 systemd[1]: systemd-networkd.socket: Failed with result 'service-start-limit-hit'.

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1881972

Title:
  systemd-networkd crashes with invalid pointer

Status in systemd package in Ubuntu:
  New
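Once the start limit has been hit, systemd will not restart the service on its own. As a stopgap until a fixed package is available, a watchdog could look for the 'service-start-limit-hit' result shown in the log above and clear it. This is a minimal sketch with hypothetical function names. It assumes `systemctl reset-failed` clears the start-rate counter for the named units, as the systemd documentation describes, and the recovery step needs root.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the bug report.

# Succeed if journal text on stdin shows a unit that hit its start limit.
hit_start_limit() {
    grep -q "Failed with result 'service-start-limit-hit'"
}

# Print how many times networkd failed with a core dump (journal text on stdin).
count_core_dumps() {
    grep -c "systemd-networkd.service: Failed with result 'core-dump'"
}

# Clear the failed state and start-rate counters, then start networkd again.
# Needs root; this only works around the crash loop, it does not fix the bug.
recover_networkd() {
    systemctl reset-failed systemd-networkd.service systemd-networkd.socket
    systemctl start systemd-networkd.service
}
```

A possible use is `journalctl -b -u systemd-networkd.socket | hit_start_limit && recover_networkd`, run from a console or another out-of-band path, since the network may already be down by then.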
[Touch-packages] [Bug 1881972] [NEW] systemd-networkd crashes with invalid pointer
Public bug reported:

This is a serious regression with systemd-networkd that I ran into while setting up a NAT router in AWS. The AWS AMI ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20200131 with systemd-237-3ubuntu10.33 does NOT have the problem, but the next most recent AWS AMI ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20200311 with systemd-237-3ubuntu10.39 does. Also, a system booted from the (good) 20200131 AMI starts showing the problem after updating only systemd (to 237-3ubuntu10.41) and its direct dependencies (e.g. 'apt-get install systemd'). So I'm fairly confident that a change to the systemd package between 237-3ubuntu10.33 and 237-3ubuntu10.39 introduced the problem and that it is still present.

On the NAT router I use three interfaces and have separate routing tables for admin and forwarded traffic. Things come up fine initially, but every 30-60 minutes (DHCP lease renewal time?) one or more interfaces is reconfigured, and most of the time systemd-networkd will crash and need to be restarted. Eventually the system becomes unreachable when the default crash loop backoff logic prevents the network service from being restarted at all.

The log excerpt attached illustrates the crash loop.

Also including the netplan and networkd config files below.

# grep . /etc/netplan/*
/etc/netplan/50-cloud-init.yaml:# This file is generated from information provided by the datasource. Changes
/etc/netplan/50-cloud-init.yaml:# to it will not persist across an instance reboot. To disable cloud-init's
/etc/netplan/50-cloud-init.yaml:# network configuration capabilities, write a file
/etc/netplan/50-cloud-init.yaml:# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
/etc/netplan/50-cloud-init.yaml:# network: {config: disabled}
/etc/netplan/50-cloud-init.yaml:network:
/etc/netplan/50-cloud-init.yaml:version: 2
/etc/netplan/50-cloud-init.yaml:ethernets:
/etc/netplan/50-cloud-init.yaml:ens5:
/etc/netplan/50-cloud-init.yaml:dhcp4: true
/etc/netplan/50-cloud-init.yaml:match:
/etc/netplan/50-cloud-init.yaml:macaddress: xx:xx:xx:xx:xx:xx
/etc/netplan/50-cloud-init.yaml:set-name: ens5
/etc/netplan/99_config.yaml:network:
/etc/netplan/99_config.yaml: version: 2
/etc/netplan/99_config.yaml: renderer: networkd
/etc/netplan/99_config.yaml: ethernets:
/etc/netplan/99_config.yaml:ens6:
/etc/netplan/99_config.yaml: match:
/etc/netplan/99_config.yaml:macaddress: yy:yy:yy:yy:yy:yy
/etc/netplan/99_config.yaml: dhcp4: true
/etc/netplan/99_config.yaml: dhcp4-overrides:
/etc/netplan/99_config.yaml:use-routes: false
/etc/netplan/99_config.yaml:ens7:
/etc/netplan/99_config.yaml: match:
/etc/netplan/99_config.yaml:macaddress: zz:zz:zz:zz:zz:zz
/etc/netplan/99_config.yaml: mtu: 1500
/etc/netplan/99_config.yaml: dhcp4: true
/etc/netplan/99_config.yaml: dhcp4-overrides:
/etc/netplan/99_config.yaml:use-mtu: false
/etc/netplan/99_config.yaml:use-routes: false

# grep . /etc/networkd-dispatcher/*/*
/etc/networkd-dispatcher/configured.d/nat:#!/bin/bash
/etc/networkd-dispatcher/configured.d/nat:# Do additional configuration for the inside and outside interfaces
/etc/networkd-dispatcher/configured.d/nat:# route table used for forwarded/routed/natted traffic
/etc/networkd-dispatcher/configured.d/nat:FWD_TABLE=99
/etc/networkd-dispatcher/configured.d/nat:if [ "${IFACE}" = "ens6" ]; then
/etc/networkd-dispatcher/configured.d/nat: # delete link-local route for inside in default table
/etc/networkd-dispatcher/configured.d/nat: /sbin/ip route delete 10.0.3.0/24 2>/dev/null || true
/etc/networkd-dispatcher/configured.d/nat: # add link-local route for inside in table 99
/etc/networkd-dispatcher/configured.d/nat: /sbin/ip route replace 10.0.3.0/24 dev ens6 scope link src 10.0.3.171 table ${FWD_TABLE}
/etc/networkd-dispatcher/configured.d/nat: # add routes to VPC cidrs via inside gateway in table 99
/etc/networkd-dispatcher/configured.d/nat: /sbin/ip route replace 10.0.0.0/16 via 10.0.3.1 table ${FWD_TABLE}
/etc/networkd-dispatcher/configured.d/nat: # add rules to use table 99
/etc/networkd-dispatcher/configured.d/nat: /sbin/ip rule add iif ens6 lookup ${FWD_TABLE}
/etc/networkd-dispatcher/configured.d/nat: /sbin/ip rule add oif ens6 lookup ${FWD_TABLE}
/etc/networkd-dispatcher/configured.d/nat: /sbin/ip rule add from 10.0.3.171/32 lookup ${FWD_TABLE}
/etc/networkd-dispatcher/configured.d/nat:elif [ "${IFACE}" = "ens7" ]; then
/etc/networkd-dispatcher/configured.d/nat: # delete link-local route for outside in default table
/etc/networkd-dispatcher/configured.d/nat: /sbin/ip route delete 10.0.2.0/24 2>/dev/null || true
/etc/networkd-dispatcher/configured.d/nat: # add link-local route for outside in table 99
/etc/networkd-dispatcher/configured.d/nat: /sbin/ip route replace 10.0.2.0/24 dev ens7 scope link src 10.0.2.