** Description changed:

  [ Impact ]

   * netplan-sriov-apply.service can sometimes fail to configure sriov interfaces.

   * The issue happens when netplan is performing per-interface configuration while udev rules
     are renaming PF interfaces. When that happens, netplan fails to read some PF-related data
     because the expected /sys/class/net/<ifname>/ directory no longer exists.

   * Depending on the timing between netplan-sriov-apply.service and udev rules execution, one or more
     PF interfaces might be left unconfigured.

   * This issue might be a root cause of the following netplan bugs:
     - https://bugs.launchpad.net/netplan/+bug/1988018
     - https://bugs.launchpad.net/netplan/+bug/2020409

   * A proposed solution is to make sure that udev rules are triggered and finished before
     netplan-sriov-apply.service starts executing.

   * The issue was most likely introduced by https://bugs.launchpad.net/netplan/+bug/1988018
     - this change introduced netplan-sriov-apply.service
     - jammy 0.107.1-3ubuntu0.22.04.2 is still in -proposed
     - noble/questing/resolute released it as part of v1.0

   * The issue is reproduced when the user specifies a set-name config value with a name different from the one systemd networkd generated
     - During the boot process, the interface is first renamed to ethX, then networkd applies its PCI address based naming,
       and only then does udev process the rules created from the set-name config value.
     - If set-name is not used, or the name specified in set-name is the same as the one networkd generated, the issue will not reproduce.

  [ Test Plan ]

   * Create a netplan config which modifies the interface name and sets sriov config, for instance:
     50-if.yaml:
     network:
       ethernets:
         ens1f0:
           match:
             macaddress: b8:3f:d2:09:38:94
           mtu: 1500
           optional: true
           set-name: ens1f0
         ens1f1:
           match:
             macaddress: b8:3f:d2:09:38:94
           mtu: 1500
           optional: true
           set-name: ens1f1

     99-sriov.yaml:
     network:
       version: 2
       ethernets:
         ens1f0:
           virtual-function-count: 32
           embedded-switch-mode: switchdev
           delay-virtual-functions-rebind: true
         ens1f1:
           virtual-function-count: 32
           embedded-switch-mode: switchdev
           delay-virtual-functions-rebind: true

     NOTE: the names generated for these interfaces by networkd are ens1f0np0 and
     ens1f1np1

   * Reboot the host with the above config

   * After reboot, verify whether the sriov configuration was properly applied on the interfaces.

     Expected result:
     Config was properly applied by netplan-sriov-apply.service

     Actual results:
     Feb 02 12:15:49 doopliss netplan[1163]: ERROR:root:could not determine vendor and device ID of ens1f1np1: [Errno 2] No such file or directory: '/sys/class/net/ens1f1np1/device/vendor'
     Feb 02 12:15:49 doopliss systemd[1]: netplan-sriov-apply.service: Main process exited, code=exited, status=1/FAILURE
     Feb 02 12:15:49 doopliss systemd[1]: netplan-sriov-apply.service: Failed with result 'exit-code'.

     In this example, netplan-sriov-apply.service started around Feb 02 12:15:27 and properly configured the first
     interface using its old name ens1f0np0. Then the second interface ens1f1np1 was renamed:
     Feb 02 12:15:37 doopliss kernel: mlx5_core 0000:4b:00.1 ens1f1: renamed from ens1f1np1

     Netplan, still using the name ens1f1np1, failed to read /sys/class/net/ens1f1np1/device/vendor, as the
     proper path is now /sys/class/net/ens1f1/device/vendor

     This is just an example. When an interface name changes while netplan-sriov-apply.service is running,
     netplan can fail in different parts of the code, which results in a similar error log
     "[Errno 2] No such file or directory", such as the one mentioned in LP1988018:
     Apr 16 15:44:44 romano netplan[1171]: failed parsing sriov_totalvfs for ens7f1np1: [Errno 2] No such file or directory: '/sys/class/net/ens7f1np1/device/sriov_totalvfs'

  [ Where problems could occur ]

   * The proposed change makes sure that udev rules are triggered and done before netplan-sriov-apply.service starts.
     Inspecting the current `netplan apply` logic shows that this is already performed in the code for the `netplan apply` command
     but is missing from `netplan apply --sriov-only`, which is called by netplan-sriov-apply.service.

   * If any other processes are modifying interface names, the issue can still be reproduced.

   * With the new change the following commands will be executed:
     - udevadm control --reload
     - udevadm trigger --action=add --subsystem-match=net
     - udevadm settle
     If any of the commands hangs, the service might not start properly and might leave interfaces unconfigured.

  [ Other Info ]

   * The issue can be quite reliably reproduced on jammy-proposed

   * I was not able to reproduce the issue on Noble when applying the same configuration. By the time
     netplan-sriov-apply.service starts, interfaces are already set to the proper name. This might point to differences in systemd.
     This doesn't mean that the issue can't be reproduced there. The service requires interface names to be already set, and the current settings do not guarantee that.

   * Fix was verified on PS6 environment which reported issues in LP2020409
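
For illustration only, the udevadm sequence listed under [ Where problems could occur ] could be sketched roughly as below. This is not the actual netplan patch: the function name `settle_udev`, the injectable `run` parameter and the 60 second timeout are assumptions made for this sketch. The three udevadm commands are the ones quoted in the description. The timeout is there because the description notes that a hanging command could leave the service stuck.

```python
import subprocess

# The three commands quoted in the bug description, in the order given there.
UDEV_COMMANDS = [
    ["udevadm", "control", "--reload"],
    ["udevadm", "trigger", "--action=add", "--subsystem-match=net"],
    ["udevadm", "settle"],
]


def settle_udev(run=subprocess.run, timeout=60):
    """Run the udev commands in order.

    Returns True only if every command exits successfully within
    `timeout` seconds (hypothetical value, not taken from netplan).
    `run` is injectable so the logic can be exercised without udevadm.
    """
    for cmd in UDEV_COMMANDS:
        try:
            run(cmd, check=True, timeout=timeout)
        except (subprocess.SubprocessError, OSError):
            # Covers a non-zero exit, a timeout, or a missing udevadm binary.
            return False
    return True
```

A caller would run `settle_udev()` before touching any /sys/class/net/<ifname>/ path and, on a False result, decide whether to abort or continue with a warning. Which of the two is appropriate is left open here.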
-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2139598

Title:
  Netplan can crash when applying sriov config

To manage notifications about this bug go to:
https://bugs.launchpad.net/netplan/+bug/2139598/+subscriptions

-- 
ubuntu-bugs mailing list
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
