[Kernel-packages] [Bug 2045443] Re: Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot
I realize now that we hit a second issue a bit earlier, on arm64 on Ubuntu 20.04, that resulted from the same patch. We got around that one by changing kernels, without looking deeply; at the time we declined to investigate.

So, I have a further question. I suspect this slipped through Ubuntu's QA process because it probably only uses qemu, which must work at least well enough with the patch. I don't know how the author (Jason) found so quickly that he had introduced a defect upstream, but the issue existed in upstream Linux only briefly, and was magnified by the Ubuntu release process.

A simple answer would be: I will run proposed kernels and write in if anything weird happens. But I imagine Ubuntu has a more involved test suite to qualify kernels. Can an alternative VMM somehow get into the testing matrix? It's also possible that Firecracker and crosvm would have suffered from this issue, as they all share Rust crates for some of the implementation.

Secondly, are there frequently refreshed packages of upstream kernels (both vanilla and Ubuntu-modified) in apt repositories that I could add to this testing matrix, to catch bugs as close to the source and as early as possible, including in upstream Linux? As an administrative matter, I could fairly easily test boot and some workload on kernels a, b, c, et cetera, if the packaging is there.

It's no doubt worth our while to check regularly that basic functionality works, but I don't think we can build up Ubuntu's full acceptance infrastructure, which would be the next level of sophistication. There may also be some intermediate methods, e.g. if there are some Ubuntu tests we can run in a VM, even if not all of them.

Let me know if this is the place to discuss this. It's obviously not *exactly* the right place... perhaps a mailing list? But maybe the answers are simple enough.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
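https://bugs.launchpad.net/bugs/2045443

Title: Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot

Status in linux package in Ubuntu: Fix Released
Status in linux source package in Jammy: Fix Committed

Bug description:

I use a non-qemu VMM, cloud-hypervisor. It looks like a patch was applied, that introduced a bug, a week later another patch got written to fix that bug, and that second patch was not applied in Ubuntu's release, but is seen in Greg KH's 5.15 branch.

The result of the bug is the kernel will not boot.

Cumulative diff:

```
> git diff Ubuntu-5.15.0-86.96 Ubuntu-5.15.0-89.99 drivers/net/virtio_net.c
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 0351f86494f1..af335f8266c2 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3319,6 +3319,8 @@ static int virtnet_probe(struct virtio_device *vdev)
 		}
 	}

+	_virtnet_set_queues(vi, vi->curr_queue_pairs);
+
 	/* serialize netdev register + virtio_device_ready() with ndo_open() */
 	rtnl_lock();

@@ -3339,8 +3341,6 @@ static int virtnet_probe(struct virtio_device *vdev)
 		goto free_unregister_netdev;
 	}

-	virtnet_set_queues(vi, vi->curr_queue_pairs);
-
 	/* Assume link up if device can't report link status,
 	   otherwise get link status from config. */
 	netif_carrier_off(dev);
```

Blamed Commit:

```
commit 5e0545ef5682562ffef072138d9340ea36a2ebc9
Author: Jason Wang
Date:   Tue Jul 25 03:20:49 2023 -0400

    virtio-net: fix race between set queues and probe

    BugLink: https://bugs.launchpad.net/bugs/2035400

    commit 25266128fe16d5632d43ada34c847d7b8daba539 upstream.

    A race were found where set_channels could be called after registering
    but before virtnet_set_queues() in virtnet_probe(). Fixing this by
    moving the virtnet_set_queues() before netdevice registering. While at
    it, use _virtnet_set_queues() to avoid holding rtnl as the device is
    not even registered at that time.

    Cc: sta...@vger.kernel.org
    Fixes: a220871be66f ("virtio-net: correctly enable multiqueue")
    Signed-off-by: Jason Wang
    Acked-by: Michael S. Tsirkin
    Reviewed-by: Xuan Zhuo
    Link: https://lore.kernel.org/r/20230725072049.617289-1-jasow...@redhat.com
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Kamal Mostafa
    Signed-off-by: Stefan Bader
```

Investigation into Greg KH's 5.15 branch shows the (unapplied?) followup as:

```
commit 431db3f48c286462ad7453ccdf284f590aafa949
Author: Jason Wang
Date:   Wed Aug 9 23:12:56 2023 -0400

    virtio-net: set queues after driver_ok

    commit 51b813176f098ff61bd2833f627f5319ead098a5 upstream.

    Commit 25266128fe16 ("virtio-net: fix race between set queues and
    probe") tries to fix the race between set queues and probe by calling
    _virtnet_set_queues() before DRIVER_OK is set. This violates virtio
    spec. Fixing this by setting queues after virtio_device_ready().

    Note that rtnl needs to be held for userspace requests to change the
    number of queues. So we are serialized in this way.

    Fixes: 25266128fe16 ("virtio-net: fix race between set queues and probe")
    Reported-by: Dragos Tatulea
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman
```

Boot stack trace:

```
[ 28.129660] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [systemd-udevd:165]
[ 28.130265] Modules linked in: crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd virtio_net(+) net_failover virtio_rng failover virtio_blk
[ 28.131396] CPU: 1 PID: 165 Comm: systemd-udevd Not tainted 5.15.0-89-generic #99-Ubuntu
[ 28.131997] Hardware name: Cloud Hypervisor cloud-hypervisor, BIOS 0
[ 28.132479] RIP: 0010:virtnet_send_command+0x10b/0x170 [virtio_net]
[ 28.132951] Code: 0b 83 c1 d8 85 c0 0f 88 d2 6e 00 00 48 8b 7b 08 e8 6a 72 c1 d8 84 c0 75 11 eb 56 48 8b 7b 08 e8 6b 5e c1 d8 84 c0 75 17 f3 90 <48> 8b 7b 08 48 8d b5 6c ff ff ff e8 f5 71 c1 d8 48 85 c0 74 dc 48
[ 28.134326] RSP: 0018:9b0c4064f9b8 EFLAGS: 0246
[ 28.134720] RAX: RBX: 89dfc0d13980 RCX: 0a20
[ 28.135252] RDX: RSI: 9b0c4064f9bc RDI: 89dfc7cc00c0
[ 28.135787] RBP: 9b0c4064fa50 R08: 0001 R09: 0003
[ 28.136316] R10: 0003 R11:
```
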
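The ordering problem described by the two commits can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyVirtioNetDevice`, `send_command` and `probe` are hypothetical names, not code from the kernel or from cloud-hypervisor, and the model assumes a device that does not process its control queue until DRIVER_OK is set. The spin limit stands in for the point at which the watchdog would report a soft lockup, which is consistent with the trace showing the CPU stuck in `virtnet_send_command`.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVirtioNetDevice:
    """Hypothetical device model, not cloud-hypervisor's real implementation."""
    driver_ok: bool = False
    pending: list = field(default_factory=list)

    def set_driver_ok(self):
        self.driver_ok = True

    def submit_ctrl_command(self, cmd):
        self.pending.append(cmd)

    def poll_completion(self):
        """Return a completed command, or None if nothing was processed."""
        # Assumption: this device ignores its queues before DRIVER_OK.
        if self.driver_ok and self.pending:
            return self.pending.pop(0)
        return None


def send_command(dev, cmd, max_spins=1000):
    """Submit a control command and busy-wait for its completion.

    The spin limit exists only so the sketch terminates; a driver that
    spins without a limit would never return in the failing case.
    """
    dev.submit_ctrl_command(cmd)
    for _ in range(max_spins):
        if dev.poll_completion() is not None:
            return True
    return False


def probe(set_queues_before_driver_ok):
    """Model the two orderings of 'set queues' relative to DRIVER_OK."""
    dev = ToyVirtioNetDevice()
    if set_queues_before_driver_ok:
        ok = send_command(dev, "set_queues")  # ordering after the blamed commit
        dev.set_driver_ok()
    else:
        dev.set_driver_ok()
        ok = send_command(dev, "set_queues")  # ordering after the follow-up
    return ok


print("set queues before DRIVER_OK completes:", probe(True))
print("set queues after DRIVER_OK completes:", probe(False))
```

A device that happens to service the control queue before DRIVER_OK would complete the command in both orderings in this model, which may be why the problem did not show up under every VMM.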
[Kernel-packages] [Bug 2045443] Re: Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot
Alright, I will do what I did, if it comes up again. I will try to augment our tests to include an apt upgrade with -proposed. Thank you.

[Kernel-packages] [Bug 2045443] Re: Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot
You can file a bug just like you did. If you happen to do a bit more debugging, then making sure it gets assigned to the correct package helps.

For the kernel, always just file a new bug, unless someone has the same issue as you and you can find the bug via Google or the all-bugs category of the kernel package on Launchpad.

For server packages like systemd, bind9, etc., it's best to file those issues against the package itself. You can review the changelog to see what changed lately and, if the latest change causes a regression, write a comment on the Launchpad bug for that change instead of filing a new report.

[Kernel-packages] [Bug 2045443] Re: Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot
Looking at Jason Wang's commits, it seems like it must work at least a little with qemu (because he committed it), but either 1) not *that* well, or 2) someone else told him about trying to run Firecracker or some other VMM that more closely assumes the specification in this area, since he modified it a couple of weeks later.

Indeed, we could install the new kernel and give it a try; it's arguably just a bit more complicated than "try this image URL" on a loop. We will probably do something like that.

What's the best way to file a bug if we spot something while doing that? The same way I did this one, or some other way? I was somehow expecting to fill in a bit more optional structured metadata when I filed (version, distro, etc.).

[Kernel-packages] [Bug 2045443] Re: Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot
Great, so the fix works as intended. 5.15.0-91-generic should hopefully be released to -updates this week.

I don't think the CPC team builds any cloud images with -proposed enabled. I asked around, and I'll let you know if there do happen to be some images built.

In the meantime, you could launch a normal image and add a cloud-init instruction to enable -proposed, do an apt upgrade, and then reboot; you can run your tests from there.

The kernel team do a lot of validation every cycle, so I am kind of surprised they didn't catch this bug.

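The cloud-init approach suggested above could be sketched as follows. This is a minimal sketch, not a tested recipe: `proposed_user_data` and the `proposed.list` entry name are hypothetical, the `apt`, `package_update`, `package_upgrade` and `power_state` keys are used as I understand cloud-init's configuration, and the archive line is taken from the install instructions elsewhere in this thread.

```python
def proposed_user_data(release: str) -> str:
    """Build cloud-init user-data that enables <release>-proposed,
    upgrades packages, and reboots once cloud-init has finished."""
    source = (
        f"deb http://archive.ubuntu.com/ubuntu/ {release}-proposed main universe"
    )
    lines = [
        "#cloud-config",
        "apt:",
        "  sources:",
        "    proposed.list:",  # hypothetical file name for the extra source
        f'      source: "{source}"',
        "package_update: true",
        "package_upgrade: true",
        "power_state:",
        "  mode: reboot",
    ]
    return "\n".join(lines) + "\n"


print(proposed_user_data("jammy"))
```

Whether a plain upgrade actually selects the kernel from -proposed may depend on the release's apt pinning, so it seems worth checking `uname -r` after the reboot before running any tests.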
[Kernel-packages] [Bug 2045443] Re: Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot
Indeed, it boots after following your instructions.

ubi@vm1fmdye:~$ uname -a
Linux vm1fmdye 5.15.0-91-generic #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

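A test harness could classify the booted kernel from its `uname -r` string. The sketch below is an assumption-laden illustration: `ubuntu_kernel_abi` and `classify` are hypothetical helpers, and the only facts encoded are the ones stated in this thread (5.15.0-89 fails to boot under cloud-hypervisor, the follow-up commit is contained in the Ubuntu-5.15.0-90.100 tag, and 5.15.0-91 was verified to boot). Everything else is reported as unknown.

```python
import re


def ubuntu_kernel_abi(release: str) -> tuple:
    """Parse a `uname -r` style string such as '5.15.0-91-generic'
    into the tuple (5, 15, 0, 91)."""
    m = re.match(r"^(\d+)\.(\d+)\.(\d+)-(\d+)(?:-|$)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(g) for g in m.groups())


def classify(release: str) -> str:
    """Classify a jammy 5.15 kernel using only what this thread states."""
    abi = ubuntu_kernel_abi(release)
    if abi[:3] != (5, 15, 0):
        return "unknown"      # other series are not discussed here
    if abi[3] == 89:
        return "known-bad"    # the release named in the bug title
    if abi[3] >= 90:
        return "has-fix"      # follow-up is in the Ubuntu-5.15.0-90.100 tag
    return "unknown"


for r in ("5.15.0-89-generic", "5.15.0-91-generic", "5.15.0-86-generic"):
    print(r, classify(r))
```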
[Kernel-packages] [Bug 2045443] Re: Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot
We can check it out. I have a question, though: is there a way to get a pre-captured image with -proposed in it? We could integrate this into a CI system to do more monitoring.

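For CI monitoring of this kind, one possible check is to scan the guest's console log for the soft-lockup message seen in the boot stack trace. This is a minimal sketch: `find_soft_lockups` is a hypothetical helper, and the pattern is written against the single watchdog line quoted in this bug, so other kernel versions may format the message differently.

```python
import re

# Pattern derived from the watchdog line quoted in the bug description.
SOFT_LOCKUP = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! "
    r"\[([^\]:]+):(\d+)\]"
)


def find_soft_lockups(console_log: str) -> list:
    """Return one dict per soft-lockup report found in the console log."""
    return [
        {
            "cpu": int(m.group(1)),
            "seconds": int(m.group(2)),
            "comm": m.group(3),
            "pid": int(m.group(4)),
        }
        for m in SOFT_LOCKUP.finditer(console_log)
    ]


sample = (
    "[ 28.129660] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! "
    "[systemd-udevd:165]\n"
)
print(find_soft_lockups(sample))
```

A boot test would probably also want a timeout on reaching ssh, since a guest that hangs before printing anything would pass a log scan like this one.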
[Kernel-packages] [Bug 2045443] Re: Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot
Hi Daniel,

Thanks for reporting. I had a look into the followup commit you mentioned:

    ~/Work/kernel/ubuntu-jammy$ git log --grep "virtio-net: set queues after driver_ok" origin/master-next
    commit c6c83b9055f44bcb2bc2fae32323c0a1510c7656
    Author: Jason Wang
    Date: Wed Aug 9 23:12:56 2023 -0400

    virtio-net: set queues after driver_ok

    BugLink: https://bugs.launchpad.net/bugs/2038486

    commit 51b813176f098ff61bd2833f627f5319ead098a5 upstream.

    Commit 25266128fe16 ("virtio-net: fix race between set queues and probe") tries to fix the race between set queues and probe by calling _virtnet_set_queues() before DRIVER_OK is set. This violates virtio spec.

    Fixing this by setting queues after virtio_device_ready().

    Note that rtnl needs to be held for userspace requests to change the number of queues. So we are serialized in this way.

    Fixes: 25266128fe16 ("virtio-net: fix race between set queues and probe")
    Reported-by: Dragos Tatulea
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Kamal Mostafa
    Signed-off-by: Stefan Bader

So, "virtio-net: set queues after driver_ok" is applied, and is tagged for:

    ~/Work/kernel/ubuntu-jammy$ git describe --contains c6c83b9055f44bcb2bc2fae32323c0a1510c7656
    Ubuntu-5.15.0-90.100~161

Looking at 5.15.0-91-generic, it seems to include all patches in 5.15.0-90-generic, with a single revert to fix a USB regression.

5.15.0-91-generic is currently in jammy -proposed, and I believe we are looking at a release this coming week, as per https://kernel.ubuntu.com/.
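The `git describe --contains` output above names the first tag that contains the fix; the `~161` suffix says how many commits back from that tag the fix sits. Below is a minimal shell sketch of turning that string into a kernel release and comparing it against a running kernel. The helper names `tag_of`, `release_of`, and `release_at_least` are hypothetical, and the comparison assumes GNU `sort -V` is available.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from the bug thread.

# Strip the "~N" / "^N" ancestry suffix that `git describe --contains` appends.
tag_of() {
    printf '%s\n' "${1%%[~^]*}"
}

# Turn an Ubuntu kernel tag such as "Ubuntu-5.15.0-90.100" into the
# release string that `uname -r` reports, minus the flavour: "5.15.0-90".
release_of() {
    tag=${1#Ubuntu-}
    printf '%s\n' "${tag%.*}"
}

# Succeed when release $1 is the same as or newer than release $2.
# Assumes GNU sort with version ordering (-V).
release_at_least() {
    a=${1%-generic}
    b=${2%-generic}
    [ "$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)" = "$b" ]
}

fixed=$(release_of "$(tag_of 'Ubuntu-5.15.0-90.100~161')")
for running in 5.15.0-89-generic 5.15.0-91-generic; do
    if release_at_least "$running" "$fixed"; then
        echo "$running: at or after $fixed, the first release tagged with the fix"
    else
        echo "$running: older than $fixed, the first release tagged with the fix"
    fi
done
```

On a live system, the output of `uname -r` could be passed as the first argument to `release_at_least` in place of the hard-coded strings.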
If you would like to try it now and verify that it does fix the issue, start an older cloud image that has an older kernel so it boots correctly, then enable -proposed and install the 5.15.0-91-generic kernel:

Instructions to install (On a Jammy system):

1) cat << EOF | sudo tee /etc/apt/sources.list.d/ubuntu-$(lsb_release -cs)-proposed.list
# Enable Ubuntu proposed archive
deb http://archive.ubuntu.com/ubuntu/ $(lsb_release -cs)-proposed main universe
EOF
2) sudo apt update
3) sudo apt install linux-image-5.15.0-91-generic linux-modules-5.15.0-91-generic linux-modules-extra-5.15.0-91-generic linux-headers-5.15.0-91-generic
4) sudo reboot
5) uname -rv
5.15.0-91-generic #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023

The instance should reboot properly and be available to ssh into.

This should be fixed in a couple of days once 5.15.0-91-generic is released and built into new cloud-images.

Thanks,
Matthew

** Tags added: seg

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2045443

Title:
  Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Jammy:
  Fix Committed

Bug description:
  I use a non-qemu VMM, cloud-hypervisor. It looks like a patch was applied, that introduced a bug, a week later another patch got written to fix that bug, and that second patch was not applied in Ubuntu's release, but is seen in Greg KH's 5.15 branch. The result of the bug is the kernel will not boot.
[Kernel-packages] [Bug 2045443] Re: Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot
** Also affects: linux (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu)
   Status: New => Fix Released

** Changed in: linux (Ubuntu Jammy)
   Status: New => Fix Committed

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2045443

Title:
  Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Jammy:
  Fix Committed

Bug description:
  I use a non-qemu VMM, cloud-hypervisor. It looks like a patch was applied, that introduced a bug, a week later another patch got written to fix that bug, and that second patch was not applied in Ubuntu's release, but is seen in Greg KH's 5.15 branch. The result of the bug is the kernel will not boot.
[Kernel-packages] [Bug 2045443] Re: Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot
It looks like the -proposed kernel has picked up that missing patch:

    ~/c/linux ((Ubuntu-5.15.0-91.101))> git log --grep='virtio-net: set queues after driver_ok'
    commit c6c83b9055f44bcb2bc2fae32323c0a1510c7656
    Author: Jason Wang
    Date: Wed Aug 9 23:12:56 2023 -0400

    virtio-net: set queues after driver_ok

    BugLink: https://bugs.launchpad.net/bugs/2038486

    commit 51b813176f098ff61bd2833f627f5319ead098a5 upstream.

    Commit 25266128fe16 ("virtio-net: fix race between set queues and probe") tries to fix the race between set queues and probe by calling _virtnet_set_queues() before DRIVER_OK is set. This violates virtio spec.

    Fixing this by setting queues after virtio_device_ready().

    Note that rtnl needs to be held for userspace requests to change the number of queues. So we are serialized in this way.

    Fixes: 25266128fe16 ("virtio-net: fix race between set queues and probe")
    Reported-by: Dragos Tatulea
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Kamal Mostafa
    Signed-off-by: Stefan Bader

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2045443

Title:
  Ubuntu-5.15.0-89.99 breaks virtio-net spec and doesn't boot

Status in linux package in Ubuntu:
  New

Bug description:
  I use a non-qemu VMM, cloud-hypervisor. It looks like a patch was applied, that introduced a bug, a week later another patch got written to fix that bug, and that second patch was not applied in Ubuntu's release, but is seen in Greg KH's 5.15 branch. The result of the bug is the kernel will not boot.