Hi David,
Thanks for the link. I think that is the most plausible explanation I have
seen so far.
The only problem is, if we look at the patch:
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 7a3ab3427369..24001112c323 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -686,7 +686,6 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
if (tun)
xdp_rxq_info_unreg(&tfile->xdp_rxq);
ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
- sock_put(&tfile->sk);
}
}
@@ -702,6 +701,9 @@ static void tun_detach(struct tun_file *tfile, bool clean)
if (dev)
netdev_state_change(dev);
rtnl_unlock();
+
+ if (clean)
+ sock_put(&tfile->sk);
}
static void tun_detach_all(struct net_device *dev)
It moves the final sock_put(&tfile->sk) from the end of __tun_detach()
to tun_detach(), after the call to netdev_state_change(dev).
685 static void __tun_detach(struct tun_file *tfile, bool clean)
686 {
...
725 if (clean) {
726 if (tun && tun->numqueues == 0 && tun->numdisabled == 0) {
727 netif_carrier_off(tun->dev);
728
729 if (!(tun->flags & IFF_PERSIST) &&
730 tun->dev->reg_state == NETREG_REGISTERED)
731 unregister_netdevice(tun->dev);
732 }
733 if (tun)
734 xdp_rxq_info_unreg(&tfile->xdp_rxq);
735 ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
736 sock_put(&tfile->sk);
737 }
738 }
739
740 static void tun_detach(struct tun_file *tfile, bool clean)
741 {
742 struct tun_struct *tun;
743 struct net_device *dev;
744
745 rtnl_lock();
746 tun = rtnl_dereference(tfile->tun);
747 dev = tun ? tun->dev : NULL;
748 __tun_detach(tfile, clean);
749 if (dev)
750 netdev_state_change(dev);
751 rtnl_unlock();
752 }
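For what it's worth, here is a minimal user-space sketch of what the
reordering changes. It is not kernel code: model_tfile, model_sock_put() and
model_touch() are hypothetical names, and it assumes that the step following
__tun_detach() still depends on something the last reference keeps alive.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the object behind tfile->sk. */
struct model_tfile {
	int refcnt;             /* models sk->sk_refcnt */
	bool freed;             /* set instead of really freeing memory */
	int touched_after_free; /* accesses seen after the last put */
};

/* Same shape as sock_put(): decrement, "free" when the count hits zero. */
static void model_sock_put(struct model_tfile *t)
{
	if (--t->refcnt == 0)
		t->freed = true;
}

/* Assumed stand-in for a later step that still needs the object. */
static void model_touch(struct model_tfile *t)
{
	if (t->freed)
		t->touched_after_free++;
}

/* Ordering before the patch: final put first, notifier-like step after. */
static int model_detach_before_patch(void)
{
	struct model_tfile t = { .refcnt = 1 };

	model_sock_put(&t);
	model_touch(&t);
	return t.touched_after_free;
}

/* Ordering after the patch: notifier-like step first, final put last. */
static int model_detach_after_patch(void)
{
	struct model_tfile t = { .refcnt = 1 };

	model_touch(&t);
	model_sock_put(&t);
	return t.touched_after_free;
}
```

In this model only the pre-patch ordering records an access after the final
put, which is the window the patch is meant to close.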
This more or less makes sense, but if you look at the call trace in the bug:
...
[455151.894444] notifier_call_chain+0x55/0x80
...
[455151.895239] unregister_netdevice_queue+0x94/0x120
[455151.895383] __tun_detach+0x421/0x430
...
$ eu-addr2line -ifae ./vmlinux-5.4.0-88-generic __tun_detach+0x421
0xffffffff8178b991
unregister_netdevice inlined at
/build/linux-q2DMsi/linux-5.4.0/drivers/net/tun.c:731:5 in __tun_detach
/build/linux-q2DMsi/linux-5.4.0/include/linux/netdevice.h:2677:1
__tun_detach
/build/linux-q2DMsi/linux-5.4.0/drivers/net/tun.c:731:5
We get to notifier_call_chain() not from netdev_state_change(), as
mentioned in the bug report, but from unregister_netdevice() on line 731.
This means we haven't yet run sock_put(&tfile->sk) on line 736.
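As an aside, the symbol+offset/size notation in the trace can be unpicked
mechanically. A small sketch in user-space C, where trace_frame,
parse_trace_frame(), frame_in_function() and function_start() are names made
up for illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One call-trace entry of the form symbol+0xOFFSET/0xSIZE. */
struct trace_frame {
	char symbol[64];
	unsigned long offset; /* bytes into the function */
	unsigned long size;   /* total size of the function */
};

/* Returns 0 on success, -1 if the text is not in the expected form. */
static int parse_trace_frame(const char *text, struct trace_frame *out)
{
	if (sscanf(text, "%63[^+]+%lx/%lx",
		   out->symbol, &out->offset, &out->size) != 3)
		return -1;
	return 0;
}

/* True if the frame belongs to the named function. */
static int frame_in_function(const struct trace_frame *f, const char *name)
{
	return strcmp(f->symbol, name) == 0;
}

/* Start address of the function, given the resolved frame address. */
static unsigned long long function_start(unsigned long long addr,
					 unsigned long offset)
{
	return addr - offset;
}
```

Feeding it "__tun_detach+0x421/0x430" gives offset 0x421 into a function of
size 0x430, so the return address sits very close to the end of
__tun_detach(), which fits the unregister_netdevice() call on line 731.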
Puzzling, isn't it? There are calls to sock_put(&tfile->sk) earlier in
__tun_detach(); maybe one of them already freed the socket, which would
explain the behaviour.
But then, when we run sock_put(&tfile->sk) again, wouldn't we run into
use-after-free territory when we try to free the socket again?
1735 /* Ungrab socket and destroy it, if it was the last reference. */
1736 static inline void sock_put(struct sock *sk)
1737 {
1738 if (refcount_dec_and_test(&sk->sk_refcnt))
1739 sk_free(sk);
1740 }
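To make the question concrete, here is a toy model of the same
decrement-and-test shape. demo_sock and demo_sock_put() are hypothetical; the
freed flag stands in for sk_free() so that an extra put can be observed
instead of corrupting memory. It says nothing about which holds and puts
tun.c actually pairs up.

```c
#include <assert.h>
#include <stdbool.h>

enum put_result {
	PUT_HELD,       /* other references remain */
	PUT_FREED,      /* this put dropped the last reference */
	PUT_AFTER_FREE  /* put on an object that was already freed */
};

/* Hypothetical user-space stand-in for struct sock. */
struct demo_sock {
	int refcnt;
	bool freed;
};

/* Same shape as sock_put(), plus a check for a put after the free. */
static enum put_result demo_sock_put(struct demo_sock *sk)
{
	if (sk->freed)
		return PUT_AFTER_FREE; /* in the kernel: touches freed memory */

	if (--sk->refcnt == 0) {
		sk->freed = true;      /* stands in for sk_free(sk) */
		return PUT_FREED;
	}
	return PUT_HELD;
}
```

If an earlier put had really dropped the last reference, the put on line 736
would land in the PUT_AFTER_FREE case. If the earlier puts are each balanced
by a hold, they return PUT_HELD and only the final one frees.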
I have a second call trace that I have been debugging along with the one
in the description. I'll add it in the next comment.
I'll keep looking into the patch anyway. I have been running the
syzkaller reproducer in a VM for the last few hours, but I haven't
reproduced the crash yet.
https://syzkaller.appspot.com/bug?id=96eb7f1ce75ef933697f24eeab928c4a716edefe
https://groups.google.com/g/syzkaller-bugs/c/C0r0nwrvBME/m/MxQ5Z7_VBAAJ
https://syzkaller.appspot.com/x/repro.c?x=11bd3a10f00000
Thanks,
Matthew
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1962485
Title:
Kernel Crash [general protection fault: 0000 [#1] SMP NOPTI]
Status in linux package in Ubuntu:
Confirmed
Bug description:
Hi,
I am running the OpenStack Xena release on Ubuntu Focal. Today my compute
node crashed due to a kernel fault, and a dump was generated in
/var/crash/. Below is the kernel trace from the crash dump.
[455151.890114] general protection fault: 0000 [#1] SMP NOPTI
[455151.890285] CPU: 43 PID: 83232 Comm: qemu-system-x86 Kdump: loaded
Tainted: G OE 5.4.0-88-generic #99-Ubuntu
[455151.890612] Hardware name: Dell Inc. PowerEdge R6525/XXXXX, BIOS 2.5.6
10/06/2021
[455151.890842] RIP: 0010:count_subheaders.part.0+0x26/0x60
[455151.890998] Code: 00 00 00 90 0f 1f 44 00 00 48 83 3f 00 74 4d 55 48 89
e5 41 55 45 31 ed 41 54 45 31 e4 53 48 89 fb 48 8b 7b 18 48 85 ff 74 23 <48> 83
3f 00 74 25 e8 cf ff ff ff 41 01 c5 48 83 c3 40 48 83 3b 00
[455151.891552] RSP: 0018:ffffa6b477487b88 EFLAGS: 00010202
[455151.891707] RAX: 0000000000000000 RBX: ffff9387c594f280 RCX:
0000000000000000
[455151.891918] RDX: 0000000000000060 RSI: ffff9390702a72c0 RDI:
0314a8c0f1b16f3e
[455151.892130] RBP: ffffa6b477487ba0 R08: 0000000000000000 R09:
ffffffffbc6ed7f0
[455151.892341] R10: ffffa6b477487cd0 R11: 0000000000000001 R12:
0000000000000000
[455151.892552] R13: 0000000000000000 R14: ffff9391e5684000 R15:
ffffffffbd5f9880
[455151.892767] FS: 00007f69950c75c0(0000) GS:ffff9391feac0000(0000)
knlGS:0000000000000000
[455151.893016] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[455151.893207] CR2: 00007f61e9e45000 CR3: 0000017c54afa000 CR4:
0000000000340ee0
[455151.893434] Call Trace:
[455151.893514] count_subheaders.part.0+0x31/0x60
[455151.893646] unregister_sysctl_table+0x30/0x90
[455151.893781] unregister_net_sysctl_table+0xe/0x10
[455151.893922] __devinet_sysctl_unregister.isra.0+0x2c/0x60
[455151.894082] devinet_sysctl_unregister+0x29/0x40
[455151.894220] inetdev_event+0x1e8/0x560
[455151.894334] ? skb_dequeue+0x5f/0x70
[455151.894444] notifier_call_chain+0x55/0x80
[455151.894565] ? notifier_call_chain+0x55/0x80
[455151.894693] raw_notifier_call_chain+0x16/0x20
[455151.894829] call_netdevice_notifiers_info+0x2e/0x60
[455151.894983] ? tun_show_owner+0x60/0x60
[455151.895098] rollback_registered_many+0x36e/0x520
[455151.895239] unregister_netdevice_queue+0x94/0x120
[455151.895383] __tun_detach+0x421/0x430
[455151.895495] tun_chr_close+0x3a/0x70
[455151.895605] __fput+0xcc/0x260
[455151.895698] ____fput+0xe/0x10
[455151.895792] task_work_run+0x8f/0xb0
[455151.895903] exit_to_usermode_loop+0x131/0x160
[455151.896036] do_syscall_64+0x163/0x190
[455151.896150] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[455151.896302] RIP: 0033:0x7f69965ba3fb
[455151.896410] Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec
18 89 7c 24 0c e8 f3 fb ff ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d
00 f0 ff ff 77 2f 44 89 c7 89 44 24 0c e8 31 fc ff ff 8b 44
[455151.896975] RSP: 002b:00007ffdff14b350 EFLAGS: 00000293 ORIG_RAX:
0000000000000003
[455151.897201] RAX: 0000000000000000 RBX: 0000557fe0875e50 RCX:
00007f69965ba3fb
[455151.897412] RDX: 0000557fe0748f40 RSI: 0000000000000001 RDI:
000000000000002b
[455151.897637] RBP: 0000557fe0887460 R08: 0000000000000000 R09:
0000000000000000
[455151.904390] R10: 0000000000000032 R11: 0000000000000293 R12:
0000557fe0875e50
[455151.911165] R13: 0000000000000001 R14: 0000557fe09efc10 R15:
0000557fe0747900
I didn't find any documented details on kernel 5.4 for this bug. I
have uploaded the logs via the ubuntu-bug linux command.
# uname -a
Linux kvm03-a1-r01-khi04.rapid.pk 5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23
17:29:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
# cat /proc/version_signature
Ubuntu 5.4.0-88.99-generic 5.4.140
I am using Dell R6525 with EPYC 7532 CPUs.
Let me know if more information is needed.
ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: linux-image-5.4.0-88-generic 5.4.0-88.99
ProcVersionSignature: Ubuntu 5.4.0-88.99-generic 5.4.140
Uname: Linux 5.4.0-88-generic x86_64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Feb 28 17:38 seq
crw-rw---- 1 root audio 116, 33 Feb 28 17:38 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu27.20
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq',
'/dev/snd/timer'] failed with exit code 1:
CasperMD5CheckResult: pass
Date: Mon Feb 28 21:20:20 2022
InstallationDate: Installed on 2021-07-29 (214 days ago)
InstallationMedia: Ubuntu-Server 20.04.2 LTS "Focal Fossa" - Release amd64
(20210201.2)
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
MachineType: Dell Inc. PowerEdge R6525
PciMultimedia:
ProcEnviron:
TERM=xterm-256color
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB: 0 EFI VGA
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-88-generic
root=/dev/mapper/ubuntu--vg-ubuntu--lv ro iommu=pt intel_iommu=on swapaccount=1
vga=normal nofb nomodeset video=vesafb:off i915.modeset=0 crashkernel=512M
RelatedPackageVersions:
linux-restricted-modules-5.4.0-88-generic N/A
linux-backports-modules-5.4.0-88-generic N/A
linux-firmware 1.187.19
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 10/06/2021
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 2.5.6
dmi.board.name: 0GK70M
dmi.board.vendor: Dell Inc.
dmi.board.version: A10
dmi.chassis.type: 23
dmi.chassis.vendor: Dell Inc.
dmi.modalias:
dmi:bvnDellInc.:bvr2.5.6:bd10/06/2021:svnDellInc.:pnPowerEdgeR6525:pvr:rvnDellInc.:rn0GK70M:rvrA10:cvnDellInc.:ct23:cvr:
dmi.product.family: PowerEdge
dmi.product.name: PowerEdge R6525
dmi.product.sku: SKU=NotProvided;ModelName=PowerEdge R6525
dmi.sys.vendor: Dell Inc.
---
Package: linux (not installed)
Tags: focal uec-images
UserGroups: N/A
_MarkForUpload: True