[Kernel-packages] [Bug 1962485] Re: Kernel Crash [general protection fault: 0000 [#1] SMP NOPTI]

2023-02-16 Thread Matthew Ruffell
ovs-vsctl[51186]: ovs|1|vsctl|INFO|Called as ovs-vsctl --timeout=120 --oneline --format=json --db=tcp:127.0.0.1:6640 -- --if-exists del-port br-int tap8c883ee5-5f
kernel: device tap8c883ee5-5f left promiscuous mode
lldpd[2309]: removal request for address of fe80::fc16:3eff:fe07:2be2%27, but no knowledge of it
systemd-networkd[1608]: tap8c883ee5-5f: Link DOWN
systemd-networkd[1608]: tap8c883ee5-5f: Lost carrier
kernel: general protection fault: 0000 [#1] SMP NOPTI
kernel: CPU: 41 PID: 25064 Comm: privsep-helper Tainted: GW 5.4.0-81-generic #91~18.04.1-Ubuntu
kernel: Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 07/16/2020
kernel: RIP: 0010:count_subheaders.part.15+0x41/0x60
kernel: Code: 31 e4 53 48 89 fb 48 8b 7b 18 48 85 ff 75 1b 41 bc 01 00 00 00 48 83 c3 40 48 83 3b 00 75 e7 43 8d 04 2c 5b 41 5c 41 5d 5d c3 <48> 83 3f 00 b8 01 00 00 00 74 05 e8 af ff ff ff 41 01 c5 eb d6 31
kernel: RSP: 0018:a8fa4d0437a0 EFLAGS: 00010286
kernel: RAX: 0001 RBX: 98cfefac0800 RCX: 
kernel: RDX: 001b RSI: 98f85782cac0 RDI: 8c0a25048c0abfe4
kernel: RBP: a8fa4d0437b8 R08:  R09: 000a
kernel: R10: a8fa4d0438e8 R11: 00031220 R12: 
kernel: R13:  R14: 98cff12e R15: 98f85782ca00
kernel: FS:  7f7f3f9f9700() GS:98e05fb4() 
knlGS:
kernel: CS:  0010 DS:  ES:  CR0: 80050033
kernel: CR2: 0225b7f8 CR3: 00248bf1c006 CR4: 007626e0
kernel: DR0:  DR1:  DR2: 
kernel: DR3:  DR6: fffe0ff0 DR7: 0400
kernel: PKRU: 5554
kernel: Call Trace:
kernel:  count_subheaders.part.15+0x51/0x60
kernel:  unregister_sysctl_table+0x31/0xb0
kernel:  unregister_net_sysctl_table+0xe/0x10
kernel:  __devinet_sysctl_unregister.isra.25+0x2b/0x50
kernel:  devinet_sysctl_unregister+0x29/0x40
kernel:  inetdev_event+0x1f0/0x570
kernel:  ? skb_dequeue+0x60/0x70
kernel:  notifier_call_chain+0x4c/0x70
kernel:  ? notifier_call_chain+0x4c/0x70
kernel:  ? tun_show_group+0x60/0x60
kernel:  raw_notifier_call_chain+0x16/0x20
kernel:  call_netdevice_notifiers_info+0x2d/0x60
kernel:  rollback_registered_many+0x346/0x520
kernel:  ? mem_cgroup_throttle_swaprate+0x1d/0x140
kernel:  unregister_netdevice_many.part.127+0x12/0x90
kernel:  unregister_netdevice_many+0x16/0x20
kernel:  rtnl_delete_link+0x4e/0x80
kernel:  rtnl_dellink+0x12d/0x2b0
kernel:  ? __nla_parse+0x22/0x30
kernel:  ? rtnl_dump_ifinfo+0x360/0x5d0
kernel:  ? ns_capable+0x10/0x20
kernel:  rtnetlink_rcv_msg+0x296/0x340
kernel:  ? aa_label_sk_perm.part.4+0x10f/0x160
kernel:  ? _cond_resched+0x19/0x40
kernel:  ? rtnl_calcit.isra.30+0x120/0x120
kernel:  netlink_rcv_skb+0x51/0x120
kernel:  rtnetlink_rcv+0x15/0x20
kernel:  netlink_unicast+0x1a4/0x250
kernel:  netlink_sendmsg+0x2eb/0x3f0
kernel:  sock_sendmsg+0x63/0x70
kernel:  __sys_sendto+0x13f/0x180
kernel:  ? handle_mm_fault+0xcb/0x210
kernel:  ? __do_page_fault+0x2be/0x4d0
kernel:  __x64_sys_sendto+0x28/0x30
kernel:  do_syscall_64+0x57/0x190
kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1962485

Title:
  Kernel Crash [general protection fault: 0000 [#1] SMP NOPTI]

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Hi,

  I am running the OpenStack Xena release on Ubuntu Focal. Today my compute
  node running Ubuntu Focal crashed due to a kernel fault, and a dump was
  generated in /var/crash/. Below is the kernel trace from the crash dump.

  [455151.890114] general protection fault: 0000 [#1] SMP NOPTI
  [455151.890285] CPU: 43 PID: 83232 Comm: qemu-system-x86 Kdump: loaded Tainted: G   OE 5.4.0-88-generic #99-Ubuntu
  [455151.890612] Hardware name: Dell Inc. PowerEdge R6525/X, BIOS 2.5.6 10/06/2021
  [455151.890842] RIP: 0010:count_subheaders.part.0+0x26/0x60
  [455151.890998] Code: 00 00 00 90 0f 1f 44 00 00 48 83 3f 00 74 4d 55 48 89 e5 41 55 45 31 ed 41 54 45 31 e4 53 48 89 fb 48 8b 7b 18 48 85 ff 74 23 <48> 83 3f 00 74 25 e8 cf ff ff ff 41 01 c5 48 83 c3 40 48 83 3b 00
  [455151.891552] RSP: 0018:a6b477487b88 EFLAGS: 00010202
  [455151.891707] RAX:  RBX: 9387c594f280 RCX: 

  [455151.891918] RDX: 0060 RSI: 9390702a72c0 RDI: 
0314a8c0f1b16f3e
  [455151.892130] RBP: a6b477487ba0 R08:  R09: 
bc6ed7f0
  [455151.892341] R10: a6b477487cd0 R11: 0001 R12: 

  [455151.892552] R13:  R14: 9391e5684000 R15: 
bd5f9880
  [455151.892767] FS:  7f69950c75c0() GS:9391feac() 
knlGS:
  [455151.893016] CS:  0010 DS:  ES:  CR0: 80050033
  [455151.893207] CR2: 

[Kernel-packages] [Bug 1962485] Re: Kernel Crash [general protection fault: 0000 [#1] SMP NOPTI]

2023-02-16 Thread Matthew Ruffell
Hi David,

Thanks for the link, I think that is the most plausible explanation I have
seen so far.

The only problem is, if we look at the patch:

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 7a3ab3427369..24001112c323 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -686,7 +686,6 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
                if (tun)
                        xdp_rxq_info_unreg(&tfile->xdp_rxq);
                ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
-               sock_put(&tfile->sk);
        }
 }
 
@@ -702,6 +701,9 @@ static void tun_detach(struct tun_file *tfile, bool clean)
        if (dev)
                netdev_state_change(dev);
        rtnl_unlock();
+
+       if (clean)
+               sock_put(&tfile->sk);
 }
 
 static void tun_detach_all(struct net_device *dev)

It moves the final sock_put(&tfile->sk) from the end of __tun_detach()
to tun_detach(), after the call to netdev_state_change(dev).

 685 static void __tun_detach(struct tun_file *tfile, bool clean)
 686 {
...
 725         if (clean) {
 726                 if (tun && tun->numqueues == 0 && tun->numdisabled == 0) {
 727                         netif_carrier_off(tun->dev);
 728 
 729                         if (!(tun->flags & IFF_PERSIST) &&
 730                             tun->dev->reg_state == NETREG_REGISTERED)
 731                                 unregister_netdevice(tun->dev);
 732                 }
 733                 if (tun)
 734                         xdp_rxq_info_unreg(&tfile->xdp_rxq);
 735                 ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
 736                 sock_put(&tfile->sk);
 737         }
 738 }
 739 
 740 static void tun_detach(struct tun_file *tfile, bool clean)
 741 {
 742         struct tun_struct *tun;
 743         struct net_device *dev;
 744 
 745         rtnl_lock();
 746         tun = rtnl_dereference(tfile->tun);
 747         dev = tun ? tun->dev : NULL;
 748         __tun_detach(tfile, clean);
 749         if (dev)
 750                 netdev_state_change(dev);
 751         rtnl_unlock();
 752 }
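
The effect of that reordering can be sketched in user space. This is only a toy model, assuming (as the linked report suggests) that the notifier chain touches something kept alive by the reference that sock_put() drops. The names model_obj, model_put() and model_notify() are hypothetical and are not kernel APIs; the "freed" flag stands in for a real free so the sketch stays well-defined.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object kept alive by a reference count.
 * Illustration only; this is not a kernel type. */
struct model_obj {
        int refcnt;
        bool freed;
};

/* Drop one reference; mark the object freed when the last one goes. */
static void model_put(struct model_obj *o)
{
        assert(o->refcnt > 0);
        if (--o->refcnt == 0)
                o->freed = true;
}

/* Stand-in for a notifier callback: reports whether it touched a freed object. */
static bool model_notify(const struct model_obj *o)
{
        return o->freed;
}

/* Pre-patch ordering: drop the reference first, notify afterwards. */
static bool detach_put_then_notify(struct model_obj *o)
{
        model_put(o);
        return model_notify(o);
}

/* Post-patch ordering: notify first, drop the reference last. */
static bool detach_notify_then_put(struct model_obj *o)
{
        bool touched_freed = model_notify(o);

        model_put(o);
        return touched_freed;
}
```

With a single remaining reference, detach_put_then_notify() reports that the notifier saw a freed object, while detach_notify_then_put() does not, which is the ordering the patch establishes in tun_detach().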
 
This more or less makes sense, but if you look at the call trace in the bug:

...
[455151.89] notifier_call_chain+0x55/0x80
...
[455151.895239] unregister_netdevice_queue+0x94/0x120
[455151.895383] __tun_detach+0x421/0x430
...

$ eu-addr2line -ifae ./vmlinux-5.4.0-88-generic  __tun_detach+0x421
0x8178b991
unregister_netdevice inlined at /build/linux-q2DMsi/linux-5.4.0/drivers/net/tun.c:731:5 in __tun_detach
/build/linux-q2DMsi/linux-5.4.0/include/linux/netdevice.h:2677:1
__tun_detach
/build/linux-q2DMsi/linux-5.4.0/drivers/net/tun.c:731:5
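
For reference, each call trace entry has the form symbol+0xOFFSET/0xSIZE, and the symbol+offset part is what was passed to eu-addr2line above. A minimal sketch of splitting such an entry, assuming that exact textual form; the struct trace_entry type and parse_trace_entry() helper are hypothetical names made up for this illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one decoded call trace entry. */
struct trace_entry {
        char symbol[64];
        unsigned long offset;   /* offset of the return address into the function */
        unsigned long size;     /* total size of the function in bytes */
};

/* Parse "symbol+0xOFF/0xSIZE". Returns 0 on success, -1 on malformed input. */
static int parse_trace_entry(const char *text, struct trace_entry *out)
{
        assert(text != NULL && out != NULL);
        /* %63[^+ ] reads the symbol up to '+'; %lx accepts the 0x prefix. */
        if (sscanf(text, " %63[^+ ]+%lx/%lx",
                   out->symbol, &out->offset, &out->size) != 3)
                return -1;
        return 0;
}
```

Parsing "__tun_detach+0x421/0x430" yields offset 0x421 in a function of size 0x430, so the return address sits very close to the end of __tun_detach().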

We get to notifier_call_chain() not from netdev_state_change(), as
mentioned in the bug report, but from unregister_netdevice() at line 731.
This means we haven't yet run sock_put(&tfile->sk) at line 736.

Puzzling, isn't it? There are calls to sock_put(&tfile->sk) earlier in
__tun_detach(); maybe one of them already freed the socket, which would
explain the behaviour.

But then when we run sock_put(&tfile->sk) again, wouldn't we run into
use-after-free territory when we try to free the socket again?

/* Ungrab socket and destroy it, if it was the last reference. */
static inline void sock_put(struct sock *sk)
{
        if (refcount_dec_and_test(&sk->sk_refcnt))
                sk_free(sk);
}
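
The decrement-and-test pattern above can be modelled in user space to see what an extra put would do. This is a simplified sketch and not the kernel's refcount_t implementation; struct model_sock, model_dec_and_test() and model_sock_put() are hypothetical names, and the free is replaced by a counter so nothing is actually released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a refcounted socket. */
struct model_sock {
        int refcnt;
        int free_calls;   /* how many times the "free" path ran */
        bool underflow;   /* set when a put arrives with no references left */
};

/* Returns true only when this call dropped the last reference. */
static bool model_dec_and_test(struct model_sock *sk)
{
        assert(sk != NULL);
        if (sk->refcnt <= 0) {
                /* A put on an already-dead object: flag it, do not free again. */
                sk->underflow = true;
                return false;
        }
        return --sk->refcnt == 0;
}

/* Mirrors the shape of sock_put(): free only on the final reference. */
static void model_sock_put(struct model_sock *sk)
{
        if (model_dec_and_test(sk))
                sk->free_calls++;
}
```

Starting from two references, the first put frees nothing, the second runs the free path once, and a third put is flagged as an underflow. In this model, an earlier put only frees the object if it happened to be the last reference.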

I have a second call trace that I have been debugging alongside the one
in the description; I'll add it in the next comment.

I'll keep looking into the patch anyway. I have been running the
syzkaller reproducer in a VM for the last few hours, but I haven't
reproduced the crash yet.

https://syzkaller.appspot.com/bug?id=96eb7f1ce75ef933697f24eeab928c4a716edefe
https://groups.google.com/g/syzkaller-bugs/c/C0r0nwrvBME/m/MxQ5Z7_VBAAJ
https://syzkaller.appspot.com/x/repro.c?x=11bd3a10f0

Thanks,
Matthew


[Kernel-packages] [Bug 1962485] Re: Kernel Crash [general protection fault: 0000 [#1] SMP NOPTI]

2023-02-16 Thread David Hill
Maybe the same as
https://lore.kernel.org/lkml/CANn89iJxiV_-g6n60aeA=mO=DYwGV9VdJswHP4pc-vwq_ug...@mail.gmail.com/T/ ?

[Kernel-packages] [Bug 1962485] Re: Kernel Crash [general protection fault: 0000 [#1] SMP NOPTI]

2022-10-05 Thread Matthew Ruffell
** Tags added: sts

[Kernel-packages] [Bug 1962485] Re: Kernel Crash [general protection fault: 0000 [#1] SMP NOPTI]

2022-02-28 Thread Ammad Ali
apport information

** Tags added: apport-collected

** Description changed:

  Hi,
  
  I am running the OpenStack Xena release on Ubuntu Focal. Today my compute
  node running Ubuntu Focal crashed due to a kernel fault, and a dump was
  generated in /var/crash/. Below is the kernel trace from the crash dump.
  
  [455151.890114] general protection fault: 0000 [#1] SMP NOPTI
  [455151.890285] CPU: 43 PID: 83232 Comm: qemu-system-x86 Kdump: loaded Tainted: G   OE 5.4.0-88-generic #99-Ubuntu
  [455151.890612] Hardware name: Dell Inc. PowerEdge R6525/X, BIOS 2.5.6 10/06/2021
  [455151.890842] RIP: 0010:count_subheaders.part.0+0x26/0x60
  [455151.890998] Code: 00 00 00 90 0f 1f 44 00 00 48 83 3f 00 74 4d 55 48 89 e5 41 55 45 31 ed 41 54 45 31 e4 53 48 89 fb 48 8b 7b 18 48 85 ff 74 23 <48> 83 3f 00 74 25 e8 cf ff ff ff 41 01 c5 48 83 c3 40 48 83 3b 00
  [455151.891552] RSP: 0018:a6b477487b88 EFLAGS: 00010202
  [455151.891707] RAX:  RBX: 9387c594f280 RCX: 

  [455151.891918] RDX: 0060 RSI: 9390702a72c0 RDI: 
0314a8c0f1b16f3e
  [455151.892130] RBP: a6b477487ba0 R08:  R09: 
bc6ed7f0
  [455151.892341] R10: a6b477487cd0 R11: 0001 R12: 

  [455151.892552] R13:  R14: 9391e5684000 R15: 
bd5f9880
  [455151.892767] FS:  7f69950c75c0() GS:9391feac() 
knlGS:
  [455151.893016] CS:  0010 DS:  ES:  CR0: 80050033
  [455151.893207] CR2: 7f61e9e45000 CR3: 017c54afa000 CR4: 
00340ee0
  [455151.893434] Call Trace:
  [455151.893514]  count_subheaders.part.0+0x31/0x60
  [455151.893646]  unregister_sysctl_table+0x30/0x90
  [455151.893781]  unregister_net_sysctl_table+0xe/0x10
  [455151.893922]  __devinet_sysctl_unregister.isra.0+0x2c/0x60
  [455151.894082]  devinet_sysctl_unregister+0x29/0x40
  [455151.894220]  inetdev_event+0x1e8/0x560
  [455151.894334]  ? skb_dequeue+0x5f/0x70
  [455151.89]  notifier_call_chain+0x55/0x80
  [455151.894565]  ? notifier_call_chain+0x55/0x80
  [455151.894693]  raw_notifier_call_chain+0x16/0x20
  [455151.894829]  call_netdevice_notifiers_info+0x2e/0x60
  [455151.894983]  ? tun_show_owner+0x60/0x60
  [455151.895098]  rollback_registered_many+0x36e/0x520
  [455151.895239]  unregister_netdevice_queue+0x94/0x120
  [455151.895383]  __tun_detach+0x421/0x430
  [455151.895495]  tun_chr_close+0x3a/0x70
  [455151.895605]  __fput+0xcc/0x260
  [455151.895698]  fput+0xe/0x10
  [455151.895792]  task_work_run+0x8f/0xb0
  [455151.895903]  exit_to_usermode_loop+0x131/0x160
  [455151.896036]  do_syscall_64+0x163/0x190
  [455151.896150]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  [455151.896302] RIP: 0033:0x7f69965ba3fb
  [455151.896410] Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 f3 fb ff ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2f 44 89 c7 89 44 24 0c e8 31 fc ff ff 8b 44
  [455151.896975] RSP: 002b:7ffdff14b350 EFLAGS: 0293 ORIG_RAX: 
0003
  [455151.897201] RAX:  RBX: 557fe0875e50 RCX: 
7f69965ba3fb
  [455151.897412] RDX: 557fe0748f40 RSI: 0001 RDI: 
002b
  [455151.897637] RBP: 557fe0887460 R08:  R09: 

  [455151.904390] R10: 0032 R11: 0293 R12: 
557fe0875e50
  [455151.911165] R13: 0001 R14: 557fe09efc10 R15: 
557fe0747900
  
  I didn't find any documented details on kernel 5.4 for this bug. I have
  uploaded the logs via ubuntu-bug linux command.
  
  # uname -a
  Linux kvm03-a1-r01-khi04.rapid.pk 5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23 17:29:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
  
  # cat /proc/version_signature
  Ubuntu 5.4.0-88.99-generic 5.4.140
  
  I am using Dell R6525 with EPYC 7532 CPUs.
  
  Let me know if more information is needed.
  
  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-image-5.4.0-88-generic 5.4.0-88.99
  ProcVersionSignature: Ubuntu 5.4.0-88.99-generic 5.4.140
  Uname: Linux 5.4.0-88-generic x86_64
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Feb 28 17:38 seq
   crw-rw 1 root audio 116, 33 Feb 28 17:38 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.11-0ubuntu27.20
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CasperMD5CheckResult: pass
  Date: Mon Feb 28 21:20:20 2022
  InstallationDate: Installed on 2021-07-29 (214 days ago)
  InstallationMedia: Ubuntu-Server 20.04.2 LTS "Focal Fossa" - Release amd64 (20210201.2)
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  MachineType: Dell Inc. PowerEdge R6525
  PciMultimedia:
   
  ProcEnviron:
   TERM=xterm-256color