[Kernel-packages] [Bug 1989521] Re: Ubuntu 22.04.1 CPU soft lockup occurs repeatedly

2022-10-05 Thread Marcus Yanello
For me it started after installing GNOME on my Ubuntu server. My
suspicion is that the desktop enables suspend mode, which causes
problems on a machine used as a server, since server programs must not
be suspended.
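
One way to test the suspend suspicion is to check whether the kernel
log records any suspend cycles around the time of the lockups. The
following is a minimal sketch, not a verified diagnosis. It assumes the
kernel marks suspend and resume with "PM: suspend entry" and "PM:
suspend exit" lines, and the function name is hypothetical.

```python
import re

# Assumption: the kernel logs "PM: suspend entry (...)" when it starts
# suspending and "PM: suspend exit" when it resumes.
SUSPEND_RE = re.compile(r"PM: suspend (entry|exit)")


def find_suspend_events(lines):
    """Return (event, line) pairs for suspend entry/exit messages."""
    events = []
    for line in lines:
        match = SUSPEND_RE.search(line)
        if match:
            # match.group(1) is either "entry" or "exit".
            events.append((match.group(1), line.rstrip("\n")))
    return events
```

It could be run over the syslog file, for example
find_suspend_events(open("/var/log/syslog", errors="replace")). An
empty result would suggest the machine never actually suspended.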

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1989521

Title:
  Ubuntu 22.04.1 CPU soft lockup occurs repeatedly

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Hi all,

  Ubuntu Server 22.04.1 freezes repeatedly with CPU soft lockups. The
  issue seems to have started in the last week; all packages are up to
  date. I've updated to the HWE kernel and rebooted several times, and
  it still happens. Hardware: 32G RAM, AMD 3600X CPU, Quadro RTX 4000
  GPU.

  I caught the following in syslog :

  
  Sep 13 04:17:55 marcus-server kernel: [33687.436241] watchdog: BUG: soft 
lockup - CPU#2 stuck for 26s! [kworker/u64:17:154214]
  Sep 13 04:17:55 marcus-server kernel: [33687.436243] Modules linked in: tls 
xt_nat veth nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype br_netfilter 
xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp 
nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 
nft_counter nf_tables nfnetlink overlay bridge stp llc nvidia_drm(PO) 
snd_hda_codec_realtek intel_rapl_msr intel_rapl_common nvidia_modeset(PO) 
snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel 
snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core snd_hwdep 
snd_pcm snd_seq_midi snd_seq_midi_event zfs(PO) edac_mce_amd nls_iso8859_1 
snd_rawmidi kvm_amd zunicode(PO) nvidia(PO) snd_seq kvm zzstd(O) zlua(O) 
zavl(PO) snd_seq_device icp(PO) rapl wmi_bmof snd_timer zcommon(PO) k10temp ccp 
ucsi_ccg znvpair(PO) snd typec_ucsi typec spl(O) soundcore apex(OE) gasket(OE) 
mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua nct6775 
hwmon_vid ipmi_devintf ipmi_msghandler msr parport_pc ppdev lp
  Sep 13 04:17:55 marcus-server kernel: [33687.436279]  parport ramoops 
reed_solomon pstore_blk pstore_zone mtd efi_pstore ip_tables x_tables autofs4 
btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov 
async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 
multipath linear nouveau mxm_wmi drm_ttm_helper ttm drm_kms_helper syscopyarea 
sysfillrect sysimgblt fb_sys_fops cec rc_core crct10dif_pclmul crc32_pclmul 
ghash_clmulni_intel drm aesni_intel video crypto_simd igb cryptd xhci_pci ahci 
dca i2c_piix4 i2c_nvidia_gpu arcmsr libahci xhci_pci_renesas i2c_algo_bit wmi
  Sep 13 04:17:55 marcus-server kernel: [33687.436302] CPU: 2 PID: 154214 Comm: 
kworker/u64:17 Tainted: P   OE 5.15.0-47-generic #51-Ubuntu
  Sep 13 04:17:55 marcus-server kernel: [33687.436303] Hardware name: To Be 
Filled By O.E.M. To Be Filled By O.E.M./X570 Phantom Gaming 4, BIOS P4.20 
08/02/2021
  Sep 13 04:17:55 marcus-server kernel: [33687.436305] Workqueue: 
events_unbound async_run_entry_fn
  Sep 13 04:17:55 marcus-server kernel: [33687.436308] RIP: 
0010:arcmsr_wait_firmware_ready+0xc1/0x140 [arcmsr]
  Sep 13 04:17:55 marcus-server kernel: [33687.436312] Code: e3 49 8b 94 24 48 
08 00 00 b8 10 00 00 00 89 02 5b 41 5c 5d e9 b0 7b db e8 48 8b 47 50 4c 8d a0 
bc 00 00 00 eb 0c 41 8b 04 24 <85> c0 0f 88 64 ff ff ff f6 83 81 00 00 00 01 75 
eb bf 14 00 00 00
  Sep 13 04:17:55 marcus-server kernel: [33687.436313] RSP: 
0018:ade8d136fd10 EFLAGS: 0202
  Sep 13 04:17:55 marcus-server kernel: [33687.436314] RAX:  
RBX: 96720a460870 RCX: ade8c12b0034
  Sep 13 04:17:55 marcus-server kernel: [33687.436315] RDX: 000d 
RSI: 96721b53ef80 RDI: 96720a460870
  Sep 13 04:17:55 marcus-server kernel: [33687.436315] RBP: ade8d136fd20 
R08:  R09: 
  Sep 13 04:17:55 marcus-server kernel: [33687.436316] R10: 0284 
R11:  R12: ade8c12b00bc
  Sep 13 04:17:55 marcus-server kernel: [33687.436317] R13: 000d 
R14: 96720a46 R15: 96720a460870
  Sep 13 04:17:55 marcus-server kernel: [33687.436318] FS:  
() GS:96791ea8() knlGS:
  Sep 13 04:17:55 marcus-server kernel: [33687.436319] CS:  0010 DS:  ES: 
 CR0: 80050033
  Sep 13 04:17:55 marcus-server kernel: [33687.436320] CR2:  
CR3: 0007c8c1 CR4: 00350ee0

  
  It happens pretty often too, but the system isn't overloaded, so I'm not sure 
what is causing it. 
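
A possible starting point for narrowing down the cause is the RIP line
of each trace, which names the function the CPU was executing when the
watchdog fired. In the trace above that is arcmsr_wait_firmware_ready
in the arcmsr module. The sketch below pulls that symbol out of log
lines. It assumes each log entry has been rejoined onto a single line
(the mail wrapping undone), and the helper name is hypothetical.

```python
import re

# Matches the symbol part of a kernel "RIP:" line, for example
# "RIP: 0010:arcmsr_wait_firmware_ready+0xc1/0x140 [arcmsr]".
RIP_RE = re.compile(
    r"RIP:\s+[0-9a-f]{4}:(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def rip_symbols(lines):
    """Yield (function, module) per RIP line; module is None if absent."""
    for line in lines:
        match = RIP_RE.search(line)
        if match:
            yield match.group("func"), match.group("module")


# The RIP entry from the trace above, rejoined onto one line.
line = ("Sep 13 04:17:55 marcus-server kernel: [33687.436308] RIP: "
        "0010:arcmsr_wait_firmware_ready+0xc1/0x140 [arcmsr]")
print(list(rip_symbols([line])))  # [('arcmsr_wait_firmware_ready', 'arcmsr')]
```

If the same function shows up in every lockup report, that points at
the owning driver as the place to look first, though it does not by
itself prove the driver is at fault.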

  Message from syslogd@marcus-server at Sep 14 02:16:11 ...
   kernel:[ 1276.914096] watchdog: BUG: soft lockup - CPU#8 stuck for 26s! 
[kworker/u64:28:252938]

  Message from syslogd@marcus-server at Sep 14 02:16:11 ...
   kernel:[ 1304.913956] watchdog: BUG: soft lockup - CPU#8 stuck for 52s! 
[kworker/u64:28:252938]
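
The two messages above name the same CPU and the same worker, with the
reported duration growing from 26s to 52s, which suggests one ongoing
stall rather than two separate events. A minimal sketch for collapsing
such repeats, assuming each report has been rejoined onto a single line
(the function name is hypothetical):

```python
import re
from collections import defaultdict

# Matches e.g. "soft lockup - CPU#8 stuck for 26s! [kworker/u64:28:252938]".
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>[^\]]+)\]"
)


def longest_lockups(lines):
    """Map (cpu, task) to the longest 'stuck for' duration seen, in seconds."""
    longest = defaultdict(int)
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            key = (int(match.group("cpu")), match.group("task"))
            longest[key] = max(longest[key], int(match.group("secs")))
    return dict(longest)


# The two reports quoted above, each rejoined onto one line.
lines = [
    "kernel:[ 1276.914096] watchdog: BUG: soft lockup - CPU#8 stuck for 26s! "
    "[kworker/u64:28:252938]",
    "kernel:[ 1304.913956] watchdog: BUG: soft lockup - CPU#8 stuck for 52s! "
    "[kworker/u64:28:252938]",
]
print(longest_lockups(lines))  # {(8, 'kworker/u64:28:252938'): 52}
```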

  Message from syslogd@marcus-server at Sep 14 02:37:47

[Kernel-packages] [Bug 1989521] Re: Ubuntu 22.04.1 CPU soft lockup occurs repeatedly

2022-09-23 Thread Marcus Yanello
Following this:
https://askubuntu.com/questions/1264859/watchdog-bug-soft-lockup-cpu6-stuck-for-23s
I attempted a BIOS update and verified that my swap was working
properly. I also reseated my GPU, and still no luck.
