Public bug reported:

Description: Ubuntu 20.04.2 LTS
Release: 20.04

I have suddenly been seeing a number of crashes today on my Threadripper 2950X box, after the system had been off over the weekend. I suspect it may be tied to the Ubuntu 5.4.0-80.90-generic 5.4.124 kernel, as I wasn't seeing it last week or previously.

Aug 2 16:52:14 threadripper kernel: [ 600.168436] watchdog: BUG: soft lockup - CPU#19 stuck for 22s! [kworker/19:0:11301]
Aug 2 16:52:14 threadripper kernel: [ 600.168490] Modules linked in: veth xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat br_netfilter bridge stp llc aufs overlay nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi eeepc_wmi snd_hda_intel edac_mce_amd snd_intel_dspcfg asus_wmi ftdi_sio snd_hda_codec kvm_amd usbserial sparse_keymap snd_hda_core kvm video wmi_bmof snd_hwdep snd_pcm snd_timer snd ccp soundcore k10temp mac_hid nf_log_ipv6 ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common ipt_REJECT nf_reject_ipv4 xt_LOG xt_limit xt_addrtype sch_fq_codel xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter bpfilter ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid uas usb_storage amdgpu
Aug 2 16:52:14 threadripper kernel: [ 600.168542] amd_iommu_v2 gpu_sched crct10dif_pclmul ttm crc32_pclmul ghash_clmulni_intel drm_kms_helper syscopyarea aesni_intel crypto_simd mxm_wmi sysfillrect cryptd sysimgblt glue_helper fb_sys_fops igb drm dca i2c_piix4 ahci i2c_algo_bit libahci gpio_amdpt wmi gpio_generic
Aug 2 16:52:14 threadripper kernel: [ 600.168558] CPU: 19 PID: 11301 Comm: kworker/19:0 Tainted: G L 5.4.0-80-generic #90-Ubuntu
Aug 2 16:52:14 threadripper kernel: [ 600.168559] Hardware name: System manufacturer System Product Name/ROG STRIX X399-E GAMING, BIOS 1203 10/09/2019
Aug 2 16:52:14 threadripper kernel: [ 600.168569] Workqueue: events free_work
Aug 2 16:52:14 threadripper kernel: [ 600.168574] RIP: 0010:smp_call_function_many+0x205/0x270
Aug 2 16:52:14 threadripper kernel: [ 600.168576] Code: e8 50 10 92 00 3b 05 ae cf 70 01 89 c7 0f 83 9b fe ff ff 48 63 c7 48 8b 0b 48 03 0c c5 80 99 64 a1 8b 41 18 a8 01 74 0a f3 90 <8b> 51 18 83 e2 01 75 f6 eb c8 89 cf 48 c7 c2 a0 b8 a4 a1 4c 89 fe
Aug 2 16:52:14 threadripper kernel: [ 600.168577] RSP: 0018:ffffb66b0aa17d00 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
Aug 2 16:52:14 threadripper kernel: [ 600.168579] RAX: 0000000000000003 RBX: ffff8de1fd4ebd40 RCX: ffff8de1fd0b2540
Aug 2 16:52:14 threadripper kernel: [ 600.168580] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000002
Aug 2 16:52:14 threadripper kernel: [ 600.168580] RBP: ffffb66b0aa17d40 R08: ffff8de1f6da7190 R09: 0000000000000003
Aug 2 16:52:14 threadripper kernel: [ 600.168581] R10: ffff8de1f6da7190 R11: 0000000000000002 R12: ffffffffa0281930
Aug 2 16:52:14 threadripper kernel: [ 600.168581] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000080
Aug 2 16:52:14 threadripper kernel: [ 600.168583] FS: 0000000000000000(0000) GS:ffff8de1fd4c0000(0000) knlGS:0000000000000000
Aug 2 16:52:14 threadripper kernel: [ 600.168583] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 2 16:52:14 threadripper kernel: [ 600.168584] CR2: 000055ea29edefd0 CR3: 00000009c500a000 CR4: 00000000003406e0
Aug 2 16:52:14 threadripper kernel: [ 600.168585] Call Trace:
Aug 2 16:52:14 threadripper kernel: [ 600.168592] ? load_new_mm_cr3+0xf0/0xf0
Aug 2 16:52:14 threadripper kernel: [ 600.168594] on_each_cpu+0x2d/0x60
Aug 2 16:52:14 threadripper kernel: [ 600.168596] flush_tlb_kernel_range+0x38/0x90
Aug 2 16:52:14 threadripper kernel: [ 600.168597] __purge_vmap_area_lazy+0x70/0x6d0
Aug 2 16:52:14 threadripper kernel: [ 600.168598] free_vmap_area_noflush+0xe1/0xf0
Aug 2 16:52:14 threadripper kernel: [ 600.168600] remove_vm_area+0x9a/0xb0
Aug 2 16:52:14 threadripper kernel: [ 600.168602] __vunmap+0x5f/0x210
Aug 2 16:52:14 threadripper kernel: [ 600.168603] free_work+0x25/0x30
Aug 2 16:52:14 threadripper kernel: [ 600.168607] process_one_work+0x1eb/0x3b0
Aug 2 16:52:14 threadripper kernel: [ 600.168609] worker_thread+0x4d/0x400
Aug 2 16:52:14 threadripper kernel: [ 600.168611] kthread+0x104/0x140
Aug 2 16:52:14 threadripper kernel: [ 600.168612] ? process_one_work+0x3b0/0x3b0
Aug 2 16:52:14 threadripper kernel: [ 600.168613] ? kthread_park+0x90/0x90
Aug 2 16:52:14 threadripper kernel: [ 600.168617] ret_from_fork+0x22/0x40
Aug 2 16:52:40 threadripper kernel: [ 606.280524] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
Aug 2 16:52:40 threadripper kernel: [ 606.280567] rcu: 2-...0: (1 GPs behind) idle=ae6/1/0x4000000000000000 softirq=26910/26911 fqs=7179
Aug 2 16:52:40 threadripper kernel: [ 606.280609] rcu: 18-...0: (1 GPs behind) idle=c8e/1/0x4000000000000000 softirq=28056/28057 fqs=7179
Aug 2 16:52:40 threadripper kernel: [ 606.280659] (detected by 24, t=15002 jiffies, g=39017, q=5149545)
Aug 2 16:52:40 threadripper kernel: [ 606.280661] Sending NMI from CPU 24 to CPUs 2:
Aug 2 16:52:40 threadripper kernel: [ 616.204803] Sending NMI from CPU 24 to CPUs 18:
Aug 2 16:52:40 threadripper kernel: [ 626.131497] rcu: rcu_sched kthread starved for 4960 jiffies! g39017 f0x2 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=7
Aug 2 16:52:40 threadripper kernel: [ 626.131554] rcu: RCU grace-period kthread stack dump:
Aug 2 16:52:40 threadripper kernel: [ 626.131577] rcu_sched R running task 0 11 2 0x80004000
Aug 2 16:52:40 threadripper kernel: [ 626.131580] Call Trace:
Aug 2 16:52:40 threadripper kernel: [ 626.131589] __schedule+0x2e3/0x740
Aug 2 16:52:40 threadripper kernel: [ 626.131592] preempt_schedule_common+0x18/0x30
Aug 2 16:52:40 threadripper kernel: [ 626.131594] _cond_resched+0x22/0x30
Aug 2 16:52:40 threadripper kernel: [ 626.131597] force_qs_rnp+0xa8/0x170
Aug 2 16:52:40 threadripper kernel: [ 626.131598] ? synchronize_sched_expedited_wait+0x180/0x180
Aug 2 16:52:40 threadripper kernel: [ 626.131600] rcu_gp_kthread+0x5e8/0x990
Aug 2 16:52:40 threadripper kernel: [ 626.131604] kthread+0x104/0x140
Aug 2 16:52:40 threadripper kernel: [ 626.131605] ? kfree_call_rcu+0x20/0x20
Aug 2 16:52:40 threadripper kernel: [ 626.131607] ? kthread_park+0x90/0x90
Aug 2 16:52:40 threadripper kernel: [ 626.131608] ret_from_fork+0x22/0x40

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: linux-image-5.4.0-80-generic 5.4.0-80.90
ProcVersionSignature: Ubuntu 5.4.0-80.90-generic 5.4.124
Uname: Linux 5.4.0-80-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version k5.4.0-80-generic.
ApportVersion: 2.20.11-0ubuntu27.18
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D2c', '/dev/snd/pcmC0D1p', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/controlC1', '/dev/snd/hwC1D0', '/dev/snd/pcmC1D7p', '/dev/snd/pcmC1D3p', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Card0.Amixer.info:
 Card hw:0 'Generic'/'HD-Audio Generic at 0xba600000 irq 96'
   Mixer name : 'Realtek ALC1220'
   Components : 'HDA:10ec1168,10438723,00100003'
   Controls : 46
   Simple ctrls : 20
Card1.Amixer.info:
 Card hw:1 'HDMI'/'HDA ATI HDMI at 0x9f860000 irq 98'
   Mixer name : 'ATI R6xx HDMI'
   Components : 'HDA:1002aa01,00aa0100,00100700'
   Controls : 14
   Simple ctrls : 2
CasperMD5CheckResult: skip
Date: Mon Aug 2 19:09:24 2021
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
MachineType: System manufacturer System Product Name
ProcEnviron:
 TERM=screen.xterm-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-80-generic root=UUID=04417339-7685-11e9-bdb0-049226da3a81 ro pci=nommconf consoleblank=60
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-80-generic N/A
 linux-backports-modules-5.4.0-80-generic N/A
 linux-firmware 1.187.15
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: Upgraded to focal on 2021-01-23 (191 days ago)
dmi.bios.date: 10/09/2019
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 1203
dmi.board.asset.tag: Default string
dmi.board.name: ROG STRIX X399-E GAMING
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1203:bd10/09/2019:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnROGSTRIXX399-EGAMING:rvrRev1.xx:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: amd64 apport-bug focal third-party-packages uec-images

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1938722

Title:
  watchdog: BUG: soft lockup on Threadripper 2950X

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1938722/+subscriptions