Public bug reported:

=== SUMMARY ===

A userspace process reading /proc/net/tcp triggers a softlockup in the TCP
listening hash table traversal (tcp_seq_next -> listening_get_first ->
_raw_spin_lock), stalling CPU#4 for 210+ seconds and cascading into a full
networking stack deadlock via mutex chains through kworker -> wpa_supplicant
-> NetworkManager -> tailscaled and others. System requires hard reset.

=== ENVIRONMENT ===

Kernel:        6.17.0-20-generic #20-Ubuntu SMP PREEMPT(voluntary)
Distribution:  Ubuntu 25.10 (questing)
Compiler:      gcc 15.2.0 (Ubuntu 15.2.0-4ubuntu4)
Architecture:  x86_64
Hardware:      ASUS ROG STRIX Z890-E GAMING WIFI, BIOS 2201 09/12/2025
CPUs:          20 (Intel)
Taint:         P (PROPRIETARY_MODULE), O (OOT_MODULE), L (SOFTLOCKUP)

=== DESCRIPTION ===

On April 10 2026 at ~22:03 local time, the system became completely
unresponsive. Two CPU cores (4 and 1) entered softlockup state spinning on
the TCP listening hash table spinlock via the seq_file read path for
/proc/net/tcp.

The lockup started at 22:03:29 and persisted until manual reboot at ~22:46.
CPU#4 was stuck for 210+ seconds (confirmed by RCU stall at 22:07:07
reporting cputime of 210003ms). CPU#1 was also stuck for 134+ seconds (last
watchdog report at 22:06:05).

This triggered a cascading mutex dependency chain that blocked the entire
networking stack:

  node (PID 495127) reads /proc/net/tcp
    -> kworker/u80:5 (485336) blocked on mutex owned by node
      -> wpa_supplicant (2546) blocked on mutex owned by kworker
        -> NetworkManager (2540) blocked on mutex owned by wpa_supplicant
        -> tailscaled (2891) blocked on mutex owned by wpa_supplicant
        -> nxserver.bin (3077) blocked on mutex owned by wpa_supplicant
        -> connector-threa (5600) blocked on mutex owned by wpa_supplicant
        -> P2P_DISCOVER (7304) blocked on mutex owned by wpa_supplicant
        -> ThreadPoolForeg (5808) blocked on mutex owned by wpa_supplicant
    -> kworker/10:2 (43769) blocked on mutex owned by node

At least 10 tasks were reported blocked for 122+ seconds before further hung
task warnings were suppressed (kernel.hung_task_warnings=10).

=== CALL TRACE (softlockup on CPU#4) ===

watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [DefaultDispatch:21259]
Tainted: P O L 6.17.0-20-generic #20-Ubuntu PREEMPT(voluntary)

Call Trace:
 <TASK>
 _raw_spin_lock+0x3f/0x60
 listening_get_first+0x90/0x120
 listening_get_next+0xb0/0xd0
 tcp_seq_next+0x60/0x90
 seq_read_iter+0x2f9/0x490
 seq_read+0x11b/0x160
 proc_reg_read+0x6a/0xd0
 vfs_read+0xbc/0x3a0
 ksys_read+0x71/0xf0
 __x64_sys_read+0x19/0x30
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
 </TASK>

RIP: 0010:native_queued_spin_lock_slowpath+0x24e/0x330
RSP: 0018:ffffcd1602b978f0 EFLAGS: 00000206
RAX: 0000000000082056 RBX: ffff8cc5440be010 RCX: ffff8cc5440be010

=== RCU STALL (CPU#4) ===

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu:   4-....: (240003 ticks this GP) idle=c7b4/1/0x4000000000000000 softirq=7014058/7014058 fqs=58549
rcu:            hardirqs   softirqs   csw/system
rcu:   number:   243545        331            0
rcu:   cputime:        0          0       210003   ==> 210003(ms)
rcu:   (t=240005 jiffies g=10296721 q=126115 ncpus=20)

=== HUNG TASKS (all blocked 122+ seconds) ===

INFO: task NetworkManager:2540 blocked for more than 122 seconds.
INFO: task NetworkManager:2540 is blocked on a mutex likely owned by task wpa_supplicant:2546.

INFO: task wpa_supplicant:2546 blocked for more than 122 seconds.
INFO: task wpa_supplicant:2546 is blocked on a mutex likely owned by task kworker/u80:5:485336.

INFO: task tailscaled:2891 blocked for more than 122 seconds.
INFO: task tailscaled:2891 is blocked on a mutex likely owned by task wpa_supplicant:2546.

INFO: task nxserver.bin:3077 blocked for more than 122 seconds.

INFO: task connector-threa:5600 blocked for more than 122 seconds.
INFO: task connector-threa:5600 is blocked on a mutex likely owned by task wpa_supplicant:2546.

INFO: task P2P_DISCOVER:7304 blocked for more than 122 seconds.
INFO: task P2P_DISCOVER:7304 is blocked on a mutex likely owned by task wpa_supplicant:2546.

INFO: task ThreadPoolForeg:5808 blocked for more than 122 seconds.
INFO: task ThreadPoolForeg:5808 is blocked on a mutex likely owned by task wpa_supplicant:2546.

INFO: task kworker/10:2:43769 blocked for more than 122 seconds.
INFO: task kworker/10:2:43769 is blocked on a mutex likely owned by task (node):495127.

INFO: task kworker/u80:5:485336 blocked for more than 122 seconds.

INFO: task (node):495127 blocked for more than 122 seconds.

Future hung task reports are suppressed (kernel.hung_task_warnings=10).
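
The wait-for relationships in the hung task messages above can be extracted
mechanically. The following is a minimal Python sketch, not part of the
original analysis: the function names build_wait_graph and format_tree are
hypothetical, and it assumes the log text uses exactly the "is blocked on a
mutex likely owned by task" wording shown in this report. Tasks whose report
names no owner (for example nxserver.bin) do not appear in the result.

```python
import re
from collections import defaultdict

# Matches: "INFO: task NAME:PID is blocked on a mutex likely owned by task OWNER:PID."
OWNER_RE = re.compile(
    r"INFO: task (?P<waiter>.+):(?P<wpid>\d+) is blocked on a mutex "
    r"likely owned by task (?P<owner>.+):(?P<opid>\d+)\."
)


def build_wait_graph(log_text):
    """Return {owner: [waiters]} from hung-task 'likely owned by' lines."""
    # Mail archives may wrap long log lines after "task "; undo that first.
    text = re.sub(r"owned by task\s*\n\s*", "owned by task ", log_text)
    graph = defaultdict(list)
    for m in OWNER_RE.finditer(text):
        waiter = f"{m['waiter']}:{m['wpid']}"
        owner = f"{m['owner']}:{m['opid']}"
        graph[owner].append(waiter)
    return dict(graph)


def format_tree(graph, root, depth=0):
    """Render the waiters below `root` as indented lines (assumes no cycles)."""
    lines = [("  " * depth) + ("-> " if depth else "") + root]
    for waiter in graph.get(root, []):
        lines.extend(format_tree(graph, waiter, depth + 1))
    return lines
```

Feeding the saved kernel log to build_wait_graph and printing
format_tree(graph, "(node):495127") gives only the part of the chain that the
kernel reported explicitly.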

=== SOFTLOCKUP TIMELINE ===

22:03:29  CPU#4 stuck for 23s  [DefaultDispatch:21259]
22:03:57  CPU#4 stuck for 49s
22:04:07  RCU stall detected on CPU#4 (cputime 29999ms)
22:04:13  CPU#1 stuck for 23s  [DefaultDispatch:7886]
22:04:33  CPU#4 stuck for 82s
22:04:41  CPU#1 stuck for 49s
22:05:01  CPU#4 stuck for 108s
22:05:09  CPU#1 stuck for 75s
22:05:29  CPU#4 stuck for 134s
22:05:37  CPU#1 stuck for 108s
22:05:57  CPU#4 stuck for 160s
22:06:05  CPU#1 stuck for 134s
22:06:25  CPU#4 stuck for 186s
22:06:29  Hung task reports (10+ tasks blocked 122+ seconds)
22:07:07  RCU stall on CPU#4 (cputime 210003ms, 240s in grace period)
...system unresponsive until manual reboot at ~22:46
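
A timeline like the one above can be summarised from the raw watchdog
messages. This is a minimal Python sketch under the assumption that the
messages use the "soft lockup - CPU#N stuck for Ns!" wording shown in the call
trace section; the function name max_stuck_per_cpu is hypothetical.

```python
import re

# Matches the watchdog message body, e.g.
# "watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [DefaultDispatch:21259]"
LOCKUP_RE = re.compile(r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s!")


def max_stuck_per_cpu(lines):
    """Map each CPU number to the longest 'stuck for Ns' value reported."""
    worst = {}
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m:
            cpu, secs = int(m["cpu"]), int(m["secs"])
            worst[cpu] = max(worst.get(cpu, 0), secs)
    return worst


if __name__ == "__main__":
    import sys

    # Example: journalctl -k -b -1 | python3 this_script.py
    for cpu, secs in sorted(max_stuck_per_cpu(sys.stdin).items()):
        print(f"CPU#{cpu}: stuck for at least {secs}s")
```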

=== ROOT CAUSE ANALYSIS ===

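The seq_file implementation for /proc/net/tcp walks the TCP listening hash
table (lhash2 in current kernels; the older listening_hash has been removed)
and takes a per-bucket spinlock in listening_get_first(). Under heavy network
connection load with concurrent modifications, this lock can become
contended, and the reader then spins in native_queued_spin_lock_slowpath.
Spinning on a spinlock runs with preemption disabled, and PREEMPT(voluntary)
adds no involuntary preemption points, so the CPU never reschedules and the
watchdog reports a soft lockup.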

The stalled reader then blocks other tasks through mutex dependencies in the
networking subsystem (kworker -> wpa_supplicant -> NetworkManager -> etc.),
effectively deadlocking all networking-related processes system-wide.

This appears to be a known class of bug in the TCP hash table seq_file
implementation that has been partially addressed in mainline via RCU-based
read-side access. The listening-table walk may not be fully converted.

=== WORKAROUND ===

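Setting softlockup_panic=1 makes the kernel panic as soon as the watchdog
detects a soft lockup (about 20s with the default watchdog_thresh=10). With a
nonzero kernel.panic timeout as well, the machine reboots automatically
instead of hanging indefinitely.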
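
Whether a machine is currently configured this way can be checked by reading
the relevant sysctls. The following is a minimal Python sketch, not a
definitive tool: the function names are hypothetical, and it assumes the
standard /proc/sys/kernel/softlockup_panic, /proc/sys/kernel/panic and
/proc/sys/kernel/watchdog_thresh files are present.

```python
from pathlib import Path


def read_watchdog_settings(proc_sys="/proc/sys"):
    """Read the sysctls relevant to the workaround; None if unreadable."""
    settings = {}
    for key in ("softlockup_panic", "panic", "watchdog_thresh"):
        path = Path(proc_sys) / "kernel" / key
        try:
            settings[key] = int(path.read_text().strip())
        except (OSError, ValueError):
            settings[key] = None
    return settings


def will_auto_reboot(settings):
    """True if a soft lockup panics AND a panic leads to a reboot.

    kernel.panic == 0 means the kernel halts forever after a panic;
    any nonzero value (delay in seconds, or negative for immediate) reboots.
    """
    panics = bool(settings.get("softlockup_panic"))
    reboots = (settings.get("panic") or 0) != 0
    return panics and reboots


if __name__ == "__main__":
    current = read_watchdog_settings()
    print(current, "auto-reboot on soft lockup:", will_auto_reboot(current))
```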

=== REPRODUCIBILITY ===

Happened once. The trigger was a node process (PID 495127) reading
/proc/net/tcp while the system was under moderate network load. Not
deterministically reproducible, but the code path is clear from traces.
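
One way to attempt a reproduction is to read /proc/net/tcp from several
threads while other threads create and destroy listening sockets. The
following is a minimal Python sketch under that assumption. It has not been
confirmed to trigger the lockup, the thread counts and duration are arbitrary,
and all function names are hypothetical. It should only be run on a test
machine.

```python
import socket
import threading
import time

PROC_TCP = "/proc/net/tcp"


def count_listeners(text):
    """Count rows of /proc/net/tcp-format text whose state column is 0A (LISTEN)."""
    rows = text.splitlines()[1:]  # skip the header row
    return sum(1 for r in rows if len(r.split()) > 3 and r.split()[3] == "0A")


def reader(stop, path=PROC_TCP):
    # Repeatedly walk the table through the seq_file read path.
    while not stop.is_set():
        with open(path) as f:
            count_listeners(f.read())


def churn(stop):
    # Repeatedly create and destroy a listening socket on an ephemeral port.
    while not stop.is_set():
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind(("127.0.0.1", 0))
            s.listen(1)


def run(seconds=5.0, readers=4, churners=4):
    """Run reader and churn threads concurrently for `seconds`, then stop them."""
    stop = threading.Event()
    threads = [threading.Thread(target=reader, args=(stop,)) for _ in range(readers)]
    threads += [threading.Thread(target=churn, args=(stop,)) for _ in range(churners)]
    for t in threads:
        t.start()
    time.sleep(seconds)
    stop.set()
    for t in threads:
        t.join()
```

Calling run(seconds=60) while watching the kernel log for soft lockup messages
is one way to use it. The churn here only touches loopback listeners, so it
may not match the load pattern present when the original lockup occurred.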

=== KERNEL CONFIG ===

CONFIG_PREEMPT_VOLUNTARY=y (PREEMPT(voluntary))
Kernel command line: root=zfs:rpool/root loglevel=7 spl.spl_hostid=0x00bab10c

ProblemType: Bug
DistroRelease: Ubuntu 25.10
Package: linux-image-6.17.0-20-generic 6.17.0-20.20
ProcVersionSignature: Ubuntu 6.17.0-20.20-generic 6.17.13
Uname: Linux 6.17.0-20-generic x86_64
NonfreeKernelModules: zfs
ApportVersion: 2.33.1-0ubuntu3
Architecture: amd64
AudioDevicesInUse:
 USER        PID ACCESS COMMAND
 /dev/snd/controlC1:  rperez     3496 F.... wireplumber
 /dev/snd/controlC2:  rperez     3496 F.... wireplumber
 /dev/snd/controlC0:  rperez     3496 F.... wireplumber
 /dev/snd/seq:        rperez     3492 F.... pipewire
CasperMD5CheckResult: unknown
CurrentDesktop: KDE
Date: Fri Apr 10 23:20:18 2026
MachineType: ASUS System Product Name
ProcFB: 0 nvidia-drmdrmfb
ProcKernelCmdLine: root=zfs:rpool/root loglevel=7 spl.spl_hostid=0x00bab10c
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 firmware-sof   N/A
 linux-firmware 20250901.git993ff19b-0ubuntu1.10
SourcePackage: linux
UpgradeStatus: Upgraded to questing on 2025-12-13 (118 days ago)
dmi.bios.date: 09/12/2025
dmi.bios.release: 22.1
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 2201
dmi.board.asset.tag: Default string
dmi.board.name: ROG STRIX Z890-E GAMING WIFI
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr2201:bd09/12/2025:br22.1:svnASUS:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnROGSTRIXZ890-EGAMINGWIFI:rvrRev1.xx:cvnDefaultstring:ct3:cvrDefaultstring:skuSKU:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: ASUS

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug questing

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2148046

Title:
  softlockup in tcp_seq_next / listening_get_first causes full
  networking deadlock
