Hi, I have a reproducible scenario wherein removing a USB device while reading /sys/kernel/debug/usb/devices causes a deadlock. This should not be specific to any USB device. Any USB device removal that causes a call to debugfs_remove() has inverted lock ordering with respect to the read() of debug/usb/devices.
e.g. read thread: srcu_read_lock(&debugfs_srcu); -- usb unplug -- remove thread: mutex_lock(&usb_bus_idr_lock); remove thread: synchronize_srcu(&debugfs_srcu); <- blocked read thread: mutex_lock(&usb_bus_idr_lock); <- blocked read thread: srcu_read_unlock(&debugfs_srcu, ...); This seems to be another flavor of what Johannes Berg reported: deadlock in synchronize_srcu() in debugfs? https://lkml.org/lkml/2017/3/23/415 I applied this patch set from Nicolai Stange and can no longer reproduce the hang. [RFC PATCH v2 0/9] debugfs: per-file removal protection https://lkml.org/lkml/2017/5/3/292 As patch 2/9 in the series indicates, commit 49d200deaa68 ("debugfs: prevent access to removed files' private data") is where this was first introduced, and it is reproducible on v4.14-rc4. How should we move forward with the resolution of this debugfs change? It seems to me that the USB locking is reasonable but the debugfs global srcu is overly restrictive. This could lead to unexpected lock inversion any time a driver shares a mutex between its debugfs read and removal paths. Backtrace below. Thanks! -Tyler Hall This is easier to reproduce by adding a sleep before the usb_bus_idr_lock, but I've seen it on an unmodified kernel. diff --git a/drivers/usb/core/devices.c b/drivers/usb/core/devices.c index 55dea2e7828f..534650cd0950 100644 --- a/drivers/usb/core/devices.c +++ b/drivers/usb/core/devices.c @@ -614,6 +614,7 @@ static ssize_t usb_device_read(struct file *file, char __user *buf, if (!access_ok(VERIFY_WRITE, buf, nbytes)) return -EFAULT; + msleep(1000); mutex_lock(&usb_bus_idr_lock); /* print devices for all busses */ idr_for_each_entry(&usb_bus_idr, bus, id) { [ 24.240542] sysrq: SysRq : Show Blocked State [ 24.240765] task PC stack pid father [ 24.240975] kworker/0:2 D13840 881 2 0x80000000 [ 24.241525] Workqueue: usb_hub_wq hub_event [ 24.241682] Call Trace: [ 24.242273] __schedule+0x317/0x6d0 [ 24.242442] schedule+0x31/0x80 [ 24.242514] schedule_timeout+0x1d0/0x320 [ 24.242603] ? __queue_work+0x135/0x400 [ 24.242689] wait_for_completion+0x92/0xf0 [ 24.242765] ? wait_for_completion+0x92/0xf0 [ 24.242841] ? wake_up_q+0x70/0x70 [ 24.242907] __synchronize_srcu.part.14+0x71/0x90 [ 24.242985] ? trace_event_raw_event_rcu_torture_read+0xe0/0xe0 [ 24.243169] synchronize_srcu_expedited+0x22/0x30 [ 24.243265] ? synchronize_srcu_expedited+0x22/0x30 [ 24.243347] synchronize_srcu+0x9a/0xc0 [ 24.243418] debugfs_remove+0x6d/0xa0 [ 24.243490] bdi_unregister+0x8b/0x170 [ 24.243558] del_gendisk+0x139/0x220 [ 24.243624] sd_remove+0x5c/0xc0 [ 24.243685] device_release_driver_internal+0x150/0x210 [ 24.243769] device_release_driver+0xd/0x10 [ 24.243841] bus_remove_device+0xdb/0x120 [ 24.243915] device_del+0x1c3/0x2e0 [ 24.243977] __scsi_remove_device+0xff/0x130 [ 24.244122] scsi_forget_host+0x5b/0x60 [ 24.244203] scsi_remove_host+0x74/0x140 [ 24.244281] usb_stor_disconnect+0x54/0xc0 [ 24.244357] usb_unbind_interface+0x6d/0x260 [ 24.244437] device_release_driver_internal+0x150/0x210 [ 24.244520] device_release_driver+0xd/0x10 [ 24.244591] bus_remove_device+0xdb/0x120 [ 24.244659] device_del+0x1c3/0x2e0 [ 24.244722] usb_disable_device+0x97/0x1f0 [ 24.244792] usb_disconnect+0x88/0x230 [ 24.244853] hub_event+0x5b9/0x11e0 [ 24.244915] ? add_timer+0x10e/0x230 [ 24.244984] process_one_work+0x146/0x3e0 [ 24.245124] worker_thread+0x43/0x3e0 [ 24.245204] kthread+0x104/0x140 [ 24.245266] ? create_worker+0x190/0x190 [ 24.245333] ? kthread_create_on_node+0x40/0x40 [ 24.245406] ret_from_fork+0x22/0x30 [ 24.245542] cat D13712 1029 1018 0x00000000 [ 24.245652] Call Trace: [ 24.245705] __schedule+0x317/0x6d0 [ 24.245770] schedule+0x31/0x80 [ 24.245830] schedule_preempt_disabled+0x9/0x10 [ 24.245903] __mutex_lock.isra.2+0x225/0x470 [ 24.245975] __mutex_lock_slowpath+0xe/0x10 [ 24.246110] ? __mutex_lock_slowpath+0xe/0x10 [ 24.246199] mutex_lock+0x2a/0x30 [ 24.246261] usb_device_read+0xb6/0x140 [ 24.246325] full_proxy_read+0x4f/0x90 [ 24.246394] __vfs_read+0x23/0x120 [ 24.246456] ? security_file_permission+0x96/0xb0 [ 24.246533] ? rw_verify_area+0x49/0xb0 [ 24.246593] vfs_read+0x8e/0x130 [ 24.246646] SyS_read+0x41/0xa0 [ 24.246698] entry_SYSCALL_64_fastpath+0x13/0x94