There's a potential deadlock with the following cycle: wfs_lock --> device_links_lock --> kn->count
Fix this by simply dropping the lock around a list_empty() check that's just exported to a sysfs file. The sysfs file output is an instantaneous check anyway and the lock doesn't really add any protection. Lockdep log: [ 48.808132] [ 48.808132] the existing dependency chain (in reverse order) is: [ 48.809069] [ 48.809069] -> #2 (kn->count){++++}: [ 48.809707] __kernfs_remove.llvm.7860393000964815146+0x2d4/0x460 [ 48.810537] kernfs_remove_by_name_ns+0x54/0x9c [ 48.811171] sysfs_remove_file_ns+0x18/0x24 [ 48.811762] device_del+0x2b8/0x5a8 [ 48.812269] __device_link_del+0x98/0xb8 [ 48.812829] device_links_driver_bound+0x210/0x2d8 [ 48.813496] driver_bound+0x44/0xf8 [ 48.814000] really_probe+0x340/0x6e0 [ 48.814526] driver_probe_device+0xb8/0x100 [ 48.815117] device_driver_attach+0x78/0xb8 [ 48.815708] __driver_attach+0xe0/0x194 [ 48.816255] bus_for_each_dev+0xa8/0x11c [ 48.816816] driver_attach+0x24/0x30 [ 48.817331] bus_add_driver+0x100/0x1e0 [ 48.817880] driver_register+0x78/0x114 [ 48.818427] __platform_driver_register+0x44/0x50 [ 48.819089] 0xffffffdbb3227038 [ 48.819551] do_one_initcall+0xd8/0x1e0 [ 48.820099] do_init_module+0xd8/0x298 [ 48.820636] load_module+0x3afc/0x44c8 [ 48.821173] __arm64_sys_finit_module+0xbc/0xf0 [ 48.821807] el0_svc_common+0xbc/0x1d0 [ 48.822344] el0_svc_handler+0x74/0x98 [ 48.822882] el0_svc+0x8/0xc [ 48.823310] [ 48.823310] -> #1 (device_links_lock){+.+.}: [ 48.824036] __mutex_lock_common+0xe0/0xe44 [ 48.824626] mutex_lock_nested+0x28/0x34 [ 48.825185] device_link_add+0xd4/0x4ec [ 48.825734] of_link_to_suppliers+0x158/0x204 [ 48.826347] of_fwnode_add_links+0x50/0x64 [ 48.826928] device_link_add_missing_supplier_links+0x90/0x11c [ 48.827725] fw_devlink_resume+0x58/0x130 [ 48.828296] of_platform_default_populate_init+0xb4/0xd0 [ 48.829030] do_one_initcall+0xd8/0x1e0 [ 48.829578] do_initcall_level+0xb8/0xcc [ 48.830137] do_basic_setup+0x60/0x7c [ 48.830662] kernel_init_freeable+0x128/0x1ac [ 48.831275] kernel_init+0x18/0x29c [ 48.831781] ret_from_fork+0x10/0x18 [ 48.832297] [ 48.832297] -> #0 (wfs_lock){+.+.}: [ 48.832922] __lock_acquire+0xe04/0x2e20 [ 48.833480] lock_acquire+0xbc/0xec [ 48.833984] __mutex_lock_common+0xe0/0xe44 [ 48.834577] mutex_lock_nested+0x28/0x34 [ 48.835136] waiting_for_supplier_show+0x3c/0x98 [ 48.835781] dev_attr_show+0x48/0xb4 [ 48.836295] sysfs_kf_seq_show+0xe8/0x184 [ 48.836864] kernfs_seq_show+0x48/0x8c [ 48.837401] seq_read+0x1c8/0x600 [ 48.837884] kernfs_fop_read+0x68/0x204 [ 48.838431] __vfs_read+0x60/0x214 [ 48.838925] vfs_read+0xbc/0x15c [ 48.839397] ksys_read+0x78/0xe4 [ 48.839869] __arm64_sys_read+0x1c/0x28 [ 48.840416] el0_svc_common+0xbc/0x1d0 [ 48.840953] el0_svc_handler+0x74/0x98 [ 48.841490] el0_svc+0x8/0xc [ 48.841917] [ 48.841917] other info that might help us debug this: [ 48.841917] [ 48.842920] Chain exists of: [ 48.842920] wfs_lock --> device_links_lock --> kn->count [ 48.842920] [ 48.844152] Possible unsafe locking scenario: [ 48.844152] [ 48.844895] CPU0 CPU1 [ 48.845463] ---- ---- [ 48.846032] lock(kn->count); [ 48.846417] lock(device_links_lock); [ 48.847203] lock(kn->count); [ 48.847902] lock(wfs_lock); [ 48.848276] [ 48.848276] *** DEADLOCK *** Reported-by: cheng-jui.w...@mediatek.com Signed-off-by: Saravana Kannan <sarava...@google.com> --- drivers/base/core.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/base/core.c b/drivers/base/core.c index 78114ddac755..df35b3206286 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -1064,10 +1064,8 @@ static ssize_t waiting_for_supplier_show(struct device *dev, bool val; device_lock(dev); - mutex_lock(&wfs_lock); val = !list_empty(&dev->links.needs_suppliers) && dev->links.need_for_probe; - mutex_unlock(&wfs_lock); device_unlock(dev); return sysfs_emit(buf, "%u\n", val); } -- 2.29.1.341.ge80a0c044ae-goog