Re: [RFC PATCH v2] Use kernfs_break_active_protection() for device online store callbacks
Hello, On Fri, Apr 11, 2014 at 12:10:45PM +0800, Li Zhong wrote: > @@ -439,6 +439,7 @@ static ssize_t online_store(struct device *dev, struct > device_attribute *attr, > { > bool val; > int ret; > + struct kernfs_node *kn; > > ret = strtobool(buf, &val); > if (ret < 0) > @@ -448,7 +449,15 @@ static ssize_t online_store(struct device *dev, struct > device_attribute *attr, > if (ret) > return ret; > > + kn = kernfs_find_and_get(dev->kobj.sd, attr->attr.name); Wouldn't find_and_get need to be paired with put? > + if (WARN_ON_ONCE(!kn)) > + goto out; > + > + kernfs_break_active_protection(kn); > + > ret = val ? device_online(dev) : device_offline(dev); With active protection protection @dev may go away at any time. There should be some protection / synchronization to prevent that, no? > + kernfs_unbreak_active_protection(kn); > +out: > unlock_device_hotplug(); > return ret < 0 ? ret : count; > } Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC PATCH v2] Use kernfs_break_active_protection() for device online store callbacks
I noticed following lockdep warning when trying acpi hot-remove cpus: [84154.204080] == [84154.204080] [ INFO: possible circular locking dependency detected ] [84154.204080] 3.14.0-next-20140408+ #24 Tainted: GW [84154.204080] --- [84154.204080] bash/777 is trying to acquire lock: [84154.204080] (cpu_add_remove_lock){+.+.+.}, at: [] cpu_maps_update_begin+0x17/0x20 [84154.213203] [84154.213203] but task is already holding lock: [84154.213203] (s_active#79){.+}, at: [] kernfs_fop_write+0xe4/0x190 [84154.213203] [84154.213203] which lock already depends on the new lock. [84154.213203] [84154.213203] [84154.213203] the existing dependency chain (in reverse order) is: [84154.213203] -> #3 (s_active#79){.+}: [84154.213203][] lock_acquire+0x9b/0x1d0 [84154.213203][] __kernfs_remove+0x250/0x310 [84154.213203][] kernfs_remove_by_name_ns +0x50/0xc0 [84154.213203][] sysfs_remove_file_ns +0x15/0x20 [84154.213203][] device_remove_file+0x19/0x20 [84154.213203][] device_remove_attrs+0x33/0x80 [84154.213203][] device_del+0x127/0x1c0 [84154.213203][] device_unregister+0x22/0x60 [84154.213203][] unregister_cpu+0x1e/0x40 [84154.213203][] arch_unregister_cpu+0x23/0x30 [84154.213203][] acpi_processor_remove +0x8d/0xb2 [84154.213203][] acpi_bus_trim+0x5a/0x8d [84154.213203][] acpi_device_hotplug +0x1a8/0x3ec [84154.213203][] acpi_hotplug_work_fn +0x1f/0x2b [84154.213203][] process_one_work+0x1eb/0x6b0 [84154.213203][] worker_thread+0x11b/0x370 [84154.213203][] kthread+0xe4/0x100 [84154.213203][] ret_from_fork+0x7c/0xb0 [84154.213203] -> #2 (cpu_hotplug.lock#2){+.+.+.}: [84154.213203][] lock_acquire+0x9b/0x1d0 [84154.213203][] mutex_lock_nested+0x50/0x3c0 [84154.213203][] cpu_hotplug_begin+0x4f/0x80 [84154.213203][] _cpu_up+0x3f/0x160 [84154.213203][] cpu_up+0x69/0x80 [84154.213203][] smp_init+0x60/0x8c [84154.213203][] kernel_init_freeable +0x126/0x23b [84154.213203][] kernel_init+0xe/0xf0 [84154.213203][] ret_from_fork+0x7c/0xb0 [84154.213203] -> #1 (cpu_hotplug.lock){++}: [84154.213203][] lock_acquire+0x9b/0x1d0 [84154.213203][] cpu_hotplug_begin+0x41/0x80 [84154.213203][] _cpu_up+0x3f/0x160 [84154.213203][] cpu_up+0x69/0x80 [84154.213203][] smp_init+0x60/0x8c [84154.213203][] kernel_init_freeable +0x126/0x23b [84154.213203][] kernel_init+0xe/0xf0 [84154.213203][] ret_from_fork+0x7c/0xb0 [84154.213203] -> #0 (cpu_add_remove_lock){+.+.+.}: [84154.213203][] __lock_acquire+0x1f2a/0x1f60 [84154.213203][] lock_acquire+0x9b/0x1d0 [84154.213203][] mutex_lock_nested+0x50/0x3c0 [84154.213203][] cpu_maps_update_begin +0x17/0x20 [84154.213203][] cpu_down+0x1d/0x50 [84154.213203][] cpu_subsys_offline+0x14/0x20 [84154.213203][] device_offline+0xad/0xd0 [84154.213203][] online_store+0x42/0x80 [84154.213203][] dev_attr_store+0x18/0x30 [84154.213203][] sysfs_kf_write+0x49/0x60 [84154.213203][] kernfs_fop_write+0x109/0x190 [84154.213203][] vfs_write+0xbe/0x1c0 [84154.213203][] SyS_write+0x52/0xb0 [84154.213203][] tracesys+0xd0/0xd5 [84154.213203] [84154.213203] other info that might help us debug this: [84154.213203] [84154.213203] Chain exists of: cpu_add_remove_lock --> cpu_hotplug.lock#2 --> s_active#79 [84154.213203] Possible unsafe locking scenario: [84154.213203] [84154.213203]CPU0CPU1 [84154.213203] [84154.213203] lock(s_active#79); [84154.213203]lock(cpu_hotplug.lock#2); [84154.213203]lock(s_active#79); [84154.213203] lock(cpu_add_remove_lock); [84154.213203] [84154.213203] *** DEADLOCK *** . The deadlock itself seems already fixed in commit 5e33bc41. As Tejun suggested, to avoid this lockdep warning, kernfs_break_active_protection() is used before online/offline callbacks to take the s_active lock out of the dependency chain during online/offline operations. Signed-off-by: Li Zhong --- drivers/base/core.c | 9 + 1 file changed, 9 insertions(+) diff --git a/drivers/base/core.c b/drivers/base/core.c index 0dd6528..2b9f68e 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -439,6 +439,7 @@ static ssize_t online_store(struct device *dev, struct device_attribute *attr, { bool val; int ret; + struct kernfs_node *kn; ret = strtobool(buf, &val); if (ret < 0) @@ -448,7 +449,15 @@ static ssize_t online_store(struct device *dev, struct device_attribute *attr, if (ret) return ret;