Re: [PATCH] workqueue: allow work_on_cpu() to be called recursively
On 07/24/2013 09:55 PM, Tejun Heo wrote:
> Applied to wq/for-3.11-fixes with comment and subject tweaks.
>
> Thanks!
>
> -- 8< --
>
> From c2fda509667b0fda4372a237f5a59ea4570b1627 Mon Sep 17 00:00:00 2001
> From: Lai Jiangshan <la...@cn.fujitsu.com>
> Date: Wed, 24 Jul 2013 18:31:42 +0800
> Subject: [PATCH] workqueue: allow work_on_cpu() to be called recursively
>
> If @fn calls work_on_cpu() again, lockdep will complain:
>
>> [ INFO: possible recursive locking detected ]
>> 3.11.0-rc1-lockdep-fix-a #6 Not tainted
>> ---------------------------------------------
>> kworker/0:1/142 is trying to acquire lock:
>>  ((&wfc.work)){+.+.+.}, at: [<81077100>] flush_work+0x0/0xb0
>>
>> but task is already holding lock:
>>  ((&wfc.work)){+.+.+.}, at: [<81075dd9>] process_one_work+0x169/0x610
>>
>> other info that might help us debug this:
>>  Possible unsafe locking scenario:
>>
>>        CPU0
>>        ----
>>   lock((&wfc.work));
>>   lock((&wfc.work));
>>
>>  *** DEADLOCK ***
>
> This is a false-positive lockdep report. In this situation the two
> "wfc"s of the two work_on_cpu() calls are different; both are
> on-stack, so flush_work() cannot deadlock.
>
> To fix this, avoid the lockdep check in this case by introducing an
> internal __flush_work() which skips the lockdep annotation.
>
> tj: Minor comment adjustment.
>
> Signed-off-by: Lai Jiangshan <la...@cn.fujitsu.com>
> Reported-by: "Srivatsa S. Bhat" <srivatsa.b...@linux.vnet.ibm.com>
> Reported-by: Alexander Duyck <alexander.h.du...@intel.com>
> Signed-off-by: Tejun Heo <t...@kernel.org>
> ---
>  kernel/workqueue.c | 32 ++++++++++++++++++++++----------
>  1 file changed, 22 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index f02c4a4..55f5f0a 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2817,6 +2817,19 @@ already_gone:
>  	return false;
>  }
>
> +static bool __flush_work(struct work_struct *work)
> +{
> +	struct wq_barrier barr;
> +
> +	if (start_flush_work(work, &barr)) {
> +		wait_for_completion(&barr.done);
> +		destroy_work_on_stack(&barr.work);
> +		return true;
> +	} else {
> +		return false;
> +	}
> +}
> +
>  /**
>   * flush_work - wait for a work to finish executing the last queueing instance
>   * @work: the work to flush
> @@ -2830,18 +2843,10 @@ already_gone:
>   */
>  bool flush_work(struct work_struct *work)
>  {
> -	struct wq_barrier barr;
> -
>  	lock_map_acquire(&work->lockdep_map);
>  	lock_map_release(&work->lockdep_map);
>
> -	if (start_flush_work(work, &barr)) {
> -		wait_for_completion(&barr.done);
> -		destroy_work_on_stack(&barr.work);
> -		return true;
> -	} else {
> -		return false;
> -	}
> +	return __flush_work(work);
>  }
>  EXPORT_SYMBOL_GPL(flush_work);
>
> @@ -4756,7 +4761,14 @@ long work_on_cpu(int cpu, long (*fn)(void *), void *arg)
>
>  	INIT_WORK_ONSTACK(&wfc.work, work_for_cpu_fn);
>  	schedule_work_on(cpu, &wfc.work);
> -	flush_work(&wfc.work);
> +
> +	/*
> +	 * The work item is on-stack and can't lead to deadlock through
> +	 * flushing.  Use __flush_work() to avoid spurious lockdep warnings
> +	 * when work_on_cpu()s are nested.
> +	 */
> +	__flush_work(&wfc.work);
> +
>  	return wfc.ret;
>  }
>  EXPORT_SYMBOL_GPL(work_on_cpu);

This version works as well; it fixes the issue I was facing. Thank you!

FWIW:
Tested-by: Srivatsa S. Bhat <srivatsa.b...@linux.vnet.ibm.com>

Regards,
Srivatsa S. Bhat

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/