Re: [PATCH] workqueue: allow work_on_cpu() to be called recursively

2013-07-27 Thread Srivatsa S. Bhat
On 07/24/2013 09:55 PM, Tejun Heo wrote:
> Applied to wq/for-3.11-fixes with comment and subject tweaks.
> 
> Thanks!
> 
> -- 8< 
> 
> From c2fda509667b0fda4372a237f5a59ea4570b1627 Mon Sep 17 00:00:00 2001
> From: Lai Jiangshan 
> Date: Wed, 24 Jul 2013 18:31:42 +0800
> 
> If the @fn call work_on_cpu() again, the lockdep will complain:
> 
>> [ INFO: possible recursive locking detected ]
>> 3.11.0-rc1-lockdep-fix-a #6 Not tainted
>> -
>> kworker/0:1/142 is trying to acquire lock:
>>  (()){+.+.+.}, at: [] flush_work+0x0/0xb0
>>
>> but task is already holding lock:
>>  (()){+.+.+.}, at: [] process_one_work+0x169/0x610
>>
>> other info that might help us debug this:
>>  Possible unsafe locking scenario:
>>
>>CPU0
>>
>>   lock(());
>>   lock(());
>>
>>  *** DEADLOCK ***
> 
> It is false-positive lockdep report. In this sutiation,
> the two "wfc"s of the two work_on_cpu() are different,
> they are both on stack. flush_work() can't be deadlock.
> 
> To fix this, we need to avoid the lockdep checking in this case,
> thus we instroduce a internal __flush_work() which skip the lockdep.
> 
> tj: Minor comment adjustment.
> 
> Signed-off-by: Lai Jiangshan 
> Reported-by: "Srivatsa S. Bhat" 
> Reported-by: Alexander Duyck 
> Signed-off-by: Tejun Heo 
> ---

This version works as well, it fixes the issue I was facing.
Thank you!

FWIW:
Tested-by: Srivatsa S. Bhat 

Regards,
Srivatsa S. Bhat

>  kernel/workqueue.c | 32 ++--
>  1 file changed, 22 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index f02c4a4..55f5f0a 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2817,6 +2817,19 @@ already_gone:
>   return false;
>  }
> 
> +static bool __flush_work(struct work_struct *work)
> +{
> + struct wq_barrier barr;
> +
> + if (start_flush_work(work, )) {
> + wait_for_completion();
> + destroy_work_on_stack();
> + return true;
> + } else {
> + return false;
> + }
> +}
> +
>  /**
>   * flush_work - wait for a work to finish executing the last queueing 
> instance
>   * @work: the work to flush
> @@ -2830,18 +2843,10 @@ already_gone:
>   */
>  bool flush_work(struct work_struct *work)
>  {
> - struct wq_barrier barr;
> -
>   lock_map_acquire(>lockdep_map);
>   lock_map_release(>lockdep_map);
> 
> - if (start_flush_work(work, )) {
> - wait_for_completion();
> - destroy_work_on_stack();
> - return true;
> - } else {
> - return false;
> - }
> + return __flush_work(work);
>  }
>  EXPORT_SYMBOL_GPL(flush_work);
> 
> @@ -4756,7 +4761,14 @@ long work_on_cpu(int cpu, long (*fn)(void *), void 
> *arg)
> 
>   INIT_WORK_ONSTACK(, work_for_cpu_fn);
>   schedule_work_on(cpu, );
> - flush_work();
> +
> + /*
> +  * The work item is on-stack and can't lead to deadlock through
> +  * flushing.  Use __flush_work() to avoid spurious lockdep warnings
> +  * when work_on_cpu()s are nested.
> +  */
> + __flush_work();
> +
>   return wfc.ret;
>  }
>  EXPORT_SYMBOL_GPL(work_on_cpu);
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] workqueue: allow work_on_cpu() to be called recursively

2013-07-27 Thread Srivatsa S. Bhat
On 07/24/2013 09:55 PM, Tejun Heo wrote:
 Applied to wq/for-3.11-fixes with comment and subject tweaks.
 
 Thanks!
 
 -- 8 
 
 From c2fda509667b0fda4372a237f5a59ea4570b1627 Mon Sep 17 00:00:00 2001
 From: Lai Jiangshan la...@cn.fujitsu.com
 Date: Wed, 24 Jul 2013 18:31:42 +0800
 
 If the @fn call work_on_cpu() again, the lockdep will complain:
 
 [ INFO: possible recursive locking detected ]
 3.11.0-rc1-lockdep-fix-a #6 Not tainted
 -
 kworker/0:1/142 is trying to acquire lock:
  ((wfc.work)){+.+.+.}, at: [81077100] flush_work+0x0/0xb0

 but task is already holding lock:
  ((wfc.work)){+.+.+.}, at: [81075dd9] process_one_work+0x169/0x610

 other info that might help us debug this:
  Possible unsafe locking scenario:

CPU0

   lock((wfc.work));
   lock((wfc.work));

  *** DEADLOCK ***
 
 It is false-positive lockdep report. In this sutiation,
 the two wfcs of the two work_on_cpu() are different,
 they are both on stack. flush_work() can't be deadlock.
 
 To fix this, we need to avoid the lockdep checking in this case,
 thus we instroduce a internal __flush_work() which skip the lockdep.
 
 tj: Minor comment adjustment.
 
 Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com
 Reported-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
 Reported-by: Alexander Duyck alexander.h.du...@intel.com
 Signed-off-by: Tejun Heo t...@kernel.org
 ---

This version works as well, it fixes the issue I was facing.
Thank you!

FWIW:
Tested-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com

Regards,
Srivatsa S. Bhat

  kernel/workqueue.c | 32 ++--
  1 file changed, 22 insertions(+), 10 deletions(-)
 
 diff --git a/kernel/workqueue.c b/kernel/workqueue.c
 index f02c4a4..55f5f0a 100644
 --- a/kernel/workqueue.c
 +++ b/kernel/workqueue.c
 @@ -2817,6 +2817,19 @@ already_gone:
   return false;
  }
 
 +static bool __flush_work(struct work_struct *work)
 +{
 + struct wq_barrier barr;
 +
 + if (start_flush_work(work, barr)) {
 + wait_for_completion(barr.done);
 + destroy_work_on_stack(barr.work);
 + return true;
 + } else {
 + return false;
 + }
 +}
 +
  /**
   * flush_work - wait for a work to finish executing the last queueing 
 instance
   * @work: the work to flush
 @@ -2830,18 +2843,10 @@ already_gone:
   */
  bool flush_work(struct work_struct *work)
  {
 - struct wq_barrier barr;
 -
   lock_map_acquire(work-lockdep_map);
   lock_map_release(work-lockdep_map);
 
 - if (start_flush_work(work, barr)) {
 - wait_for_completion(barr.done);
 - destroy_work_on_stack(barr.work);
 - return true;
 - } else {
 - return false;
 - }
 + return __flush_work(work);
  }
  EXPORT_SYMBOL_GPL(flush_work);
 
 @@ -4756,7 +4761,14 @@ long work_on_cpu(int cpu, long (*fn)(void *), void 
 *arg)
 
   INIT_WORK_ONSTACK(wfc.work, work_for_cpu_fn);
   schedule_work_on(cpu, wfc.work);
 - flush_work(wfc.work);
 +
 + /*
 +  * The work item is on-stack and can't lead to deadlock through
 +  * flushing.  Use __flush_work() to avoid spurious lockdep warnings
 +  * when work_on_cpu()s are nested.
 +  */
 + __flush_work(wfc.work);
 +
   return wfc.ret;
  }
  EXPORT_SYMBOL_GPL(work_on_cpu);
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] workqueue: allow work_on_cpu() to be called recursively

2013-07-24 Thread Tejun Heo
Applied to wq/for-3.11-fixes with comment and subject tweaks.

Thanks!

-- 8< 

>From c2fda509667b0fda4372a237f5a59ea4570b1627 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan 
Date: Wed, 24 Jul 2013 18:31:42 +0800

If the @fn call work_on_cpu() again, the lockdep will complain:

> [ INFO: possible recursive locking detected ]
> 3.11.0-rc1-lockdep-fix-a #6 Not tainted
> -
> kworker/0:1/142 is trying to acquire lock:
>  (()){+.+.+.}, at: [] flush_work+0x0/0xb0
>
> but task is already holding lock:
>  (()){+.+.+.}, at: [] process_one_work+0x169/0x610
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>CPU0
>
>   lock(());
>   lock(());
>
>  *** DEADLOCK ***

It is false-positive lockdep report. In this sutiation,
the two "wfc"s of the two work_on_cpu() are different,
they are both on stack. flush_work() can't be deadlock.

To fix this, we need to avoid the lockdep checking in this case,
thus we instroduce a internal __flush_work() which skip the lockdep.

tj: Minor comment adjustment.

Signed-off-by: Lai Jiangshan 
Reported-by: "Srivatsa S. Bhat" 
Reported-by: Alexander Duyck 
Signed-off-by: Tejun Heo 
---
 kernel/workqueue.c | 32 ++--
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index f02c4a4..55f5f0a 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2817,6 +2817,19 @@ already_gone:
return false;
 }
 
+static bool __flush_work(struct work_struct *work)
+{
+   struct wq_barrier barr;
+
+   if (start_flush_work(work, )) {
+   wait_for_completion();
+   destroy_work_on_stack();
+   return true;
+   } else {
+   return false;
+   }
+}
+
 /**
  * flush_work - wait for a work to finish executing the last queueing instance
  * @work: the work to flush
@@ -2830,18 +2843,10 @@ already_gone:
  */
 bool flush_work(struct work_struct *work)
 {
-   struct wq_barrier barr;
-
lock_map_acquire(>lockdep_map);
lock_map_release(>lockdep_map);
 
-   if (start_flush_work(work, )) {
-   wait_for_completion();
-   destroy_work_on_stack();
-   return true;
-   } else {
-   return false;
-   }
+   return __flush_work(work);
 }
 EXPORT_SYMBOL_GPL(flush_work);
 
@@ -4756,7 +4761,14 @@ long work_on_cpu(int cpu, long (*fn)(void *), void *arg)
 
INIT_WORK_ONSTACK(, work_for_cpu_fn);
schedule_work_on(cpu, );
-   flush_work();
+
+   /*
+* The work item is on-stack and can't lead to deadlock through
+* flushing.  Use __flush_work() to avoid spurious lockdep warnings
+* when work_on_cpu()s are nested.
+*/
+   __flush_work();
+
return wfc.ret;
 }
 EXPORT_SYMBOL_GPL(work_on_cpu);
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] workqueue: allow work_on_cpu() to be called recursively

2013-07-24 Thread Tejun Heo
Applied to wq/for-3.11-fixes with comment and subject tweaks.

Thanks!

-- 8 

From c2fda509667b0fda4372a237f5a59ea4570b1627 Mon Sep 17 00:00:00 2001
From: Lai Jiangshan la...@cn.fujitsu.com
Date: Wed, 24 Jul 2013 18:31:42 +0800

If the @fn call work_on_cpu() again, the lockdep will complain:

 [ INFO: possible recursive locking detected ]
 3.11.0-rc1-lockdep-fix-a #6 Not tainted
 -
 kworker/0:1/142 is trying to acquire lock:
  ((wfc.work)){+.+.+.}, at: [81077100] flush_work+0x0/0xb0

 but task is already holding lock:
  ((wfc.work)){+.+.+.}, at: [81075dd9] process_one_work+0x169/0x610

 other info that might help us debug this:
  Possible unsafe locking scenario:

CPU0

   lock((wfc.work));
   lock((wfc.work));

  *** DEADLOCK ***

It is false-positive lockdep report. In this sutiation,
the two wfcs of the two work_on_cpu() are different,
they are both on stack. flush_work() can't be deadlock.

To fix this, we need to avoid the lockdep checking in this case,
thus we instroduce a internal __flush_work() which skip the lockdep.

tj: Minor comment adjustment.

Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com
Reported-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
Reported-by: Alexander Duyck alexander.h.du...@intel.com
Signed-off-by: Tejun Heo t...@kernel.org
---
 kernel/workqueue.c | 32 ++--
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index f02c4a4..55f5f0a 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2817,6 +2817,19 @@ already_gone:
return false;
 }
 
+static bool __flush_work(struct work_struct *work)
+{
+   struct wq_barrier barr;
+
+   if (start_flush_work(work, barr)) {
+   wait_for_completion(barr.done);
+   destroy_work_on_stack(barr.work);
+   return true;
+   } else {
+   return false;
+   }
+}
+
 /**
  * flush_work - wait for a work to finish executing the last queueing instance
  * @work: the work to flush
@@ -2830,18 +2843,10 @@ already_gone:
  */
 bool flush_work(struct work_struct *work)
 {
-   struct wq_barrier barr;
-
lock_map_acquire(work-lockdep_map);
lock_map_release(work-lockdep_map);
 
-   if (start_flush_work(work, barr)) {
-   wait_for_completion(barr.done);
-   destroy_work_on_stack(barr.work);
-   return true;
-   } else {
-   return false;
-   }
+   return __flush_work(work);
 }
 EXPORT_SYMBOL_GPL(flush_work);
 
@@ -4756,7 +4761,14 @@ long work_on_cpu(int cpu, long (*fn)(void *), void *arg)
 
INIT_WORK_ONSTACK(wfc.work, work_for_cpu_fn);
schedule_work_on(cpu, wfc.work);
-   flush_work(wfc.work);
+
+   /*
+* The work item is on-stack and can't lead to deadlock through
+* flushing.  Use __flush_work() to avoid spurious lockdep warnings
+* when work_on_cpu()s are nested.
+*/
+   __flush_work(wfc.work);
+
return wfc.ret;
 }
 EXPORT_SYMBOL_GPL(work_on_cpu);
-- 
1.8.3.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/