Re: smp_call_function_single with wait=0 considered harmful

2014-02-28 Thread Prarit Bhargava


On 02/28/2014 07:39 AM, Peter Zijlstra wrote:
> On Fri, Feb 28, 2014 at 01:26:24PM +0100, Peter Zijlstra wrote:
>> On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote:
>>>   kernel/stop_machine.c:stop_two_cpus()
>>
>> That site should work with .wait=1 just fine, but given the above, the
>> .wait=0 doesn't appear problematic at all.
> 
> Scratch that; its broken, but not because of smp_call_function_single().
> 
> ---
> Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()
> 
> We must use smp_call_function_single(.wait=1) for the
> irq_cpu_stop_queue_work() to ensure the queueing is actually done under
> stop_cpus_lock. Without this we could have dropped the lock by the time
> we do the queueing and get the race we tried to fix.
> 
> Fixes: 7053ea1a34fa ("stop_machine: Fix race between stop_two_cpus() and 
> stop_cpus()")
> Cc: Prarit Bhargava 
> Cc: Rik van Riel 
> Cc: Mel Gorman 
> Signed-off-by: Peter Zijlstra 

Reviewed-by: Prarit Bhargava 

P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: smp_call_function_single with wait=0 considered harmful

2014-02-28 Thread Rik van Riel
On 02/28/2014 07:39 AM, Peter Zijlstra wrote:

> Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()
> 
> We must use smp_call_function_single(.wait=1) for the
> irq_cpu_stop_queue_work() to ensure the queueing is actually done under
> stop_cpus_lock. Without this we could have dropped the lock by the time
> we do the queueing and get the race we tried to fix.
> 
> Fixes: 7053ea1a34fa ("stop_machine: Fix race between stop_two_cpus() and 
> stop_cpus()")
> Cc: Prarit Bhargava 
> Cc: Rik van Riel 
> Cc: Mel Gorman 
> Signed-off-by: Peter Zijlstra 

Reviewed-by: Rik van Riel 

-- 
All rights reversed
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: smp_call_function_single with wait=0 considered harmful

2014-02-28 Thread Peter Zijlstra
On Fri, Feb 28, 2014 at 01:26:24PM +0100, Peter Zijlstra wrote:
> On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote:
> >   kernel/stop_machine.c:stop_two_cpus()
> 
> That site should work with .wait=1 just fine, but given the above, the
> .wait=0 doesn't appear problematic at all.

Scratch that; its broken, but not because of smp_call_function_single().

---
Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()

We must use smp_call_function_single(.wait=1) for the
irq_cpu_stop_queue_work() to ensure the queueing is actually done under
stop_cpus_lock. Without this we could have dropped the lock by the time
we do the queueing and get the race we tried to fix.

Fixes: 7053ea1a34fa ("stop_machine: Fix race between stop_two_cpus() and 
stop_cpus()")
Cc: Prarit Bhargava 
Cc: Rik van Riel 
Cc: Mel Gorman 
Signed-off-by: Peter Zijlstra 
---
 kernel/stop_machine.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 84571e09c907..01fbae5b97b7 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -293,7 +293,7 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, 
cpu_stop_fn_t fn, void *
 */
smp_call_function_single(min(cpu1, cpu2),
 _cpu_stop_queue_work,
-_args, 0);
+_args, 1);
lg_local_unlock(_cpus_lock);
preempt_enable();
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: smp_call_function_single with wait=0 considered harmful

2014-02-28 Thread Peter Zijlstra
On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote:
> While doing my recent work on the generic smp function calls I noticed
> that smp_call_function_single without the wait flag can't work, as
> it allocates struct call_single_data on stack, and without the wait
> flag will happily return before the IPI has been executed.

It doesn't actually; it uses a per-cpu one in the !wait case.

The subsequent csd_lock() ensures it will wait for any prior user to
complete, so only if you're doing multiple smp_call_function_single()
invocations back-to-back will they queue up.

> This affects the following callers:



>   kernel/stop_machine.c:stop_two_cpus()

That site should work with .wait=1 just fine, but given the above, the
.wait=0 doesn't appear problematic at all.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: smp_call_function_single with wait=0 considered harmful

2014-02-28 Thread Peter Zijlstra
On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote:
 While doing my recent work on the generic smp function calls I noticed
 that smp_call_function_single without the wait flag can't work, as
 it allocates struct call_single_data on stack, and without the wait
 flag will happily return before the IPI has been executed.

It doesn't actually; it uses a per-cpu one in the !wait case.

The subsequent csd_lock() ensures it will wait for any prior user to
complete, so only if you're doing multiple smp_call_function_single()
invocations back-to-back will they queue up.

 This affects the following callers:

snip

   kernel/stop_machine.c:stop_two_cpus()

That site should work with .wait=1 just fine, but given the above, the
.wait=0 doesn't appear problematic at all.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: smp_call_function_single with wait=0 considered harmful

2014-02-28 Thread Peter Zijlstra
On Fri, Feb 28, 2014 at 01:26:24PM +0100, Peter Zijlstra wrote:
 On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote:
kernel/stop_machine.c:stop_two_cpus()
 
 That site should work with .wait=1 just fine, but given the above, the
 .wait=0 doesn't appear problematic at all.

Scratch that; its broken, but not because of smp_call_function_single().

---
Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()

We must use smp_call_function_single(.wait=1) for the
irq_cpu_stop_queue_work() to ensure the queueing is actually done under
stop_cpus_lock. Without this we could have dropped the lock by the time
we do the queueing and get the race we tried to fix.

Fixes: 7053ea1a34fa (stop_machine: Fix race between stop_two_cpus() and 
stop_cpus())
Cc: Prarit Bhargava pra...@redhat.com
Cc: Rik van Riel r...@redhat.com
Cc: Mel Gorman mgor...@suse.de
Signed-off-by: Peter Zijlstra pet...@infradead.org
---
 kernel/stop_machine.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 84571e09c907..01fbae5b97b7 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -293,7 +293,7 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, 
cpu_stop_fn_t fn, void *
 */
smp_call_function_single(min(cpu1, cpu2),
 irq_cpu_stop_queue_work,
-call_args, 0);
+call_args, 1);
lg_local_unlock(stop_cpus_lock);
preempt_enable();
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: smp_call_function_single with wait=0 considered harmful

2014-02-28 Thread Rik van Riel
On 02/28/2014 07:39 AM, Peter Zijlstra wrote:

 Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()
 
 We must use smp_call_function_single(.wait=1) for the
 irq_cpu_stop_queue_work() to ensure the queueing is actually done under
 stop_cpus_lock. Without this we could have dropped the lock by the time
 we do the queueing and get the race we tried to fix.
 
 Fixes: 7053ea1a34fa (stop_machine: Fix race between stop_two_cpus() and 
 stop_cpus())
 Cc: Prarit Bhargava pra...@redhat.com
 Cc: Rik van Riel r...@redhat.com
 Cc: Mel Gorman mgor...@suse.de
 Signed-off-by: Peter Zijlstra pet...@infradead.org

Reviewed-by: Rik van Riel r...@redhat.com

-- 
All rights reversed
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: smp_call_function_single with wait=0 considered harmful

2014-02-28 Thread Prarit Bhargava


On 02/28/2014 07:39 AM, Peter Zijlstra wrote:
 On Fri, Feb 28, 2014 at 01:26:24PM +0100, Peter Zijlstra wrote:
 On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote:
   kernel/stop_machine.c:stop_two_cpus()

 That site should work with .wait=1 just fine, but given the above, the
 .wait=0 doesn't appear problematic at all.
 
 Scratch that; its broken, but not because of smp_call_function_single().
 
 ---
 Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()
 
 We must use smp_call_function_single(.wait=1) for the
 irq_cpu_stop_queue_work() to ensure the queueing is actually done under
 stop_cpus_lock. Without this we could have dropped the lock by the time
 we do the queueing and get the race we tried to fix.
 
 Fixes: 7053ea1a34fa (stop_machine: Fix race between stop_two_cpus() and 
 stop_cpus())
 Cc: Prarit Bhargava pra...@redhat.com
 Cc: Rik van Riel r...@redhat.com
 Cc: Mel Gorman mgor...@suse.de
 Signed-off-by: Peter Zijlstra pet...@infradead.org

Reviewed-by: Prarit Bhargava pra...@redhat.com

P.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: smp_call_function_single with wait=0 considered harmful

2013-12-06 Thread Christoph Hellwig
On Thu, Dec 05, 2013 at 02:43:03PM -0700, Bjorn Helgaas wrote:
> smp_call_function_single() sets "csd = &__get_cpu_var(csd_data)", so
> it's not using a struct on the stack.  We'll queue up "func" and
> likely will return before it is executed, but that should be fine
> because nobody will overwrite csd_data until it *is* executed and
> csd_unlock() has been called.

You're right, I missed the usage of the per-cpu data later in the
function.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: smp_call_function_single with wait=0 considered harmful

2013-12-06 Thread Christoph Hellwig
On Thu, Dec 05, 2013 at 02:43:03PM -0700, Bjorn Helgaas wrote:
 smp_call_function_single() sets csd = __get_cpu_var(csd_data), so
 it's not using a struct on the stack.  We'll queue up func and
 likely will return before it is executed, but that should be fine
 because nobody will overwrite csd_data until it *is* executed and
 csd_unlock() has been called.

You're right, I missed the usage of the per-cpu data later in the
function.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: smp_call_function_single with wait=0 considered harmful

2013-12-05 Thread Bjorn Helgaas
On Wed, Dec 4, 2013 at 9:46 AM, Christoph Hellwig  wrote:
> While doing my recent work on the generic smp function calls I noticed
> that smp_call_function_single without the wait flag can't work, as
> it allocates struct call_single_data on stack, and without the wait
> flag will happily return before the IPI has been executed.

I don't understand the problem yet.  With wait==0,
smp_call_function_single() sets "csd = &__get_cpu_var(csd_data)", so
it's not using a struct on the stack.  We'll queue up "func" and
likely will return before it is executed, but that should be fine
because nobody will overwrite csd_data until it *is* executed and
csd_unlock() has been called.

> This affects the following callers:
>
>   arch/ia64/kernel/mca.c:mca_cpu_callback()
>   arch/ia64/kernel/smpboot.c:ia64_sync_itc()
>   arch/x86/kernel/kvm.c:kvm_cpu_notify()
>   arch/x86/oprofile/nmi_int.c:oprofile_cpu_notifier()
>   arch/x86/pci/amd_bus.c:amd_cpu_notify()

I don't see any reason why amd_cpu_notify() needs to use wait==0.

>   drivers/staging/octeon/ethernet-rx.c:cvm_oct_enable_one_cpu()
>   kernel/stop_machine.c:stop_two_cpus()
>
> It would be good to get these fixed so that we could remove the
> parameter.  Either convert them to wait, or use a preallocated
> call_single_data and __smp_call_function_single.
>
> After that I'd like to remove the wait argument to prevent further
> abuses.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: smp_call_function_single with wait=0 considered harmful

2013-12-05 Thread Bjorn Helgaas
On Wed, Dec 4, 2013 at 9:46 AM, Christoph Hellwig h...@infradead.org wrote:
 While doing my recent work on the generic smp function calls I noticed
 that smp_call_function_single without the wait flag can't work, as
 it allocates struct call_single_data on stack, and without the wait
 flag will happily return before the IPI has been executed.

I don't understand the problem yet.  With wait==0,
smp_call_function_single() sets csd = __get_cpu_var(csd_data), so
it's not using a struct on the stack.  We'll queue up func and
likely will return before it is executed, but that should be fine
because nobody will overwrite csd_data until it *is* executed and
csd_unlock() has been called.

 This affects the following callers:

   arch/ia64/kernel/mca.c:mca_cpu_callback()
   arch/ia64/kernel/smpboot.c:ia64_sync_itc()
   arch/x86/kernel/kvm.c:kvm_cpu_notify()
   arch/x86/oprofile/nmi_int.c:oprofile_cpu_notifier()
   arch/x86/pci/amd_bus.c:amd_cpu_notify()

I don't see any reason why amd_cpu_notify() needs to use wait==0.

   drivers/staging/octeon/ethernet-rx.c:cvm_oct_enable_one_cpu()
   kernel/stop_machine.c:stop_two_cpus()

 It would be good to get these fixed so that we could remove the
 parameter.  Either convert them to wait, or use a preallocated
 call_single_data and __smp_call_function_single.

 After that I'd like to remove the wait argument to prevent further
 abuses.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


smp_call_function_single with wait=0 considered harmful

2013-12-04 Thread Christoph Hellwig
While doing my recent work on the generic smp function calls I noticed
that smp_call_function_single without the wait flag can't work, as
it allocates struct call_single_data on stack, and without the wait
flag will happily return before the IPI has been executed.

This affects the following callers:

  arch/ia64/kernel/mca.c:mca_cpu_callback()
  arch/ia64/kernel/smpboot.c:ia64_sync_itc()
  arch/x86/kernel/kvm.c:kvm_cpu_notify()
  arch/x86/oprofile/nmi_int.c:oprofile_cpu_notifier()
  arch/x86/pci/amd_bus.c:amd_cpu_notify()
  drivers/staging/octeon/ethernet-rx.c:cvm_oct_enable_one_cpu()
  kernel/stop_machine.c:stop_two_cpus()

It would be good to get these fixed so that we could remove the
parameter.  Either convert them to wait, or use a preallocated
call_single_data and __smp_call_function_single.

After that I'd like to remove the wait argument to prevent further
abuses.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


smp_call_function_single with wait=0 considered harmful

2013-12-04 Thread Christoph Hellwig
While doing my recent work on the generic smp function calls I noticed
that smp_call_function_single without the wait flag can't work, as
it allocates struct call_single_data on stack, and without the wait
flag will happily return before the IPI has been executed.

This affects the following callers:

  arch/ia64/kernel/mca.c:mca_cpu_callback()
  arch/ia64/kernel/smpboot.c:ia64_sync_itc()
  arch/x86/kernel/kvm.c:kvm_cpu_notify()
  arch/x86/oprofile/nmi_int.c:oprofile_cpu_notifier()
  arch/x86/pci/amd_bus.c:amd_cpu_notify()
  drivers/staging/octeon/ethernet-rx.c:cvm_oct_enable_one_cpu()
  kernel/stop_machine.c:stop_two_cpus()

It would be good to get these fixed so that we could remove the
parameter.  Either convert them to wait, or use a preallocated
call_single_data and __smp_call_function_single.

After that I'd like to remove the wait argument to prevent further
abuses.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/