Re: smp_call_function_single with wait=0 considered harmful
On 02/28/2014 07:39 AM, Peter Zijlstra wrote: > On Fri, Feb 28, 2014 at 01:26:24PM +0100, Peter Zijlstra wrote: >> On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote: >>> kernel/stop_machine.c:stop_two_cpus() >> >> That site should work with .wait=1 just fine, but given the above, the >> .wait=0 doesn't appear problematic at all. > > Scratch that; its broken, but not because of smp_call_function_single(). > > --- > Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus() > > We must use smp_call_function_single(.wait=1) for the > irq_cpu_stop_queue_work() to ensure the queueing is actually done under > stop_cpus_lock. Without this we could have dropped the lock by the time > we do the queueing and get the race we tried to fix. > > Fixes: 7053ea1a34fa ("stop_machine: Fix race between stop_two_cpus() and > stop_cpus()") > Cc: Prarit Bhargava > Cc: Rik van Riel > Cc: Mel Gorman > Signed-off-by: Peter Zijlstra Reviewed-by: Prarit Bhargava P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: smp_call_function_single with wait=0 considered harmful
On 02/28/2014 07:39 AM, Peter Zijlstra wrote: > Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus() > > We must use smp_call_function_single(.wait=1) for the > irq_cpu_stop_queue_work() to ensure the queueing is actually done under > stop_cpus_lock. Without this we could have dropped the lock by the time > we do the queueing and get the race we tried to fix. > > Fixes: 7053ea1a34fa ("stop_machine: Fix race between stop_two_cpus() and > stop_cpus()") > Cc: Prarit Bhargava > Cc: Rik van Riel > Cc: Mel Gorman > Signed-off-by: Peter Zijlstra Reviewed-by: Rik van Riel -- All rights reversed -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: smp_call_function_single with wait=0 considered harmful
On Fri, Feb 28, 2014 at 01:26:24PM +0100, Peter Zijlstra wrote: > On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote: > > kernel/stop_machine.c:stop_two_cpus() > > That site should work with .wait=1 just fine, but given the above, the > .wait=0 doesn't appear problematic at all. Scratch that; its broken, but not because of smp_call_function_single(). --- Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus() We must use smp_call_function_single(.wait=1) for the irq_cpu_stop_queue_work() to ensure the queueing is actually done under stop_cpus_lock. Without this we could have dropped the lock by the time we do the queueing and get the race we tried to fix. Fixes: 7053ea1a34fa ("stop_machine: Fix race between stop_two_cpus() and stop_cpus()") Cc: Prarit Bhargava Cc: Rik van Riel Cc: Mel Gorman Signed-off-by: Peter Zijlstra --- kernel/stop_machine.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c index 84571e09c907..01fbae5b97b7 100644 --- a/kernel/stop_machine.c +++ b/kernel/stop_machine.c @@ -293,7 +293,7 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void * */ smp_call_function_single(min(cpu1, cpu2), _cpu_stop_queue_work, -_args, 0); +_args, 1); lg_local_unlock(_cpus_lock); preempt_enable(); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: smp_call_function_single with wait=0 considered harmful
On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote: > While doing my recent work on the generic smp function calls I noticed > that smp_call_function_single without the wait flag can't work, as > it allocates struct call_single_data on stack, and without the wait > flag will happily return before the IPI has been executed. It doesn't actually; it uses a per-cpu one in the !wait case. The subsequent csd_lock() ensures it will wait for any prior user to complete, so only if you're doing multiple smp_call_function_single() invocations back-to-back will they queue up. > This affects the following callers: > kernel/stop_machine.c:stop_two_cpus() That site should work with .wait=1 just fine, but given the above, the .wait=0 doesn't appear problematic at all. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: smp_call_function_single with wait=0 considered harmful
On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote: While doing my recent work on the generic smp function calls I noticed that smp_call_function_single without the wait flag can't work, as it allocates struct call_single_data on stack, and without the wait flag will happily return before the IPI has been executed. It doesn't actually; it uses a per-cpu one in the !wait case. The subsequent csd_lock() ensures it will wait for any prior user to complete, so only if you're doing multiple smp_call_function_single() invocations back-to-back will they queue up. This affects the following callers: snip kernel/stop_machine.c:stop_two_cpus() That site should work with .wait=1 just fine, but given the above, the .wait=0 doesn't appear problematic at all. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: smp_call_function_single with wait=0 considered harmful
On Fri, Feb 28, 2014 at 01:26:24PM +0100, Peter Zijlstra wrote: On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote: kernel/stop_machine.c:stop_two_cpus() That site should work with .wait=1 just fine, but given the above, the .wait=0 doesn't appear problematic at all. Scratch that; its broken, but not because of smp_call_function_single(). --- Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus() We must use smp_call_function_single(.wait=1) for the irq_cpu_stop_queue_work() to ensure the queueing is actually done under stop_cpus_lock. Without this we could have dropped the lock by the time we do the queueing and get the race we tried to fix. Fixes: 7053ea1a34fa (stop_machine: Fix race between stop_two_cpus() and stop_cpus()) Cc: Prarit Bhargava pra...@redhat.com Cc: Rik van Riel r...@redhat.com Cc: Mel Gorman mgor...@suse.de Signed-off-by: Peter Zijlstra pet...@infradead.org --- kernel/stop_machine.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c index 84571e09c907..01fbae5b97b7 100644 --- a/kernel/stop_machine.c +++ b/kernel/stop_machine.c @@ -293,7 +293,7 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void * */ smp_call_function_single(min(cpu1, cpu2), irq_cpu_stop_queue_work, -call_args, 0); +call_args, 1); lg_local_unlock(stop_cpus_lock); preempt_enable(); -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: smp_call_function_single with wait=0 considered harmful
On 02/28/2014 07:39 AM, Peter Zijlstra wrote: Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus() We must use smp_call_function_single(.wait=1) for the irq_cpu_stop_queue_work() to ensure the queueing is actually done under stop_cpus_lock. Without this we could have dropped the lock by the time we do the queueing and get the race we tried to fix. Fixes: 7053ea1a34fa (stop_machine: Fix race between stop_two_cpus() and stop_cpus()) Cc: Prarit Bhargava pra...@redhat.com Cc: Rik van Riel r...@redhat.com Cc: Mel Gorman mgor...@suse.de Signed-off-by: Peter Zijlstra pet...@infradead.org Reviewed-by: Rik van Riel r...@redhat.com -- All rights reversed -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: smp_call_function_single with wait=0 considered harmful
On 02/28/2014 07:39 AM, Peter Zijlstra wrote: On Fri, Feb 28, 2014 at 01:26:24PM +0100, Peter Zijlstra wrote: On Wed, Dec 04, 2013 at 08:46:27AM -0800, Christoph Hellwig wrote: kernel/stop_machine.c:stop_two_cpus() That site should work with .wait=1 just fine, but given the above, the .wait=0 doesn't appear problematic at all. Scratch that; its broken, but not because of smp_call_function_single(). --- Subject: stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus() We must use smp_call_function_single(.wait=1) for the irq_cpu_stop_queue_work() to ensure the queueing is actually done under stop_cpus_lock. Without this we could have dropped the lock by the time we do the queueing and get the race we tried to fix. Fixes: 7053ea1a34fa (stop_machine: Fix race between stop_two_cpus() and stop_cpus()) Cc: Prarit Bhargava pra...@redhat.com Cc: Rik van Riel r...@redhat.com Cc: Mel Gorman mgor...@suse.de Signed-off-by: Peter Zijlstra pet...@infradead.org Reviewed-by: Prarit Bhargava pra...@redhat.com P. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: smp_call_function_single with wait=0 considered harmful
On Thu, Dec 05, 2013 at 02:43:03PM -0700, Bjorn Helgaas wrote: > smp_call_function_single() sets "csd = &__get_cpu_var(csd_data)", so > it's not using a struct on the stack. We'll queue up "func" and > likely will return before it is executed, but that should be fine > because nobody will overwrite csd_data until it *is* executed and > csd_unlock() has been called. You're right, I missed the usage of the per-cpu data later in the function. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: smp_call_function_single with wait=0 considered harmful
On Thu, Dec 05, 2013 at 02:43:03PM -0700, Bjorn Helgaas wrote: smp_call_function_single() sets csd = __get_cpu_var(csd_data), so it's not using a struct on the stack. We'll queue up func and likely will return before it is executed, but that should be fine because nobody will overwrite csd_data until it *is* executed and csd_unlock() has been called. You're right, I missed the usage of the per-cpu data later in the function. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: smp_call_function_single with wait=0 considered harmful
On Wed, Dec 4, 2013 at 9:46 AM, Christoph Hellwig wrote: > While doing my recent work on the generic smp function calls I noticed > that smp_call_function_single without the wait flag can't work, as > it allocates struct call_single_data on stack, and without the wait > flag will happily return before the IPI has been executed. I don't understand the problem yet. With wait==0, smp_call_function_single() sets "csd = &__get_cpu_var(csd_data)", so it's not using a struct on the stack. We'll queue up "func" and likely will return before it is executed, but that should be fine because nobody will overwrite csd_data until it *is* executed and csd_unlock() has been called. > This affects the following callers: > > arch/ia64/kernel/mca.c:mca_cpu_callback() > arch/ia64/kernel/smpboot.c:ia64_sync_itc() > arch/x86/kernel/kvm.c:kvm_cpu_notify() > arch/x86/oprofile/nmi_int.c:oprofile_cpu_notifier() > arch/x86/pci/amd_bus.c:amd_cpu_notify() I don't see any reason why amd_cpu_notify() needs to use wait==0. > drivers/staging/octeon/ethernet-rx.c:cvm_oct_enable_one_cpu() > kernel/stop_machine.c:stop_two_cpus() > > It would be good to get these fixed so that we could remove the > parameter. Either convert them to wait, or use a preallocated > call_single_data and __smp_call_function_single. > > After that I'd like to remove the wait argument to prevent further > abuses. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: smp_call_function_single with wait=0 considered harmful
On Wed, Dec 4, 2013 at 9:46 AM, Christoph Hellwig h...@infradead.org wrote: While doing my recent work on the generic smp function calls I noticed that smp_call_function_single without the wait flag can't work, as it allocates struct call_single_data on stack, and without the wait flag will happily return before the IPI has been executed. I don't understand the problem yet. With wait==0, smp_call_function_single() sets csd = __get_cpu_var(csd_data), so it's not using a struct on the stack. We'll queue up func and likely will return before it is executed, but that should be fine because nobody will overwrite csd_data until it *is* executed and csd_unlock() has been called. This affects the following callers: arch/ia64/kernel/mca.c:mca_cpu_callback() arch/ia64/kernel/smpboot.c:ia64_sync_itc() arch/x86/kernel/kvm.c:kvm_cpu_notify() arch/x86/oprofile/nmi_int.c:oprofile_cpu_notifier() arch/x86/pci/amd_bus.c:amd_cpu_notify() I don't see any reason why amd_cpu_notify() needs to use wait==0. drivers/staging/octeon/ethernet-rx.c:cvm_oct_enable_one_cpu() kernel/stop_machine.c:stop_two_cpus() It would be good to get these fixed so that we could remove the parameter. Either convert them to wait, or use a preallocated call_single_data and __smp_call_function_single. After that I'd like to remove the wait argument to prevent further abuses. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
smp_call_function_single with wait=0 considered harmful
While doing my recent work on the generic smp function calls I noticed that smp_call_function_single without the wait flag can't work, as it allocates struct call_single_data on stack, and without the wait flag will happily return before the IPI has been executed. This affects the following callers: arch/ia64/kernel/mca.c:mca_cpu_callback() arch/ia64/kernel/smpboot.c:ia64_sync_itc() arch/x86/kernel/kvm.c:kvm_cpu_notify() arch/x86/oprofile/nmi_int.c:oprofile_cpu_notifier() arch/x86/pci/amd_bus.c:amd_cpu_notify() drivers/staging/octeon/ethernet-rx.c:cvm_oct_enable_one_cpu() kernel/stop_machine.c:stop_two_cpus() It would be good to get these fixed so that we could remove the parameter. Either convert them to wait, or use a preallocated call_single_data and __smp_call_function_single. After that I'd like to remove the wait argument to prevent further abuses. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
smp_call_function_single with wait=0 considered harmful
While doing my recent work on the generic smp function calls I noticed that smp_call_function_single without the wait flag can't work, as it allocates struct call_single_data on stack, and without the wait flag will happily return before the IPI has been executed. This affects the following callers: arch/ia64/kernel/mca.c:mca_cpu_callback() arch/ia64/kernel/smpboot.c:ia64_sync_itc() arch/x86/kernel/kvm.c:kvm_cpu_notify() arch/x86/oprofile/nmi_int.c:oprofile_cpu_notifier() arch/x86/pci/amd_bus.c:amd_cpu_notify() drivers/staging/octeon/ethernet-rx.c:cvm_oct_enable_one_cpu() kernel/stop_machine.c:stop_two_cpus() It would be good to get these fixed so that we could remove the parameter. Either convert them to wait, or use a preallocated call_single_data and __smp_call_function_single. After that I'd like to remove the wait argument to prevent further abuses. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/