On Sat, Apr 18, 2015 at 7:40 PM, Guenter Roeck <li...@roeck-us.net> wrote: > On Sat, Apr 18, 2015 at 04:23:25PM -0700, Guenter Roeck wrote: >> >> my qemu test for arm:vexpress fails with the latest upstream kernel. It fails >> hard - I don't get any output from the console. Bisect points to commit >> 8053871d0f7f ("smp: Fix smp_call_function_single_async() locking"). >> Reverting this commit fixes the problem.
Hmm. It being qemu, can you look at where it seems to lock? > Additional observation: The system boots if I add "-smp cpus=4" to the qemu > options. It does still hang, however, with "-smp cpus=2" and "-smp cpus=3". Funky. That patch still looks obviously correct to me after looking at it again, but I guess we need to revert it if somebody can't see what's wrong. It does make async (wait=0) smp_call_function_single() possibly be *really* asynchronous, ie the 'csd' ends up being released and can be re-used even before the call-single function has completed. That should be a good thing, but I wonder if that triggers some ARM bug. Instead of doing a full revert, what happens if you replace this part: + /* Do we wait until *after* callback? */ + if (csd->flags & CSD_FLAG_SYNCHRONOUS) { + func(info); + csd_unlock(csd); + } else { + csd_unlock(csd); + func(info); + } with just + func(info); + csd_unlock(csd); ie keeping the csd locked until the function has actually completed? I guess for completeness, we should do the same thing for the cpu == smp_processor_id() case (see the "We can unlock early" comment). Now, if that makes a difference, I think it implies a bug in the caller, so it's not the right fix, but it would be an interesting thing to test. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/