On Wed, 4 Oct 2017 11:08:30 +0200 Peter Zijlstra <[email protected]> wrote:
> On Tue, Oct 03, 2017 at 06:24:22PM -0400, Steven Rostedt wrote: > > On Thu, 28 Sep 2017 14:18:26 +0200 > > Peter Zijlstra <[email protected]> wrote: > > > static int early_vprintk(const char *fmt, va_list args) > > > { > > > + int n, cpu, old; > > > char buf[512]; > > > + > > > + cpu = get_cpu(); > > > + /* > > > + * Test-and-Set inter-cpu spinlock with recursion. > > > + */ > > > + for (;;) { > > > + /* > > > + * c-cas to avoid the exclusive bouncing on spin. > > > + * Depends on the memory barrier implied by cmpxchg > > > + * for ACQUIRE semantics. > > > + */ > > > + old = READ_ONCE(early_printk_cpu); > > > + if (old == -1) { > > > > If old != -1 and old != cpu, is it possible that the CPU could have > > fetched an old value, and never try to fetch it again? > > What? If old != -1 and old != cpu, we'll hit the cpu_relax() and do the > READ_ONCE() again. The READ_ONCE() guarantees we'll do the load again, > as does the barrier() implied by cpu_relax(). I'm more worried about other architectures that don't have as strong of a cache coherency. [ Added Paul as he knows a lot about odd architectures ] Is there any architecture that we support that can have the following: CPU0 CPU1 ---- ---- early_printk_cpu = 1 for (;;) old = READ_ONCE(early_printk_cpu); [ old = 1 ] early_printk_cpu = -1 [...] cpu_relax(); old = READ_ONCE(early_printk_cpu); [ but the CPU uses the cache and not the memory? ] old = 1; -- Steve > > > The cmpxchg memory barrier only happens when old == -1. > > Yeah, so? > > > > + old = cmpxchg(&early_printk_cpu, -1, cpu); > > > + if (old == -1) > > > + break; > > > + } > > > + /* > > > + * Allow recursion for interrupts and the like. > > > + */ > > > + if (old == cpu) > > > + break; > > > + > > > + cpu_relax(); > > > + } > > > > > > n = vscnprintf(buf, sizeof(buf), fmt, args); > > > early_console->write(early_console, buf, n); > > > > > > + /* > > > + * Unlock -- in case @old == @cpu, this is a no-op. > > > + */ > > > + smp_store_release(&early_printk_cpu, old); > > > + put_cpu(); > > > + > > > return n; > > > }

