2015-03-19 0:21 GMT+09:00 Mark Rutland <mark.rutl...@arm.com>:
> Hi,
>
>> >         do {
>> >                 tid = this_cpu_read(s->cpu_slab->tid);
>> >                 c = raw_cpu_ptr(s->cpu_slab);
>> > -       } while (IS_ENABLED(CONFIG_PREEMPT) && unlikely(tid != c->tid));
>> > +       } while (IS_ENABLED(CONFIG_PREEMPT) &&
>> > +                unlikely(tid != READ_ONCE(c->tid)));
>
> [...]
>
>> Could you show me generated code again?
>
> The code generated without this patch in !SMP && PREEMPT kernels is:
>
>         /* Hoisted load of c->tid */
>         ffffffc00016d3c4:       f9400404        ldr     x4, [x0,#8]
>         /* this_cpu_read(s->cpu_slab->tid)) -- buggy, see [1] */
>         ffffffc00016d3c8:       f9400401        ldr     x1, [x0,#8]
>         ffffffc00016d3cc:       eb04003f        cmp     x1, x4
>         ffffffc00016d3d0:       54ffffc1        b.ne    ffffffc00016d3c8
> <slab_alloc_node.constprop.82+0x30>
>
> The code generated with this patch in !SMP && PREEMPT kernels is:
>
>         /* this_cpu_read(s->cpu_slab->tid)) -- buggy, see [1] */
>         ffffffc00016d3c4:       f9400401        ldr     x1, [x0,#8]
>         /* load of c->tid */
>         ffffffc00016d3c8:       f9400404        ldr     x4, [x0,#8]
>         ffffffc00016d3cc:       eb04003f        cmp     x1, x4
>         ffffffc00016d3d0:       54ffffa1        b.ne    ffffffc00016d3c4
> <slab_alloc_node.constprop.82+0x2c>
>
> Note that with the patch the branch results in both loads being
> performed again.
>
> Given that in !SMP kernels we know that the loads _must_ happen on the
> same CPU, I think we could go a bit further with the loop condition:
>
>         while (IS_ENABLED(CONFIG_PREEMPT) &&
>                !IS_ENABLED(CONFIG_SMP) &&
>                unlikely(tid != READ_ONCE(c->tid)));
>
> The barrier afterwards should be sufficient to order the load of the tid
> against subsequent accesses to the other cpu_slab fields.
>
>> What we need to check is redoing whole things in the loop.
>> Previous attached code seems to me that it already did
>> refetching c->tid in the loop and this patch looks only handle
>> refetching c->tid.
>
> The refetch in the loop is this_cpu_read(s->cpu_slab->tid), not the load
> of c->tid (which is hoisted above the loop).

Okay. Now, I'm fine with your change.

>> READ_ONCE(c->tid) will trigger redoing 'tid =
>> this_cpu_read(s->cpu_slab->tid)'?
>
> I was under the impression that this_cpu operations would always result
> in an access, much like the *_ONCE accessors, so we should always redo
> the access for this_cpu_read(s->cpu_slab->tid). Is that not the case?

I'm not an expert on that operation. Christoph could answer that.

> Mark.
>
> [1] The arm64 this_cpu * operations are currently buggy. We generate the
>     percpu address into a register, then perform the access with
>     separate instructions (and could be preempted between the two).
>     Steve Capper is currently fixing this.
>
>     However, the hoisting of the c->tid load could happen regardless,
>     whenever raw_cpu_ptr(c) can be evaluated at compile time.

Okay. Thanks.
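
For reference, the hoisting problem and the effect of READ_ONCE() discussed above can be sketched in user space. This is only a sketch under stated assumptions, not the kernel code: READ_ONCE_SKETCH, struct cpu_slab_sketch, the_slab, read_tid(), slab_ptr() and snapshot_tid() are hypothetical names made up for illustration, and the macro relies on the GNU C __typeof__ extension (GCC and Clang). The volatile access plays the role of READ_ONCE(): the compiler must emit a load of c->tid on every loop iteration, so it can no longer hoist that load above the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's READ_ONCE(): reading through a
 * volatile-qualified lvalue forces a real load at each evaluation.
 * Assumption: GNU C __typeof__ is available (GCC, Clang). */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, cut-down per-CPU slab state; only tid matters here. */
struct cpu_slab_sketch {
	unsigned long tid;
};

/* One instance stands in for the per-CPU area of a !SMP system. */
static struct cpu_slab_sketch the_slab = { .tid = 0 };

/* Stand-in for this_cpu_read(s->cpu_slab->tid). */
static unsigned long read_tid(void)
{
	return READ_ONCE_SKETCH(the_slab.tid);
}

/* Stand-in for raw_cpu_ptr(s->cpu_slab). With a single instance the
 * address is a compile-time constant, which is the situation in which
 * a plain c->tid load may be hoisted out of the loop. */
static struct cpu_slab_sketch *slab_ptr(void)
{
	return &the_slab;
}

/* Retry until tid and the slab pointer were sampled consistently.
 * Both loads are redone on every iteration because both go through
 * volatile accesses. */
unsigned long snapshot_tid(struct cpu_slab_sketch **cp)
{
	unsigned long tid;
	struct cpu_slab_sketch *c;

	assert(cp != NULL);
	do {
		tid = read_tid();
		c = slab_ptr();
	} while (tid != READ_ONCE_SKETCH(c->tid));

	*cp = c;
	return tid;
}
```

In a single-threaded program the loop exits on the first pass; the point is only the generated code, which can be compared against a variant using a plain c->tid in the loop condition.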