On Thu, 18 Feb 2021 at 15:15, Jason A. Donenfeld <ja...@zx2c4.com> wrote:
>
[...]
> >
> > > +static void __wg_prev_queue_enqueue(struct prev_queue *queue, struct sk_buff *skb)
> > > +{
> > > +	WRITE_ONCE(NEXT(skb), NULL);
> > > +	WRITE_ONCE(NEXT(xchg_release(&queue->head, skb)), skb);
> > > +}
> > >
> > > Look good?
> > >
> >
> > Yes, exactly like that!
>
> The downside is that on armv7, this becomes a dmb(ish) instead of a
> dmb(ishst). But I was unable to measure any actual difference anyway,
> and the atomic bounded increment is already more expensive, so I think
> it's okay.
>

Who cares about armv7!? The world is moving to Armv8/LSE, where we'll
end up with one fine "swpl" in this case, w/o any explicit (well...)
fence. ;-P

On a more serious note, it does make sense to base the decision on
benchmarks. OTOH I'd guess that the systems that mostly benefit from
this memory saving patch are x86_64, where the smp_wmb()/xchg_relaxed()
and xchg_release() are identical.


Björn
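
For readers without a kernel tree at hand, the two orderings compared
above can be approximated in user space with C11 atomics. This is only
a sketch under stated assumptions, not the WireGuard code: struct node
and struct queue are hypothetical stand-ins for struct sk_buff and
struct prev_queue, and C11 has no store-only fence equivalent to
smp_wmb(), so a release fence stands in for it in the first variant.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sk_buff and struct prev_queue. */
struct node {
	_Atomic(struct node *) next;
	int value;
};

struct queue {
	_Atomic(struct node *) head;	/* producers swap themselves in here */
	struct node stub;		/* dummy first element */
};

static void queue_init(struct queue *q)
{
	atomic_store_explicit(&q->stub.next, NULL, memory_order_relaxed);
	atomic_store_explicit(&q->head, &q->stub, memory_order_relaxed);
}

/*
 * Variant A: explicit fence plus relaxed exchange, roughly the shape of
 * smp_wmb() followed by xchg_relaxed(). The release fence is an
 * approximation, since C11 offers no store-store-only fence.
 */
static void enqueue_fence(struct queue *q, struct node *n)
{
	struct node *prev;

	assert(n != NULL);
	atomic_store_explicit(&n->next, NULL, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	prev = atomic_exchange_explicit(&q->head, n, memory_order_relaxed);
	atomic_store_explicit(&prev->next, n, memory_order_relaxed);
}

/*
 * Variant B: the ordering is folded into the exchange itself, roughly
 * the shape of xchg_release(). The final store is relaxed to mirror
 * WRITE_ONCE(); a strictly portable multi-threaded C11 queue would
 * likely want a release store there, because C11 does not promise the
 * dependency ordering that the kernel memory model provides.
 */
static void enqueue_release(struct queue *q, struct node *n)
{
	struct node *prev;

	assert(n != NULL);
	atomic_store_explicit(&n->next, NULL, memory_order_relaxed);
	prev = atomic_exchange_explicit(&q->head, n, memory_order_release);
	atomic_store_explicit(&prev->next, n, memory_order_relaxed);
}
```

Used from a single thread, both variants build the same list; any
difference is in the barrier instructions the compiler emits for each
target, which is why comparing them by benchmark is the sensible test.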