> > +     (1) The compiler can reorder the load from a to precede the
> > +     atomic_dec().  (2) Because x86 smp_mb__before_atomic() is only
> > +     a compiler barrier, the CPU can reorder the preceding store to
> > +     obj->dead with the later load from a.
> > +
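[ For reference, the problematic sequence being fixed is, as I read
  the context, the variant with a plain load and no barrier after the
  atomic, along the lines of:

        WRITE_ONCE(obj->dead, 1);
        smp_mb__before_atomic();
        atomic_dec(&obj->ref_count);
        r1 = a;         /* plain load: subject to (1) and (2) above */
]
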
> > +     This could be avoided by using READ_ONCE(), which would prevent the
> > +     compiler from reordering due to both atomic_dec() and READ_ONCE()
> > +     being volatile accesses, and is usually preferable for loads from
> > +     shared variables.  However, weakly ordered CPUs would still be
> > +     free to reorder the atomic_dec() with the load from a, so a more
> > +     readable option is to also use smp_mb__after_atomic() as follows:
> > +
> > +   WRITE_ONCE(obj->dead, 1);
> > +   smp_mb__before_atomic();
> > +   atomic_dec(&obj->ref_count);
> > +   smp_mb__after_atomic();
> > +   r1 = READ_ONCE(a);
> 
> The point here is not just "readability", but also the portability of the
> code, isn't it?

The implicit assumption was, I guess, that all weakly ordered CPUs which
are free to reorder the atomic_dec() with the READ_ONCE() execute a full
memory barrier in smp_mb__before_atomic() ...  This assumption currently
holds, AFAICT, but yes: it may well become "non-portable"! ... ;-)
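
To make that assumption concrete, the READ_ONCE()-only variant would
be something like (a sketch, not the patch's code):

        WRITE_ONCE(obj->dead, 1);
        smp_mb__before_atomic();        /* smp_mb() on those CPUs, currently */
        atomic_dec(&obj->ref_count);
        r1 = READ_ONCE(a);              /* no smp_mb__after_atomic() */

where the store to obj->dead is ordered before the load from a only
because smp_mb__before_atomic() happens to imply a full barrier there
(the atomic_dec() itself may still be reordered with the READ_ONCE());
nothing in the documented semantics guarantees this.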

  Andrea
