On Tue, 16 Jun 2020 08:33:33 +0200 Cornelia Huck <coh...@redhat.com> wrote:
> > #define atomic_cmpxchg__nocheck(ptr, old, new) ({ \ > > > > > > typeof_strip_qual(*ptr) _old = (old); \ > > > > > > (void)__atomic_compare_exchange_n(ptr, &_old, new, false, \ > > > > > > __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST); \ > > > > > > _old; \ > > > > > > }) > > > > ind_old is copied into _old in the macro. Instead of doing the copy from the > > register the compiler reloads the value from memory. The result is that _old > > and ind_old end up having different values. _old in r1 with the bits set > > already and ind_old in r10 with the bits cleared. _old gets updated by CS > > and matches ind_old afterwards - both with the bits being 0. So the != > > compare is false and the loop is left without having set any bits. > > > > > > Paolo (to), > > I am asking myself if it would be safer to add a barrier or something like > > this in the macros in include/qemu/atomic.h. > > I'm also wondering whether this has been seen on other architectures as > well? There are also some callers in non-s390x code, and dealing with > this in common code would catch them as well. Quite a bunch of users use something like old = atomic_read(..), where atomic_read is documented as in docs/devel/atomics.rst: - ``atomic_read()`` and ``atomic_set()``; these prevent the compiler from optimizing accesses out of existence and creating unsolicited accesses, but do not otherwise impose any ordering on loads and stores: both the compiler and the processor are free to reorder them. Maybe I should have used that instead of volatile, but my problem was that I didn't fully understand what atomic_read() does, and if it does more than we need. I found the documentation just now. Another bunch uses constants as old, which is also fine. And there is a third bunch where I don't know whats up, partly because I did not dig deep enough. Regards, Halil