On Tue, Sep 02, 2025 at 07:56:58AM +0200, Jakub Jelinek wrote:
> On Tue, Sep 02, 2025 at 09:32:15AM +0530, Surya Kumari Jangala wrote:
> > Ping.
> > 
> > Please review.
> 
> IMHO we shouldn't introduce new builtins for this, but instead
> use new flag bits in the upper bits of the memorder arguments.
> See how x86 uses __ATOMIC_HLE_ACQUIRE and __ATOMIC_HLE_RELEASE
> in there.  x86-Specific Memory Model Extensions for Transactional Memory
> in the documentation provides an example.

This doesn't change the memorder (or anything else): a load-locked with
EH=1 is semantically equivalent to a load-locked with EH=0.  It just
behaves differently in the microarchitecture so there is a different
performance profile with it.  EH=1 tells the implementation to optimise
for the case that no other agent is involved at all:

        EH = 1 should be used when the program is obtaining
        a lock variable which it will subsequently release
        before another program attempts to perform a store
        to it. When contention for a lock is significant,
        using this hint may reduce the number of times a
        cache block is transferred between processor caches.

Hiding this in an unrelated parameter is obfuscation.  I'd rather not.

(EH=1 is a variant of the larx insn, the LL in LL/SC.  This patch adds
a __atomic_compare_exchange_local which has the same semantics as
__atomic_compare_exchange, just with possibly different performance.)
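
For illustration, a minimal sketch of the kind of lock acquisition the
hint is aimed at, written with the existing builtin only.  The
demo_lock type and the demo_* function names are made up for this
example, and the exact signature of the proposed _local variant is an
assumption here (taken to mirror the existing builtin, since the
semantics are the same):

```c
#include <stdbool.h>

/* Hypothetical lock word for illustration: 0 = free, 1 = held.  */
typedef struct { int word; } demo_lock;

/* Try once to take the lock; return true on success.  */
static bool
demo_trylock (demo_lock *l)
{
  int expected = 0;

  /* Strong compare-and-swap: acquire ordering on success, relaxed on
     failure.  On Power this is expected to expand to a larx/stcx.
     sequence.  The proposed __atomic_compare_exchange_local would be
     used in the same place (assumed: with the same arguments), so the
     memorder arguments stay exactly as they are here.  */
  return __atomic_compare_exchange_n (&l->word, &expected, 1, false,
                                      __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* Release the lock with a plain release store.  */
static void
demo_unlock (demo_lock *l)
{
  __atomic_store_n (&l->word, 0, __ATOMIC_RELEASE);
}
```

The caller is expected to pair demo_trylock with demo_unlock shortly
afterwards, which is the usage pattern the quoted text describes for
EH=1.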


Segher
