On 9/3/25 15:23, Jakub Jelinek wrote:
Does that sound reasonable?
No, I don't really get what is so hard to pattern match this (obviously you
want to pattern match it after inlining, not before).
I think we have far more complicated pattern matchers in gcc already.
int
foo (int *p, int i)
{
  int o = __atomic_load_n (p, __ATOMIC_RELAXED);
  int q;
  do
    q = o < i ? o : i;
  while (__atomic_compare_exchange_n (p, &o, q, 1, __ATOMIC_RELAXED,
                                      __ATOMIC_RELAXED));
  return o;
}
That is roughly
  _1 = __atomic_load_4 (p_6(D), 0);
  _2 = (int) _1;

  <bb 3> [local count: 1073741824]:
  # o_16 = PHI <_2(2), _15(3)>
  q_9 = MIN_EXPR <i_8(D), o_16>;
  q.1_3 = (unsigned int) q_9;
  _11 = (unsigned int) o_16;
  _12 = .ATOMIC_COMPARE_EXCHANGE (p_6(D), _11, q.1_3, 260, 0, 0);
  _13 = IMAGPART_EXPR <_12>;
  _14 = REALPART_EXPR <_12>;
  _15 = (int) _14;
  if (_13 != 0)
    goto <bb 3>; [99.96%]
  else
    goto <bb 4>; [0.04%]
Obviously, for unsigned types it will be without the casts in there, so it
needs to be a little bit flexible, but not so much (allow casts to the same
precision, for floating point perhaps VCEs). And sure, it should look at
the memory model flags and figure out if the planned replacement is
compatible with those.
Jakub
Ok -- TBH I don't have any extra details on this argument right now and
your point on its feasibility seems quite convincing. (Timezones are
slowing communication with the other compiler team about why it's hard in
their case -- I may get more information later).
The other arguments towards a builtin I know of are less of a
requirement and more about helping keep code standardised throughout the
ecosystem:
- The design of the atomic builtins was to match the requirements for
C++11. It would seem natural to me to keep matching the C++ standard as
it evolves. (In this case also providing users writing C with a
standard interface to use this functionality).
- Specifically for the fetch_min/fetch_max, the paper that proposed it
be standardised discussed the forms of CAS loop that might be written
and how the semantics of two of them are subtly different (section 5 of
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p0493r5.pdf).
If we don't provide a user interface, then each user writing C would have
to deal with the subtleties of atomic synchronisation vs aggressive
optimisations themselves, increasing the possibility of mistakes being made.
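
To make the second point concrete, here is a minimal sketch of two CAS
loops a C user might plausibly write for a fetch-min. The function names
are made up for illustration (they are not an existing or proposed GCC
interface), the memory orders are just one possible choice, and the
reading of the paper in the comments is my understanding rather than a
quotation. Both loops return the old value and leave the minimum in *p;
as I understand it, the difference is whether a call that does not change
the value still performs a store.

```c
/* Hypothetical helper: always attempts the exchange, so every call
   completes with a successful read-modify-write, even when *p is
   already <= i and the stored value is unchanged.  */
int
fetch_min_always_store (int *p, int i)
{
  int o = __atomic_load_n (p, __ATOMIC_RELAXED);
  int q;
  do
    q = o < i ? o : i;
  /* Weak CAS may fail spuriously; on failure o is refreshed from *p
     and q is recomputed.  */
  while (!__atomic_compare_exchange_n (p, &o, q, 1, __ATOMIC_SEQ_CST,
                                       __ATOMIC_RELAXED));
  return o;
}

/* Hypothetical helper: skips the exchange when *p is already <= i.
   In that case the call is only a load, so whatever ordering the
   store side would have provided is not obtained.  */
int
fetch_min_skip_store (int *p, int i)
{
  int o = __atomic_load_n (p, __ATOMIC_SEQ_CST);
  while (i < o
         && !__atomic_compare_exchange_n (p, &o, i, 1, __ATOMIC_SEQ_CST,
                                          __ATOMIC_SEQ_CST))
    ;
  return o;
}
```

In single-threaded use the two are indistinguishable, which is part of
why the difference is easy to miss when each user writes their own loop.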
MM