https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82400
--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #0)
> The scheme is the same for all operators, using the compare-and-swap atomic
> ptx instruction (atom.cas).
>
> However, some of the operators are supported natively for the ptx:
> ...
> .op = { .and, .or, .xor, .cas, .exch, .add, .inc, .dec, .min, .max };
> ...
> so for f.i. addition we could use an atom.add instead.

[ As implied by the nvptx_reduction_update todo:
  optimize for atomic ops and independent complex ops. ]
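
To illustrate the difference between the two schemes, here is a rough host-side sketch in C11 atomics. It is not nvptx backend code and not what nvptx_reduction_update emits. The names reduce_add_cas, reduce_add_native and check_add_paths_agree are made up for this example. The first function mirrors the generic compare-and-swap retry loop (the atom.cas scheme), the second the single native read-modify-write (the atom.add idea).

```c
#include <assert.h>
#include <stdatomic.h>

/* Generic scheme (hypothetical helper): load the current value, compute
   the update, and try to publish it with compare-and-swap.  On failure
   'expected' is refreshed with the current contents and we retry.
   Returns the value seen just before the successful update.  */
static int
reduce_add_cas (_Atomic int *target, int value)
{
  int expected = atomic_load (target);
  while (!atomic_compare_exchange_weak (target, &expected, expected + value))
    ;
  return expected;
}

/* Native scheme (hypothetical helper): a single atomic add, no retry
   loop.  Also returns the previous value.  */
static int
reduce_add_native (_Atomic int *target, int value)
{
  return atomic_fetch_add (target, value);
}

/* Both paths should leave the same result in the target.  */
static void
check_add_paths_agree (void)
{
  _Atomic int a = 10, b = 10;
  reduce_add_cas (&a, 5);
  reduce_add_native (&b, 5);
  assert (atomic_load (&a) == atomic_load (&b));
}
```

The loop form works for any operator, which is presumably why it is used as the generic scheme. The native form only applies to operators that the target supports directly.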