[PATCH] D97224: Use Address for CGBuilder's CreateAtomicRMW and CreateAtomicCmpXchg.

John McCall via Phabricator via cfe-commits Thu, 04 Mar 2021 11:28:46 -0800

rjmccall added a comment.

In D97224#2604069 <https://reviews.llvm.org/D97224#2604069>, @jyknight wrote:

> In D97224#2596410 <https://reviews.llvm.org/D97224#2596410>, @rjmccall wrote:
>
>> Do we really consider the existing atomic intrinsics to not impose added 
>> alignment restrictions?  I'm somewhat concerned that we're going to produce 
>> IR here that's either really suboptimal or, worse, unimplementable, just 
>> because we over-interpreted some cue about alignment.  I guess it would only 
>> be a significant problem on a target where types are frequently 
>> under-aligned for what atomics need, which is not typical, or when the user 
>> is doing atomics on a field of something like a packed struct.
>
> For the `__atomic_*` intrinsics, we don't consider those as imposing 
> additional alignment restrictions -- currently, we avoid generating the LLVM 
> IR instruction if it's below natural alignment, since we couldn't specify 
> alignment on the IR instruction. (Now that we have alignment on the LLVM IR 
> operations, I'd like to eventually get rid of that logic from Clang, since 
> it's also handled by LLVM.)

Frontends ultimately have the responsibility of making sure they ultimately 
emit code that follows the platform ABI for atomics.  In most other parts of 
the ABI, we usually find that it is possible (even necessary) to delegate 
*part* of that ABI responsibility down to LLVM — e.g. to emit inline atomic 
sequences, which I suppose frontends could do with inline assembly, but there 
are obvious reasons to prefer a more semantic IR — but that at least in some 
cases, there is information that we cannot pass down and so must handle in the 
frontend.  I am somewhat skeptical that atomics are going to prove an exception 
here.  At the very least, there will always be *some* operations that we have 
to expand into compare-exchange loops in the frontend, for want of a 
sufficiently powerful instruction/intrinsic.  That said, if you find that you 
can actually free Clang from having to make certain decisions here, that's 
great; I just want you to understand that usually we find that there are limits 
to what we can usefully delegate to LLVM, and ultimately the responsibility is 
ours.

> For other intrinsics (e.g. MSVCIntrin::_InterlockedAnd, 
> Builtin::BI__sync_fetch_and_add_4, NVPTX::BI__nvvm_atom_add_gen_i, and the 
> others in those 3 function families), we currently entirely ignore the 
> alignment, and simply assume the argument is naturally-aligned. So, yes, this 
> change could potentially affect the behavior for underaligned types.
>
> So, I could change these to explicitly increase the assumed alignment of 
> their arguments, like I did for InterlockedCompareExchange128. My inclination 
> is not to do so, however...it doesn't seem like a good idea in general to 
> ignore type alignment. But, I'd not be opposed to doing that, if there's a 
> good reason.

The idea here is not to "ignore type alignment".  `EmitPointerWithAlignment` 
will sometimes return an alignment for a pointer that's less than the alignment 
of the pointee type, e.g. because you're taking the address of a packed struct 
field.  The critical question is whether the atomic builtins ought to make an 
effort to honor that reduced alignment, even if it leads to terrible code, or 
if we should treat the use of the builtin as a user promise that the pointer is 
actually more aligned than you might think from the information statically 
available.  That has to be resolved by the semantics of the builtin, and 
unfortunately intrinsic documentation is often spotty on this point.  For 
example, the MSDN documentation for `InterlockedIncrement` says that it 
requires 32-bit alignment, but the documentation for `InterlockedAdd` doesn't.  
It seems extremely unlikely that that is meant to be read as a statement that 
`InterlockedAdd` is actually more permissive.  I would say that all of the 
Interlocked APIs ought to be read as guaranteeing the natural, full-width 
alignment for their operation.  I'm less certain about what to do with the 
`__atomic_*` builtins, because maybe we should take this as an opportunity to 
try to "do the right thing" with under-aligned atomics; on the other hand, that 
assumes that we always *can* "do the right thing", and I don't want Clang to 
start miscompiling or refusing to compile code because we're trying to do 
something above and beyond the normal language semantics.  It might be more 
reasonable to always use the type alignment as a minimum.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97224/new/

https://reviews.llvm.org/D97224

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D97224: Use Address for CGBuilder's CreateAtomicRMW and CreateAtomicCmpXchg.

Reply via email to