| Issue | 173993 |
|---|---|
| Summary | [NVPTX] orderings of atomicrmw instructions are silently discarded |
| Labels | new issue |
| Assignees | |
| Reporter | kulst |

I was trying to understand [this](https://github.com/rust-lang/rust/issues/136480) Rust issue. A simplified version in LLVM IR:
```llvm
target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
target triple = "nvptx64-nvidia-cuda"
@COUNTER = internal global [4 x i8] zeroinitializer, align 4
define noundef i32 @atomic_fetch_add() unnamed_addr #1 {
start:
%0 = atomicrmw add ptr @COUNTER, i32 1 acquire, align 4
ret i32 %0
}
attributes #0 = { nofree norecurse noreturn nosync nounwind memory(none) "target-cpu"="sm_120" "target-features"="+ptx87" }
```
Compiled with ```llc --mcpu=sm_120 --mattr=+ptx87```, this produces:
```ptx
.version 8.7
.target sm_120
.address_size 64
// .globl atomic_fetch_add // -- Begin function atomic_fetch_add
.global .align 4 .b8 COUNTER[4];
// @atomic_fetch_add
.visible .func (.param .b32 func_retval0) atomic_fetch_add()
{
.reg .b32 %r<2>;
// %bb.0: // %start
atom.global.add.u32 %r1, [COUNTER], 1;
st.param.b32 [func_retval0], %r1;
ret;
// -- End function
}
```
[[godbolt]](https://godbolt.org/z/Thrqvcz8f)
The `acquire` ordering is silently discarded, even though [the PTX ISA version and the target CPU support it](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=atom#parallel-synchronization-and-communication-instructions-atom).
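
For comparison, here is a hand-written sketch of what I would roughly expect if the ordering were preserved. This is not `llc` output. It assumes the `atom{.sem}{.scope}{.space}.op.type` form from the linked PTX documentation, and it assumes that LLVM's default (system) sync scope would map to `.sys`; the scope the backend should actually pick is an open question here.

```ptx
.version 8.7
.target sm_120
.address_size 64

.global .align 4 .b8 COUNTER[4];

.visible .func (.param .b32 func_retval0) atomic_fetch_add()
{
	.reg .b32 %r<2>;

	// Hand-written, not produced by llc: the same atomic add, but with
	// explicit .acquire semantics and an ASSUMED .sys scope.
	atom.acquire.sys.global.add.u32 %r1, [COUNTER], 1;
	st.param.b32 [func_retval0], %r1;
	ret;
}
```

The only difference from the actual output is the `.acquire.sys` qualifier pair on the `atom` instruction.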
I assume this is a leftover from older GPUs and PTX versions (prior to `sm_70` and `ptx60`), which did not support orderings. For Volta+, proper support for atomic load and atomic store [was added not long ago](https://github.com/llvm/llvm-project/pull/99709).
I was able to follow the `NVPTX` source code. Unlike `ISD::ATOMIC_LOAD` and `ISD::ATOMIC_STORE`, the `ISD::ATOMIC_LOAD_xxx` nodes [are not specially handled](https://github.com/llvm/llvm-project/blob/faf140ae1d0fbd9035f69916078d41bd3674d618/llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp#L99) and are [lowered without ordering](https://github.com/llvm/llvm-project/blob/faf140ae1d0fbd9035f69916078d41bd3674d618/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td#L2422).
However, shouldn't this be an error, or at least a warning?
I also could not find this behavior documented anywhere, and I think it should be (at the very least by this issue).