Issue 173993
Summary [NVPTX] orderings of atomicrmw instructions are silently discarded
Labels new issue
Assignees
Reporter kulst
I was trying to understand [this Rust issue](https://github.com/rust-lang/rust/issues/136480). Here is a simplified reproduction in LLVM IR:

```llvm
target datalayout = "e-p6:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"
target triple = "nvptx64-nvidia-cuda"

@COUNTER = internal global [4 x i8] zeroinitializer, align 4

define noundef i32 @atomic_fetch_add() unnamed_addr #1 {
start:
  %0 = atomicrmw add ptr @COUNTER, i32 1 acquire, align 4
  ret i32 %0
}

attributes #0 = { nofree norecurse noreturn nosync nounwind memory(none) "target-cpu"="sm_120" "target-features"="+ptx87" }
```
Compiling this with `llc --mcpu=sm_120 --mattr=+ptx87` produces:
```ptx
.version 8.7
.target sm_120
.address_size 64

	// .globl	atomic_fetch_add        // -- Begin function atomic_fetch_add
.global .align 4 .b8 COUNTER[4];
 // @atomic_fetch_add
.visible .func  (.param .b32 func_retval0) atomic_fetch_add()
{
	.reg .b32 	%r<2>;

// %bb.0: // %start
	atom.global.add.u32 	%r1, [COUNTER], 1;
	st.param.b32 	[func_retval0], %r1;
	ret;
 // -- End function
}
```
[[godbolt]](https://godbolt.org/z/Thrqvcz8f)
The `acquire` ordering is silently discarded, even though [both the PTX ISA version and the target CPU support it](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=atom#parallel-synchronization-and-communication-instructions-atom).
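
For comparison, the PTX ISA lets `atom` carry explicit semantics and scope qualifiers. The line below is only a sketch of what a lowering that preserves the ordering might look like. It is not output from `llc`, and it assumes the default (system-wide) LLVM syncscope would map to the `.sys` scope:

```ptx
// Hypothetical lowering that keeps the ordering (not actual llc output).
// Assumption: the default LLVM syncscope corresponds to the .sys scope.
atom.acquire.sys.global.add.u32 	%r1, [COUNTER], 1;
```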

I assume this is a leftover from older GPUs and PTX versions (prior to `sm_70` and `ptx60`), which did not support orderings. For Volta and newer, proper support for atomic loads and atomic stores [was added not long ago](https://github.com/llvm/llvm-project/pull/99709).
I followed this through the `NVPTX` source code. Unlike `ISD::ATOMIC_LOAD` and `ISD::ATOMIC_STORE`, the `ISD::ATOMIC_LOAD_xxx` nodes [are not specially handled](https://github.com/llvm/llvm-project/blob/faf140ae1d0fbd9035f69916078d41bd3674d618/llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp#L99) and are [lowered without an ordering](https://github.com/llvm/llvm-project/blob/faf140ae1d0fbd9035f69916078d41bd3674d618/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td#L2422).
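
To make the contrast concrete, here is a sketch of how an ordered atomic load can be spelled in PTX next to the unordered read-modify-write from the output above. The `ld` line is hand-written to illustrate the ISA syntax, not captured from `llc`. The `.sys` scope and the `.b32` type are assumptions:

```ptx
// Illustrative only: an acquire load with explicit semantics and scope.
// Assumptions: .sys scope, .b32 type; not captured compiler output.
ld.acquire.sys.global.b32 	%r1, [COUNTER];
// The atomicrmw from the reproduction, as actually emitted (no ordering):
atom.global.add.u32 	%r1, [COUNTER], 1;
```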

However, shouldn't this be an error, or at least a warning?
I also could not find this behavior documented anywhere, and I think it should be (if only by this issue).
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
