[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-27 Thread Pierre van Houtryve via cfe-commits

https://github.com/Pierre-vh closed 
https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-22 Thread Matt Arsenault via cfe-commits


@@ -678,6 +680,49 @@ class SIMemoryLegalizer final : public MachineFunctionPass 
{
   bool runOnMachineFunction(MachineFunction &MF) override;
 };
 
+static const StringMap ASNames = {{
+{"global", SIAtomicAddrSpace::GLOBAL},
+{"local", SIAtomicAddrSpace::LDS},
+}};
+
+void diagnoseUnknownMMRAASName(const MachineInstr &MI, StringRef AS) {
+  const MachineFunction *MF = MI.getMF();
+  const Function &Fn = MF->getFunction();
+  std::string Str;
+  raw_string_ostream OS(Str);

arsenm wrote:

SmallString + raw_svector_ostream? 

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-22 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-22 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-22 Thread Pierre van Houtryve via cfe-commits

Pierre-vh wrote:

> > Then I guess the MMRA should just have "global" and "local" for now, we can 
> > always add more later if needed. What do you think?
> 
> Yes, we don't have specific image counters. They are just vcmnt

Diff has been updated with those changes

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-22 Thread Matt Arsenault via cfe-commits

arsenm wrote:

> Then I guess the MMRA should just have "global" and "local" for now, we can 
> always add more later if needed. What do you think?

Yes, we don't have specific image counters. They are just vcmnt 

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-21 Thread Pierre van Houtryve via cfe-commits

Pierre-vh wrote:

> > I thought image memory = private. It's unclear to me, what AS does OpenCL 
> > IMAGE memory map to in our backend? (But otherwise, yes, MMRA should just 
> > have the backend names, the mapping of the OpenCL IMAGE to a backend AS 
> > should be in the device-lib)
> 
> Images are global memory with magical addressing and value interpretation on 
> load/store. There's nothing private about them

So in our case:

- OpenCL Image = Global
- OpenCL Local = LDS
- OpenCL global = Global 
?

Is that correct?

Then I guess the MMRA should just have "global" and "local" for now, we can 
always add more later if needed. What do you think?

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-21 Thread Matt Arsenault via cfe-commits

arsenm wrote:

> I thought image memory = private. It's unclear to me, what AS does OpenCL 
> IMAGE memory map to in our backend? (But otherwise, yes, MMRA should just 
> have the backend names, the mapping of the OpenCL IMAGE to a backend AS 
> should be in the device-lib)

Images are global memory with magical addressing and value interpretation on 
load/store. There's nothing private about them 

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-20 Thread Pierre van Houtryve via cfe-commits

Pierre-vh wrote:

> > @arsenm Should we use `image` or `private`? We could allow both in the 
> > frontend, and only use `private` as the canonical MMRA.
> 
> I don't understand why image would imply private. I would just keep at as 
> private throughout

I thought image memory = private. It's unclear to me, what AS does OpenCL IMAGE 
memory map to in our backend?
(But otherwise, yes, MMRA should just have the backend names, the mapping of 
the OpenCL IMAGE to a backend AS should be in the device-lib)

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-20 Thread Matt Arsenault via cfe-commits

arsenm wrote:

> @arsenm Should we use `image` or `private`? We could allow both in the 
> frontend, and only use `private` as the canonical MMRA.

I don't understand why image would imply private. I would just keep at as 
private throughout 

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-17 Thread Pierre van Houtryve via cfe-commits

Pierre-vh wrote:

@arsenm Should we use `image` or `private`?
We could allow both in the frontend, and only use `private` as the canonical 
MMRA.

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-14 Thread Sameer Sahasrabuddhe via cfe-commits

https://github.com/ssahasra approved this pull request.

Looks good to me. But I have no opinion about that discussion with whether 
"image" should be available for explicit use!

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-14 Thread Sameer Sahasrabuddhe via cfe-commits


@@ -678,6 +680,50 @@ class SIMemoryLegalizer final : public MachineFunctionPass 
{
   bool runOnMachineFunction(MachineFunction &MF) override;
 };
 
+static const StringMap ASNames = {{
+{"global", SIAtomicAddrSpace::GLOBAL},
+{"local", SIAtomicAddrSpace::LDS},
+{"image", SIAtomicAddrSpace::SCRATCH},
+}};
+
+void diagnoseUnknownMMRAASName(const MachineInstr &MI, StringRef AS) {
+  const MachineFunction *MF = MI.getMF();
+  const Function &Fn = MF->getFunction();
+  std::string Str;
+  raw_string_ostream OS(Str);
+  OS << "unknown address space '" << AS << "'; expected one of ";
+  ListSeparator LS;
+  for (const auto &[Name, Val] : ASNames)
+OS << LS << '\'' << Name << '\'';
+  DiagnosticInfoUnsupported BadTag(Fn, Str, MI.getDebugLoc(), DS_Warning);
+  Fn.getContext().diagnose(BadTag);
+}
+
+/// Reads \p MI's MMRAs to parse the "amdgpu-as" MMRA.
+/// If this tag isn't present, or if it has no meaningful values, returns \p
+/// Default. Otherwise returns all the address spaces concerned by the MMRA.
+static SIAtomicAddrSpace getFenceAddrSpaceMMRA(const MachineInstr &MI,
+   SIAtomicAddrSpace Default) {
+  static constexpr StringLiteral FenceASPrefix = "amdgpu-as";
+
+  auto MMRA = MMRAMetadata(MI.getMMRAMetadata());
+  if (!MMRA)
+return Default;
+
+  SIAtomicAddrSpace Result = SIAtomicAddrSpace::NONE;
+  for (const auto &[Prefix, Suffix] : MMRA) {
+if (Prefix != FenceASPrefix)
+  continue;
+
+if (auto It = ASNames.find(Suffix); It != ASNames.end())

ssahasra wrote:

Wow, I have never considered using a semicolon like this before. But it makes 
so much sense! :)

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-14 Thread Sameer Sahasrabuddhe via cfe-commits

https://github.com/ssahasra edited 
https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-13 Thread Pierre van Houtryve via cfe-commits


@@ -678,6 +680,54 @@ class SIMemoryLegalizer final : public MachineFunctionPass 
{
   bool runOnMachineFunction(MachineFunction &MF) override;
 };
 
+static std::array, 3> ASNames = {{
+{"global", SIAtomicAddrSpace::GLOBAL},
+{"local", SIAtomicAddrSpace::LDS},
+{"image", SIAtomicAddrSpace::SCRATCH},
+}};
+
+void diagnoseUnknownMMRAASName(const MachineInstr &MI, StringRef AS) {
+  const MachineFunction *MF = MI.getMF();
+  const Function &Fn = MF->getFunction();
+  std::string Str;
+  raw_string_ostream OS(Str);
+  OS << "unknown address space '" << AS << "'; expected one of ";
+  ListSeparator LS;
+  for (const auto &[Name, Val] : ASNames)
+OS << LS << '\'' << Name << '\'';
+  DiagnosticInfoUnsupported BadTag(Fn, Str, MI.getDebugLoc(), DS_Warning);
+  Fn.getContext().diagnose(BadTag);
+}
+
+/// Reads \p MI's MMRAs to parse the "amdgpu-as" MMRA.
+/// If this tag isn't present, or if it has no meaningful values, returns \p
+/// Default. Otherwise returns all the address spaces concerned by the MMRA.
+static SIAtomicAddrSpace getFenceAddrSpaceMMRA(const MachineInstr &MI,
+   SIAtomicAddrSpace Default) {
+  static constexpr StringLiteral FenceASPrefix = "amdgpu-as";
+
+  auto MMRA = MMRAMetadata(MI.getMMRAMetadata());
+  if (!MMRA)
+return Default;
+
+  SIAtomicAddrSpace Result = SIAtomicAddrSpace::NONE;
+  for (const auto &[Prefix, Suffix] : MMRA) {
+if (Prefix != FenceASPrefix)
+  continue;
+
+auto It = find_if(ASNames, [Suffix = Suffix](auto &Pair) {

Pierre-vh wrote:

Lambdas can't capture structured bindings (`auto [a, b] = pair`) in C++17, I 
think it comes in C++20 or 23?

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-13 Thread Sameer Sahasrabuddhe via cfe-commits


@@ -678,6 +680,54 @@ class SIMemoryLegalizer final : public MachineFunctionPass 
{
   bool runOnMachineFunction(MachineFunction &MF) override;
 };
 
+static std::array, 3> ASNames = {{
+{"global", SIAtomicAddrSpace::GLOBAL},
+{"local", SIAtomicAddrSpace::LDS},
+{"image", SIAtomicAddrSpace::SCRATCH},
+}};
+
+void diagnoseUnknownMMRAASName(const MachineInstr &MI, StringRef AS) {
+  const MachineFunction *MF = MI.getMF();
+  const Function &Fn = MF->getFunction();
+  std::string Str;
+  raw_string_ostream OS(Str);
+  OS << "unknown address space '" << AS << "'; expected one of ";
+  ListSeparator LS;
+  for (const auto &[Name, Val] : ASNames)
+OS << LS << '\'' << Name << '\'';
+  DiagnosticInfoUnsupported BadTag(Fn, Str, MI.getDebugLoc(), DS_Warning);
+  Fn.getContext().diagnose(BadTag);
+}
+
+/// Reads \p MI's MMRAs to parse the "amdgpu-as" MMRA.
+/// If this tag isn't present, or if it has no meaningful values, returns \p
+/// Default. Otherwise returns all the address spaces concerned by the MMRA.
+static SIAtomicAddrSpace getFenceAddrSpaceMMRA(const MachineInstr &MI,
+   SIAtomicAddrSpace Default) {
+  static constexpr StringLiteral FenceASPrefix = "amdgpu-as";
+
+  auto MMRA = MMRAMetadata(MI.getMMRAMetadata());
+  if (!MMRA)
+return Default;
+
+  SIAtomicAddrSpace Result = SIAtomicAddrSpace::NONE;
+  for (const auto &[Prefix, Suffix] : MMRA) {
+if (Prefix != FenceASPrefix)
+  continue;
+
+auto It = find_if(ASNames, [Suffix = Suffix](auto &Pair) {

ssahasra wrote:

I didn't understand the default assignment of Suffix. Shouldn't the lambda 
capture it simply because the outer symbol is used inside? But anyway, this use 
of find_if shouldn't be necessary with StringMap.

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-13 Thread Sameer Sahasrabuddhe via cfe-commits


@@ -678,6 +680,54 @@ class SIMemoryLegalizer final : public MachineFunctionPass 
{
   bool runOnMachineFunction(MachineFunction &MF) override;
 };
 
+static std::array, 3> ASNames = {{

ssahasra wrote:

Use StringMap for this?

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-13 Thread Pierre van Houtryve via cfe-commits


@@ -4408,6 +4409,42 @@ Target-Specific Extensions
 
 Clang supports some language features conditionally on some targets.
 
+AMDGPU Language Extensions
+--
+
+__builtin_amdgcn_fence
+^^
+
+``__builtin_amdgcn_fence`` emits a fence.
+
+* ``unsigned`` atomic ordering, e.g. ``__ATOMIC_ACQUIRE``
+* ``const char *`` synchronization scope, e.g. ``workgroup``
+* Zero or more ``const char *`` address spaces names.
+
+The address spaces arguments must be string literals with known values, such 
as:
+
+* ``"local"``
+* ``"global"``
+* ``"image"``
+
+If one or more address space name are provided, the code generator will attempt
+to emit potentially faster instructions that only fence those address spaces.
+Emitting such instructions may not always be possible and the compiler is free
+to fence more aggressively.
+
+If no address spaces names are provided, all address spaces are fenced.
+
+.. code-block:: c++
+
+  // Fence all address spaces.
+  __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup");
+  __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent");
+
+  // Fence only requested address spaces.
+  __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local")

Pierre-vh wrote:

This is just for internal use right now, so we can use anything we want. Device 
libs can then map the IMAGE flag to a "private" operand here.

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-13 Thread Matt Arsenault via cfe-commits


@@ -4408,6 +4409,42 @@ Target-Specific Extensions
 
 Clang supports some language features conditionally on some targets.
 
+AMDGPU Language Extensions
+--
+
+__builtin_amdgcn_fence
+^^
+
+``__builtin_amdgcn_fence`` emits a fence.
+
+* ``unsigned`` atomic ordering, e.g. ``__ATOMIC_ACQUIRE``
+* ``const char *`` synchronization scope, e.g. ``workgroup``
+* Zero or more ``const char *`` address spaces names.
+
+The address spaces arguments must be string literals with known values, such 
as:
+
+* ``"local"``
+* ``"global"``
+* ``"image"``
+
+If one or more address space name are provided, the code generator will attempt
+to emit potentially faster instructions that only fence those address spaces.
+Emitting such instructions may not always be possible and the compiler is free
+to fence more aggressively.
+
+If no address spaces names are provided, all address spaces are fenced.
+
+.. code-block:: c++
+
+  // Fence all address spaces.
+  __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup");
+  __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent");
+
+  // Fence only requested address spaces.
+  __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local")

arsenm wrote:

Not sure we can get away without image

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-13 Thread Pierre van Houtryve via cfe-commits


@@ -4408,6 +4409,42 @@ Target-Specific Extensions
 
 Clang supports some language features conditionally on some targets.
 
+AMDGPU Language Extensions
+--
+
+__builtin_amdgcn_fence
+^^
+
+``__builtin_amdgcn_fence`` emits a fence.
+
+* ``unsigned`` atomic ordering, e.g. ``__ATOMIC_ACQUIRE``
+* ``const char *`` synchronization scope, e.g. ``workgroup``
+* Zero or more ``const char *`` address spaces names.
+
+The address spaces arguments must be string literals with known values, such 
as:
+
+* ``"local"``
+* ``"global"``
+* ``"image"``
+
+If one or more address space name are provided, the code generator will attempt
+to emit potentially faster instructions that only fence those address spaces.
+Emitting such instructions may not always be possible and the compiler is free
+to fence more aggressively.
+
+If no address spaces names are provided, all address spaces are fenced.
+
+.. code-block:: c++
+
+  // Fence all address spaces.
+  __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup");
+  __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent");
+
+  // Fence only requested address spaces.
+  __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local")

Pierre-vh wrote:

I wasusing the OpenCL terminology: 
https://registry.khronos.org/OpenCL/sdk/3.0/docs/man/html/atomic_work_item_fence.html

> flags must be set to CLK_GLOBAL_MEM_FENCE, CLK_LOCAL_MEM_FENCE, 
> CLK_IMAGE_MEM_FENCE 

Thinking about it, maybe it's better to use names consistent with our AMDGPUAS 
definitions, so image would just be "private" but local/global stay. What do 
you think?

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-13 Thread Pierre van Houtryve via cfe-commits


@@ -18365,6 +18366,28 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned 
BuiltinID,
   return nullptr;
 }
 
+void CodeGenFunction::AddAMDGCNFenceAddressSpaceMMRA(llvm::Instruction *Inst,
+ const CallExpr *E) {
+  constexpr const char *Tag = "amdgpu-as";

Pierre-vh wrote:

I think it's just more readable this way, but I have no strong preference 
either.
I'd say let's keep it this way, it can very easily be renamed later if we want 
to.

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-07 Thread Matt Arsenault via cfe-commits


@@ -4408,6 +4409,42 @@ Target-Specific Extensions
 
 Clang supports some language features conditionally on some targets.
 
+AMDGPU Language Extensions
+--
+
+__builtin_amdgcn_fence
+^^
+
+``__builtin_amdgcn_fence`` emits a fence.
+
+* ``unsigned`` atomic ordering, e.g. ``__ATOMIC_ACQUIRE``
+* ``const char *`` synchronization scope, e.g. ``workgroup``
+* Zero or more ``const char *`` address spaces names.
+
+The address spaces arguments must be string literals with known values, such 
as:
+
+* ``"local"``
+* ``"global"``
+* ``"image"``
+
+If one or more address space name are provided, the code generator will attempt
+to emit potentially faster instructions that only fence those address spaces.
+Emitting such instructions may not always be possible and the compiler is free
+to fence more aggressively.
+
+If no address spaces names are provided, all address spaces are fenced.
+
+.. code-block:: c++
+
+  // Fence all address spaces.
+  __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup");
+  __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent");
+
+  // Fence only requested address spaces.
+  __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local")

arsenm wrote:

We randomly change between HSA and OpenCL terminology. Maybe we should call 
"local" "groupsegment"? I guess the ISA manuals call it "local data share" 

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-07 Thread Matt Arsenault via cfe-commits


@@ -1,22 +1,113 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --version 4
 // REQUIRES: amdgpu-registered-target
 // RUN: %clang_cc1 %s -emit-llvm -O0 -o - \
-// RUN:   -triple=amdgcn-amd-amdhsa  | opt -S | FileCheck %s
+// RUN:   -triple=amdgcn-amd-amdhsa | FileCheck %s
 
+// CHECK-LABEL: define dso_local void @_Z25test_memory_fence_successv(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:fence syncscope("workgroup") seq_cst
+// CHECK-NEXT:fence syncscope("agent") acquire
+// CHECK-NEXT:fence seq_cst
+// CHECK-NEXT:fence syncscope("agent") acq_rel
+// CHECK-NEXT:fence syncscope("workgroup") release
+// CHECK-NEXT:ret void
+//
 void test_memory_fence_success() {
-  // CHECK-LABEL: test_memory_fence_success
 
-  // CHECK: fence syncscope("workgroup") seq_cst
   __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup");
 
-  // CHECK: fence syncscope("agent") acquire
   __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent");
 
-  // CHECK: fence seq_cst
   __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "");
 
-  // CHECK: fence syncscope("agent") acq_rel
   __builtin_amdgcn_fence(4, "agent");
 
-  // CHECK: fence syncscope("workgroup") release
   __builtin_amdgcn_fence(3, "workgroup");
 }
+
+// CHECK-LABEL: define dso_local void @_Z10test_localv(
+// CHECK-SAME: ) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:fence syncscope("workgroup") seq_cst, !mmra [[META3:![0-9]+]]
+// CHECK-NEXT:fence syncscope("agent") acquire, !mmra [[META3]]
+// CHECK-NEXT:fence seq_cst, !mmra [[META3]]
+// CHECK-NEXT:fence syncscope("agent") acq_rel, !mmra [[META3]]
+// CHECK-NEXT:fence syncscope("workgroup") release, !mmra [[META3]]
+// CHECK-NEXT:ret void
+//
+void test_local() {
+  __builtin_amdgcn_fence( __ATOMIC_SEQ_CST, "workgroup", "local");
+
+  __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent", "local");
+
+  __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "", "local");
+
+  __builtin_amdgcn_fence(4, "agent", "local");
+
+  __builtin_amdgcn_fence(3, "workgroup", "local");
+}
+
+
+// CHECK-LABEL: define dso_local void @_Z11test_globalv(
+// CHECK-SAME: ) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:fence syncscope("workgroup") seq_cst, !mmra [[META4:![0-9]+]]
+// CHECK-NEXT:fence syncscope("agent") acquire, !mmra [[META4]]
+// CHECK-NEXT:fence seq_cst, !mmra [[META4]]
+// CHECK-NEXT:fence syncscope("agent") acq_rel, !mmra [[META4]]
+// CHECK-NEXT:fence syncscope("workgroup") release, !mmra [[META4]]
+// CHECK-NEXT:ret void
+//
+void test_global() {
+  __builtin_amdgcn_fence( __ATOMIC_SEQ_CST, "workgroup", "global");
+
+  __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent", "global");
+
+  __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "", "global");
+
+  __builtin_amdgcn_fence(4, "agent", "global");
+
+  __builtin_amdgcn_fence(3, "workgroup", "global");
+}
+
+// CHECK-LABEL: define dso_local void @_Z10test_imagev(
+// CHECK-SAME: ) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:fence syncscope("workgroup") seq_cst, !mmra [[META5:![0-9]+]]
+// CHECK-NEXT:fence syncscope("agent") acquire, !mmra [[META5]]
+// CHECK-NEXT:fence seq_cst, !mmra [[META5]]
+// CHECK-NEXT:fence syncscope("agent") acq_rel, !mmra [[META5]]
+// CHECK-NEXT:fence syncscope("workgroup") release, !mmra [[META5]]
+// CHECK-NEXT:ret void
+//
+void test_image() {
+  __builtin_amdgcn_fence( __ATOMIC_SEQ_CST, "workgroup", "image");
+
+  __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent", "image");
+
+  __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "", "image");
+
+  __builtin_amdgcn_fence(4, "agent", "image");
+
+  __builtin_amdgcn_fence(3, "workgroup", "image");
+}
+
+// CHECK-LABEL: define dso_local void @_Z10test_mixedv(
+// CHECK-SAME: ) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:fence syncscope("workgroup") seq_cst, !mmra [[META6:![0-9]+]]
+// CHECK-NEXT:fence syncscope("workgroup") seq_cst, !mmra [[META7:![0-9]+]]
+// CHECK-NEXT:ret void
+//
+void test_mixed() {
+  __builtin_amdgcn_fence( __ATOMIC_SEQ_CST, "workgroup", "image", "global");
+  __builtin_amdgcn_fence( __ATOMIC_SEQ_CST, "workgroup", "image", "local", 
"global");
+}

arsenm wrote:

Maybe test repeated AS name 

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-05 Thread Sameer Sahasrabuddhe via cfe-commits


@@ -18365,6 +18366,28 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned 
BuiltinID,
   return nullptr;
 }
 
+void CodeGenFunction::AddAMDGCNFenceAddressSpaceMMRA(llvm::Instruction *Inst,
+ const CallExpr *E) {
+  constexpr const char *Tag = "amdgpu-as";

ssahasra wrote:

Just bikeshedding a bit, but do we really need the "amdgpu" prefix on the tag? 
Clang will only generate these for AMDGPU anyway. It's not a blocker, but feels 
like we are being cautious for no reason.

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-05 Thread Sameer Sahasrabuddhe via cfe-commits


@@ -678,6 +679,59 @@ class SIMemoryLegalizer final : public MachineFunctionPass 
{
   bool runOnMachineFunction(MachineFunction &MF) override;
 };
 
+static std::array, 3> ASNames = {{
+{"global", SIAtomicAddrSpace::GLOBAL},
+{"local", SIAtomicAddrSpace::LDS},
+{"image", SIAtomicAddrSpace::SCRATCH},
+}};
+
+void diagnoseUnknownMMRAASName(const MachineInstr &MI, StringRef AS) {
+  const MachineFunction *MF = MI.getMF();
+  const Function &Fn = MF->getFunction();
+  std::string Str;
+  raw_string_ostream OS(Str);
+  OS << "unknown address space '" << AS << "'; expected one of ";
+  bool IsFirst = true;

ssahasra wrote:

Use ListSeparator from StringExtras.h

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-05 Thread Sameer Sahasrabuddhe via cfe-commits


@@ -18365,6 +18366,28 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned 
BuiltinID,
   return nullptr;
 }
 
+void CodeGenFunction::AddAMDGCNFenceAddressSpaceMMRA(llvm::Instruction *Inst,

ssahasra wrote:

The function immediately below this uses "AMDGPU" in its name. I think that's 
the newer practice. I don't have a strong opinion on this, because the same 
file also has functions which say "AMDGCN" instead. I am not sure which way the 
naming convention is leaning.

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-05 Thread Sameer Sahasrabuddhe via cfe-commits

https://github.com/ssahasra edited 
https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-05 Thread Sameer Sahasrabuddhe via cfe-commits


@@ -4408,6 +4409,42 @@ Target-Specific Extensions
 
 Clang supports some language features conditionally on some targets.
 
+AMDGPU Language Extensions
+--
+
+__builtin_amdgcn_fence
+^^
+
+``__builtin_amdgcn_fence`` emits a fence.
+
+* ``unsigned`` atomic ordering, e.g. ``__ATOMIC_ACQUIRE``
+* ``const char *`` synchronization scope, e.g. ``workgroup``
+* Zero or more ``const char *`` address spaces names.
+
+The address spaces arguments must be string literals with known values, such 
as:
+
+* ``"local"``
+* ``"global"``
+* ``"image"``
+
+If one or more address space name are provided, the code generator will attempt
+to emit potentially faster instructions that only fence those address spaces.

ssahasra wrote:

This use of "fence" as a verb seems a bit too informal. Reword it to say 
"instructions that order access to at least those address spaces"? (Note the 
addition of "at least" to signify a lower bound)

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-05 Thread Sameer Sahasrabuddhe via cfe-commits

https://github.com/ssahasra approved this pull request.

The frontend changes and the MMRA emitted in LLVM IR look good to me. The 
backend changes also look okay, but please see if anyone else has comments 
about that.

https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-03 Thread Pierre van Houtryve via cfe-commits

https://github.com/Pierre-vh edited 
https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-03 Thread Pierre van Houtryve via cfe-commits

https://github.com/Pierre-vh edited 
https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits