[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
https://github.com/Pierre-vh closed https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -678,6 +680,49 @@ class SIMemoryLegalizer final : public MachineFunctionPass { bool runOnMachineFunction(MachineFunction &MF) override; }; +static const StringMap ASNames = {{ +{"global", SIAtomicAddrSpace::GLOBAL}, +{"local", SIAtomicAddrSpace::LDS}, +}}; + +void diagnoseUnknownMMRAASName(const MachineInstr &MI, StringRef AS) { + const MachineFunction *MF = MI.getMF(); + const Function &Fn = MF->getFunction(); + std::string Str; + raw_string_ostream OS(Str); arsenm wrote: SmallString + raw_svector_ostream? https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
Pierre-vh wrote: > > Then I guess the MMRA should just have "global" and "local" for now, we can > > always add more later if needed. What do you think? > > Yes, we don't have specific image counters. They are just vcmnt Diff has been updated with those changes https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
arsenm wrote: > Then I guess the MMRA should just have "global" and "local" for now, we can > always add more later if needed. What do you think? Yes, we don't have specific image counters. They are just vcmnt https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
Pierre-vh wrote: > > I thought image memory = private. It's unclear to me, what AS does OpenCL > > IMAGE memory map to in our backend? (But otherwise, yes, MMRA should just > > have the backend names, the mapping of the OpenCL IMAGE to a backend AS > > should be in the device-lib) > > Images are global memory with magical addressing and value interpretation on > load/store. There's nothing private about them So in our case: - OpenCL Image = Global - OpenCL Local = LDS - OpenCL global = Global ? Is that correct? Then I guess the MMRA should just have "global" and "local" for now, we can always add more later if needed. What do you think? https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
arsenm wrote: > I thought image memory = private. It's unclear to me, what AS does OpenCL > IMAGE memory map to in our backend? (But otherwise, yes, MMRA should just > have the backend names, the mapping of the OpenCL IMAGE to a backend AS > should be in the device-lib) Images are global memory with magical addressing and value interpretation on load/store. There's nothing private about them https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
Pierre-vh wrote: > > @arsenm Should we use `image` or `private`? We could allow both in the > > frontend, and only use `private` as the canonical MMRA. > > I don't understand why image would imply private. I would just keep at as > private throughout I thought image memory = private. It's unclear to me, what AS does OpenCL IMAGE memory map to in our backend? (But otherwise, yes, MMRA should just have the backend names, the mapping of the OpenCL IMAGE to a backend AS should be in the device-lib) https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
arsenm wrote: > @arsenm Should we use `image` or `private`? We could allow both in the > frontend, and only use `private` as the canonical MMRA. I don't understand why image would imply private. I would just keep at as private throughout https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
Pierre-vh wrote: @arsenm Should we use `image` or `private`? We could allow both in the frontend, and only use `private` as the canonical MMRA. https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
https://github.com/ssahasra approved this pull request. Looks good to me. But I have no opinion about that discussion with whether "image" should be available for explicit use! https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -678,6 +680,50 @@ class SIMemoryLegalizer final : public MachineFunctionPass { bool runOnMachineFunction(MachineFunction &MF) override; }; +static const StringMap ASNames = {{ +{"global", SIAtomicAddrSpace::GLOBAL}, +{"local", SIAtomicAddrSpace::LDS}, +{"image", SIAtomicAddrSpace::SCRATCH}, +}}; + +void diagnoseUnknownMMRAASName(const MachineInstr &MI, StringRef AS) { + const MachineFunction *MF = MI.getMF(); + const Function &Fn = MF->getFunction(); + std::string Str; + raw_string_ostream OS(Str); + OS << "unknown address space '" << AS << "'; expected one of "; + ListSeparator LS; + for (const auto &[Name, Val] : ASNames) +OS << LS << '\'' << Name << '\''; + DiagnosticInfoUnsupported BadTag(Fn, Str, MI.getDebugLoc(), DS_Warning); + Fn.getContext().diagnose(BadTag); +} + +/// Reads \p MI's MMRAs to parse the "amdgpu-as" MMRA. +/// If this tag isn't present, or if it has no meaningful values, returns \p +/// Default. Otherwise returns all the address spaces concerned by the MMRA. +static SIAtomicAddrSpace getFenceAddrSpaceMMRA(const MachineInstr &MI, + SIAtomicAddrSpace Default) { + static constexpr StringLiteral FenceASPrefix = "amdgpu-as"; + + auto MMRA = MMRAMetadata(MI.getMMRAMetadata()); + if (!MMRA) +return Default; + + SIAtomicAddrSpace Result = SIAtomicAddrSpace::NONE; + for (const auto &[Prefix, Suffix] : MMRA) { +if (Prefix != FenceASPrefix) + continue; + +if (auto It = ASNames.find(Suffix); It != ASNames.end()) ssahasra wrote: Wow, I have never considered using a semicolon like this before. But it makes so much sense! :) https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
https://github.com/ssahasra edited https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -678,6 +680,54 @@ class SIMemoryLegalizer final : public MachineFunctionPass { bool runOnMachineFunction(MachineFunction &MF) override; }; +static std::array, 3> ASNames = {{ +{"global", SIAtomicAddrSpace::GLOBAL}, +{"local", SIAtomicAddrSpace::LDS}, +{"image", SIAtomicAddrSpace::SCRATCH}, +}}; + +void diagnoseUnknownMMRAASName(const MachineInstr &MI, StringRef AS) { + const MachineFunction *MF = MI.getMF(); + const Function &Fn = MF->getFunction(); + std::string Str; + raw_string_ostream OS(Str); + OS << "unknown address space '" << AS << "'; expected one of "; + ListSeparator LS; + for (const auto &[Name, Val] : ASNames) +OS << LS << '\'' << Name << '\''; + DiagnosticInfoUnsupported BadTag(Fn, Str, MI.getDebugLoc(), DS_Warning); + Fn.getContext().diagnose(BadTag); +} + +/// Reads \p MI's MMRAs to parse the "amdgpu-as" MMRA. +/// If this tag isn't present, or if it has no meaningful values, returns \p +/// Default. Otherwise returns all the address spaces concerned by the MMRA. +static SIAtomicAddrSpace getFenceAddrSpaceMMRA(const MachineInstr &MI, + SIAtomicAddrSpace Default) { + static constexpr StringLiteral FenceASPrefix = "amdgpu-as"; + + auto MMRA = MMRAMetadata(MI.getMMRAMetadata()); + if (!MMRA) +return Default; + + SIAtomicAddrSpace Result = SIAtomicAddrSpace::NONE; + for (const auto &[Prefix, Suffix] : MMRA) { +if (Prefix != FenceASPrefix) + continue; + +auto It = find_if(ASNames, [Suffix = Suffix](auto &Pair) { Pierre-vh wrote: Lambdas can't capture structured bindings (`auto [a, b] = pair`) in C++17, I think it comes in C++20 or 23? https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -678,6 +680,54 @@ class SIMemoryLegalizer final : public MachineFunctionPass { bool runOnMachineFunction(MachineFunction &MF) override; }; +static std::array, 3> ASNames = {{ +{"global", SIAtomicAddrSpace::GLOBAL}, +{"local", SIAtomicAddrSpace::LDS}, +{"image", SIAtomicAddrSpace::SCRATCH}, +}}; + +void diagnoseUnknownMMRAASName(const MachineInstr &MI, StringRef AS) { + const MachineFunction *MF = MI.getMF(); + const Function &Fn = MF->getFunction(); + std::string Str; + raw_string_ostream OS(Str); + OS << "unknown address space '" << AS << "'; expected one of "; + ListSeparator LS; + for (const auto &[Name, Val] : ASNames) +OS << LS << '\'' << Name << '\''; + DiagnosticInfoUnsupported BadTag(Fn, Str, MI.getDebugLoc(), DS_Warning); + Fn.getContext().diagnose(BadTag); +} + +/// Reads \p MI's MMRAs to parse the "amdgpu-as" MMRA. +/// If this tag isn't present, or if it has no meaningful values, returns \p +/// Default. Otherwise returns all the address spaces concerned by the MMRA. +static SIAtomicAddrSpace getFenceAddrSpaceMMRA(const MachineInstr &MI, + SIAtomicAddrSpace Default) { + static constexpr StringLiteral FenceASPrefix = "amdgpu-as"; + + auto MMRA = MMRAMetadata(MI.getMMRAMetadata()); + if (!MMRA) +return Default; + + SIAtomicAddrSpace Result = SIAtomicAddrSpace::NONE; + for (const auto &[Prefix, Suffix] : MMRA) { +if (Prefix != FenceASPrefix) + continue; + +auto It = find_if(ASNames, [Suffix = Suffix](auto &Pair) { ssahasra wrote: I didn't understand the default assignment of Suffix. Shouldn't the lambda capture it simply because the outer symbol is used inside? But anyway, this use of find_if shouldn't be necessary with StringMap. https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -678,6 +680,54 @@ class SIMemoryLegalizer final : public MachineFunctionPass { bool runOnMachineFunction(MachineFunction &MF) override; }; +static std::array, 3> ASNames = {{ ssahasra wrote: Use StringMap for this? https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -4408,6 +4409,42 @@ Target-Specific Extensions Clang supports some language features conditionally on some targets. +AMDGPU Language Extensions +-- + +__builtin_amdgcn_fence +^^ + +``__builtin_amdgcn_fence`` emits a fence. + +* ``unsigned`` atomic ordering, e.g. ``__ATOMIC_ACQUIRE`` +* ``const char *`` synchronization scope, e.g. ``workgroup`` +* Zero or more ``const char *`` address spaces names. + +The address spaces arguments must be string literals with known values, such as: + +* ``"local"`` +* ``"global"`` +* ``"image"`` + +If one or more address space name are provided, the code generator will attempt +to emit potentially faster instructions that only fence those address spaces. +Emitting such instructions may not always be possible and the compiler is free +to fence more aggressively. + +If no address spaces names are provided, all address spaces are fenced. + +.. code-block:: c++ + + // Fence all address spaces. + __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup"); + __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent"); + + // Fence only requested address spaces. + __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local") Pierre-vh wrote: This is just for internal use right now, so we can use anything we want. Device libs can then map the IMAGE flag to a "private" operand here. https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -4408,6 +4409,42 @@ Target-Specific Extensions Clang supports some language features conditionally on some targets. +AMDGPU Language Extensions +-- + +__builtin_amdgcn_fence +^^ + +``__builtin_amdgcn_fence`` emits a fence. + +* ``unsigned`` atomic ordering, e.g. ``__ATOMIC_ACQUIRE`` +* ``const char *`` synchronization scope, e.g. ``workgroup`` +* Zero or more ``const char *`` address spaces names. + +The address spaces arguments must be string literals with known values, such as: + +* ``"local"`` +* ``"global"`` +* ``"image"`` + +If one or more address space name are provided, the code generator will attempt +to emit potentially faster instructions that only fence those address spaces. +Emitting such instructions may not always be possible and the compiler is free +to fence more aggressively. + +If no address spaces names are provided, all address spaces are fenced. + +.. code-block:: c++ + + // Fence all address spaces. + __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup"); + __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent"); + + // Fence only requested address spaces. + __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local") arsenm wrote: Not sure we can get away without image https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -4408,6 +4409,42 @@ Target-Specific Extensions Clang supports some language features conditionally on some targets. +AMDGPU Language Extensions +-- + +__builtin_amdgcn_fence +^^ + +``__builtin_amdgcn_fence`` emits a fence. + +* ``unsigned`` atomic ordering, e.g. ``__ATOMIC_ACQUIRE`` +* ``const char *`` synchronization scope, e.g. ``workgroup`` +* Zero or more ``const char *`` address spaces names. + +The address spaces arguments must be string literals with known values, such as: + +* ``"local"`` +* ``"global"`` +* ``"image"`` + +If one or more address space name are provided, the code generator will attempt +to emit potentially faster instructions that only fence those address spaces. +Emitting such instructions may not always be possible and the compiler is free +to fence more aggressively. + +If no address spaces names are provided, all address spaces are fenced. + +.. code-block:: c++ + + // Fence all address spaces. + __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup"); + __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent"); + + // Fence only requested address spaces. + __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local") Pierre-vh wrote: I wasusing the OpenCL terminology: https://registry.khronos.org/OpenCL/sdk/3.0/docs/man/html/atomic_work_item_fence.html > flags must be set to CLK_GLOBAL_MEM_FENCE, CLK_LOCAL_MEM_FENCE, > CLK_IMAGE_MEM_FENCE Thinking about it, maybe it's better to use names consistent with our AMDGPUAS definitions, so image would just be "private" but local/global stay. What do you think? https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -18365,6 +18366,28 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID, return nullptr; } +void CodeGenFunction::AddAMDGCNFenceAddressSpaceMMRA(llvm::Instruction *Inst, + const CallExpr *E) { + constexpr const char *Tag = "amdgpu-as"; Pierre-vh wrote: I think it's just more readable this way, but I have no strong preference either. I'd say let's keep it this way, it can very easily be renamed later if we want to. https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -4408,6 +4409,42 @@ Target-Specific Extensions Clang supports some language features conditionally on some targets. +AMDGPU Language Extensions +-- + +__builtin_amdgcn_fence +^^ + +``__builtin_amdgcn_fence`` emits a fence. + +* ``unsigned`` atomic ordering, e.g. ``__ATOMIC_ACQUIRE`` +* ``const char *`` synchronization scope, e.g. ``workgroup`` +* Zero or more ``const char *`` address spaces names. + +The address spaces arguments must be string literals with known values, such as: + +* ``"local"`` +* ``"global"`` +* ``"image"`` + +If one or more address space name are provided, the code generator will attempt +to emit potentially faster instructions that only fence those address spaces. +Emitting such instructions may not always be possible and the compiler is free +to fence more aggressively. + +If no address spaces names are provided, all address spaces are fenced. + +.. code-block:: c++ + + // Fence all address spaces. + __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup"); + __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent"); + + // Fence only requested address spaces. + __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local") arsenm wrote: We randomly change between HSA and OpenCL terminology. Maybe we should call "local" "groupsegment"? I guess the ISA manuals call it "local data share" https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -1,22 +1,113 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 4 // REQUIRES: amdgpu-registered-target // RUN: %clang_cc1 %s -emit-llvm -O0 -o - \ -// RUN: -triple=amdgcn-amd-amdhsa | opt -S | FileCheck %s +// RUN: -triple=amdgcn-amd-amdhsa | FileCheck %s +// CHECK-LABEL: define dso_local void @_Z25test_memory_fence_successv( +// CHECK-SAME: ) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:fence syncscope("workgroup") seq_cst +// CHECK-NEXT:fence syncscope("agent") acquire +// CHECK-NEXT:fence seq_cst +// CHECK-NEXT:fence syncscope("agent") acq_rel +// CHECK-NEXT:fence syncscope("workgroup") release +// CHECK-NEXT:ret void +// void test_memory_fence_success() { - // CHECK-LABEL: test_memory_fence_success - // CHECK: fence syncscope("workgroup") seq_cst __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup"); - // CHECK: fence syncscope("agent") acquire __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent"); - // CHECK: fence seq_cst __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, ""); - // CHECK: fence syncscope("agent") acq_rel __builtin_amdgcn_fence(4, "agent"); - // CHECK: fence syncscope("workgroup") release __builtin_amdgcn_fence(3, "workgroup"); } + +// CHECK-LABEL: define dso_local void @_Z10test_localv( +// CHECK-SAME: ) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:fence syncscope("workgroup") seq_cst, !mmra [[META3:![0-9]+]] +// CHECK-NEXT:fence syncscope("agent") acquire, !mmra [[META3]] +// CHECK-NEXT:fence seq_cst, !mmra [[META3]] +// CHECK-NEXT:fence syncscope("agent") acq_rel, !mmra [[META3]] +// CHECK-NEXT:fence syncscope("workgroup") release, !mmra [[META3]] +// CHECK-NEXT:ret void +// +void test_local() { + __builtin_amdgcn_fence( __ATOMIC_SEQ_CST, "workgroup", "local"); + + __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent", "local"); + + __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "", "local"); + + __builtin_amdgcn_fence(4, "agent", "local"); + + __builtin_amdgcn_fence(3, "workgroup", "local"); +} + + +// CHECK-LABEL: define dso_local void @_Z11test_globalv( +// CHECK-SAME: ) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:fence syncscope("workgroup") seq_cst, !mmra [[META4:![0-9]+]] +// CHECK-NEXT:fence syncscope("agent") acquire, !mmra [[META4]] +// CHECK-NEXT:fence seq_cst, !mmra [[META4]] +// CHECK-NEXT:fence syncscope("agent") acq_rel, !mmra [[META4]] +// CHECK-NEXT:fence syncscope("workgroup") release, !mmra [[META4]] +// CHECK-NEXT:ret void +// +void test_global() { + __builtin_amdgcn_fence( __ATOMIC_SEQ_CST, "workgroup", "global"); + + __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent", "global"); + + __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "", "global"); + + __builtin_amdgcn_fence(4, "agent", "global"); + + __builtin_amdgcn_fence(3, "workgroup", "global"); +} + +// CHECK-LABEL: define dso_local void @_Z10test_imagev( +// CHECK-SAME: ) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:fence syncscope("workgroup") seq_cst, !mmra [[META5:![0-9]+]] +// CHECK-NEXT:fence syncscope("agent") acquire, !mmra [[META5]] +// CHECK-NEXT:fence seq_cst, !mmra [[META5]] +// CHECK-NEXT:fence syncscope("agent") acq_rel, !mmra [[META5]] +// CHECK-NEXT:fence syncscope("workgroup") release, !mmra [[META5]] +// CHECK-NEXT:ret void +// +void test_image() { + __builtin_amdgcn_fence( __ATOMIC_SEQ_CST, "workgroup", "image"); + + __builtin_amdgcn_fence(__ATOMIC_ACQUIRE, "agent", "image"); + + __builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "", "image"); + + __builtin_amdgcn_fence(4, "agent", "image"); + + __builtin_amdgcn_fence(3, "workgroup", "image"); +} + +// CHECK-LABEL: define dso_local void @_Z10test_mixedv( +// CHECK-SAME: ) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:fence syncscope("workgroup") seq_cst, !mmra [[META6:![0-9]+]] +// CHECK-NEXT:fence syncscope("workgroup") seq_cst, !mmra [[META7:![0-9]+]] +// CHECK-NEXT:ret void +// +void test_mixed() { + __builtin_amdgcn_fence( __ATOMIC_SEQ_CST, "workgroup", "image", "global"); + __builtin_amdgcn_fence( __ATOMIC_SEQ_CST, "workgroup", "image", "local", "global"); +} arsenm wrote: Maybe test repeated AS name https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -18365,6 +18366,28 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID, return nullptr; } +void CodeGenFunction::AddAMDGCNFenceAddressSpaceMMRA(llvm::Instruction *Inst, + const CallExpr *E) { + constexpr const char *Tag = "amdgpu-as"; ssahasra wrote: Just bikeshedding a bit, but do we really need the "amdgpu" prefix on the tag? Clang will only generate these for AMDGPU anyway. It's not a blocker, but feels like we are being cautious for no reason. https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -678,6 +679,59 @@ class SIMemoryLegalizer final : public MachineFunctionPass { bool runOnMachineFunction(MachineFunction &MF) override; }; +static std::array, 3> ASNames = {{ +{"global", SIAtomicAddrSpace::GLOBAL}, +{"local", SIAtomicAddrSpace::LDS}, +{"image", SIAtomicAddrSpace::SCRATCH}, +}}; + +void diagnoseUnknownMMRAASName(const MachineInstr &MI, StringRef AS) { + const MachineFunction *MF = MI.getMF(); + const Function &Fn = MF->getFunction(); + std::string Str; + raw_string_ostream OS(Str); + OS << "unknown address space '" << AS << "'; expected one of "; + bool IsFirst = true; ssahasra wrote: Use ListSeparator from StringExtras.h https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -18365,6 +18366,28 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID, return nullptr; } +void CodeGenFunction::AddAMDGCNFenceAddressSpaceMMRA(llvm::Instruction *Inst, ssahasra wrote: The function immediately below this uses "AMDGPU" in its name. I think that's the newer practice. I don't have a strong opinion on this, because the same file also has functions which say "AMDGCN" instead. I am not sure which way the naming convention is leaning. https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
https://github.com/ssahasra edited https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
@@ -4408,6 +4409,42 @@ Target-Specific Extensions Clang supports some language features conditionally on some targets. +AMDGPU Language Extensions +-- + +__builtin_amdgcn_fence +^^ + +``__builtin_amdgcn_fence`` emits a fence. + +* ``unsigned`` atomic ordering, e.g. ``__ATOMIC_ACQUIRE`` +* ``const char *`` synchronization scope, e.g. ``workgroup`` +* Zero or more ``const char *`` address spaces names. + +The address spaces arguments must be string literals with known values, such as: + +* ``"local"`` +* ``"global"`` +* ``"image"`` + +If one or more address space name are provided, the code generator will attempt +to emit potentially faster instructions that only fence those address spaces. ssahasra wrote: This use of "fence" as a verb seems a bit too informal. Reword it to say "instructions that order access to at least those address spaces"? (Note the addition of "at least" to signify a lower bound) https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
https://github.com/ssahasra approved this pull request. The frontend changes and the MMRA emitted in LLVM IR look good to me. The backend changes also look okay, but please see if anyone else has comments about that. https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
https://github.com/Pierre-vh edited https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)
https://github.com/Pierre-vh edited https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits