[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-17 Thread Pierre van Houtryve via cfe-commits
Pierre-vh wrote: @arsenm Should we use `image` or `private`? We could allow both in the frontend, and only use `private` as the canonical MMRA. https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org
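
The aliasing idea raised in the message above (accept both spellings in the frontend, emit only one canonical name) could look roughly like the following host-side sketch. It is an illustration only and not code from the PR: the table `kAddrSpaceAliases` and the function `canonicalAddrSpace` are hypothetical names, and the set of accepted spellings is assumed from the names mentioned in this thread.

```cpp
#include <array>
#include <cassert>
#include <optional>
#include <string_view>
#include <utility>

// Hypothetical alias table (an assumption for illustration): each accepted
// spelling is paired with the canonical name that would be emitted.
constexpr std::array<std::pair<std::string_view, std::string_view>, 4>
    kAddrSpaceAliases = {{
        {"global", "global"},
        {"local", "local"},
        {"private", "private"},
        {"image", "private"}, // alias: accepted, but canonicalized to "private"
    }};

// Returns the canonical name for an accepted spelling, or std::nullopt if the
// spelling is not in the table.
std::optional<std::string_view> canonicalAddrSpace(std::string_view name) {
  assert(!name.empty() && "address space name must not be empty");
  for (const auto &[spelling, canonical] : kAddrSpaceAliases)
    if (spelling == name)
      return canonical;
  return std::nullopt;
}
```

With this shape, a frontend could diagnose unknown names where `std::nullopt` is returned, while later stages only ever see the canonical spelling.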

[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-14 Thread Pierre van Houtryve via cfe-commits
@@ -678,6 +680,54 @@ class SIMemoryLegalizer final : public MachineFunctionPass { bool runOnMachineFunction(MachineFunction ) override; }; +static std::array, 3> ASNames = {{ +{"global", SIAtomicAddrSpace::GLOBAL}, +{"local", SIAtomicAddrSpace::LDS}, +{"image",

[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-13 Thread Pierre van Houtryve via cfe-commits
@@ -4408,6 +4409,42 @@ Target-Specific Extensions Clang supports some language features conditionally on some targets. +AMDGPU Language Extensions +-- + +__builtin_amdgcn_fence +^^ + +``__builtin_amdgcn_fence`` emits a fence. + +*

[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-13 Thread Pierre van Houtryve via cfe-commits
@@ -4408,6 +4409,42 @@ Target-Specific Extensions Clang supports some language features conditionally on some targets. +AMDGPU Language Extensions +-- + +__builtin_amdgcn_fence +^^ + +``__builtin_amdgcn_fence`` emits a fence. + +*

[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-13 Thread Pierre van Houtryve via cfe-commits
@@ -18365,6 +18366,28 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID, return nullptr; } +void CodeGenFunction::AddAMDGCNFenceAddressSpaceMMRA(llvm::Instruction *Inst, + const CallExpr *E) { + constexpr

[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-03 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh edited https://github.com/llvm/llvm-project/pull/78572 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Add amdgpu-as MMRA for fences (PR #78572)

2024-05-03 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh edited https://github.com/llvm/llvm-project/pull/78572

[clang] [llvm] [AMDGPU] Add OpenCL-specific fence address space masks (PR #78572)

2024-05-02 Thread Pierre van Houtryve via cfe-commits
Pierre-vh wrote: I changed it so it's one or more string arguments:
```
__builtin_amdgcn_masked_fence(__ATOMIC_SEQ_CST, "workgroup", "local", "global")
```
I'm now wondering if adding a new builtin is needed at all, or if it should just be part of the original builtin? It's an additive change.

[clang] [llvm] [AMDGPU] Add OpenCL-specific fence address space masks (PR #78572)

2024-04-26 Thread Pierre van Houtryve via cfe-commits
@@ -69,6 +69,7 @@ BUILTIN(__builtin_amdgcn_iglp_opt, "vIi", "n") BUILTIN(__builtin_amdgcn_s_dcache_inv, "v", "n") BUILTIN(__builtin_amdgcn_buffer_wbinvl1, "v", "n") BUILTIN(__builtin_amdgcn_fence, "vUicC*", "n") +BUILTIN(__builtin_amdgcn_masked_fence, "vUiUicC*", "n")

[clang] [llvm] [AMDGPU] Add OpenCL-specific fence address space masks (PR #78572)

2024-04-26 Thread Pierre van Houtryve via cfe-commits
@@ -18319,6 +18320,26 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID, return nullptr; } +void CodeGenFunction::AddAMDGCNAddressSpaceMMRA(llvm::Instruction *Inst, +llvm::Value *ASMask) { + constexpr const

[clang] [llvm] [AMDGPU] Add OpenCL-specific fence address space masks (PR #78572)

2024-04-24 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh edited https://github.com/llvm/llvm-project/pull/78572

[clang] [llvm] [AMDGPU] Add OpenCL-specific fence address space masks (PR #78572)

2024-04-24 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh edited https://github.com/llvm/llvm-project/pull/78572

[clang] [llvm] [RFC][AMDGPU] Add OpenCL-specific fence address space masks (PR #78572)

2024-04-24 Thread Pierre van Houtryve via cfe-commits
@@ -18319,6 +18320,26 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID, return nullptr; } +void CodeGenFunction::AddAMDGCNAddressSpaceMMRA(llvm::Instruction *Inst, +llvm::Value *ASMask) { + constexpr const

[clang] [clang][AMDGPU] Don't define feature macros on host code (PR #83558)

2024-03-03 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh closed https://github.com/llvm/llvm-project/pull/83558

[clang] [clang][AMDGPU] Don't define feature macros on host code (PR #83558)

2024-03-01 Thread Pierre van Houtryve via cfe-commits
Pierre-vh wrote:
> This was the original behavior of my patch, but I reverted it because it broke all the HIP headers that were unintentionally relying on this. Has that been resolved?
Was an issue opened for that? How many headers are affected?

[clang] [clang][AMDGPU] Don't define feature macros on host code (PR #83558)

2024-03-01 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/83558 >From 3730631ac58425f559f4bc3cfe3da89e6367c1c5 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Fri, 1 Mar 2024 12:43:55 +0100 Subject: [PATCH 1/2] [clang][AMDGPU] Don't define feature macros on host code Those

[clang] [clang][AMDGPU] Don't define feature macros on host code (PR #83558)

2024-03-01 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh created https://github.com/llvm/llvm-project/pull/83558 Those macros are unreliable because our features are mostly uninitialized at that stage, so any macro we define is unreliable. Fixes SWDEV-447308 >From 3730631ac58425f559f4bc3cfe3da89e6367c1c5 Mon Sep 17

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-01 Thread Pierre van Houtryve via cfe-commits
@@ -2326,6 +2326,20 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction , } #endif +if (ST->isPreciseMemoryEnabled()) { + AMDGPU::Waitcnt Wait; + if (WCG == ) +Wait = AMDGPU::Waitcnt(0, 0, 0, 0); Pierre-vh wrote: I was

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-01 Thread Pierre van Houtryve via cfe-commits
@@ -2326,6 +2326,20 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction , } #endif +if (ST->isPreciseMemoryEnabled()) { + AMDGPU::Waitcnt Wait; + if (WCG == ) Pierre-vh wrote: Use `ST->hasExtendedWaitCounts()` instead of

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-01 Thread Pierre van Houtryve via cfe-commits
@@ -2594,12 +2594,10 @@ bool SIMemoryLegalizer::expandAtomicCmpxchgOrRmw(const SIMemOpInfo , MOI.getOrdering() == AtomicOrdering::SequentiallyConsistent || MOI.getFailureOrdering() == AtomicOrdering::Acquire || MOI.getFailureOrdering() ==

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-26 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh approved this pull request. LGTM, but wait for @t-tye or @jayfoad to approve as well https://github.com/llvm/llvm-project/pull/79236

[clang] [llvm] [AMDGPU] Add an option to disable unsafe uses of atomic xor (PR #69229)

2024-02-20 Thread Pierre van Houtryve via cfe-commits
Pierre-vh wrote: > Thanks for the comments @arsenm @yxsamliu @b-sumner. > > By approaching a similar solution, do you mean MMRAs (#78569) ? > > If so, should I rebase/adapt my patch to the MMRA PR? Or will this PR be > redundant and needs closing? > > @yxsamliu These concise names look good

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-20 Thread Pierre van Houtryve via cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode", "Enable CU wavefront execution mode" >; +def FeaturePreciseMemory Pierre-vh wrote: Just remove `m_amdgpu_Features_Group` from your option's `SimpleMFlag`, follow the same pattern as

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-19 Thread Pierre van Houtryve via cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode", "Enable CU wavefront execution mode" >; +def FeaturePreciseMemory Pierre-vh wrote: It's only called once per run by the driver, yes. We already do this for wavefrontsize64, and pretty much

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-18 Thread Pierre van Houtryve via cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode", "Enable CU wavefront execution mode" >; +def FeaturePreciseMemory Pierre-vh wrote: The extra overhead is just 3 lines in `clang/lib/Driver/ToolChains/AMDGPU.cpp`, it's negligible. We

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-18 Thread Pierre van Houtryve via cfe-commits
@@ -355,6 +356,18 @@ class SICacheControl { MachineBasicBlock::iterator ) const { return false; } + +public: + // The following is for supporting precise memory mode. When the option + // amdgpu-precise-memory is enabled, an s_waitcnt

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-18 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh edited https://github.com/llvm/llvm-project/pull/79236

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-18 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh requested changes to this pull request. Did you try to move this to SIInsertWaitCnt, as suggested? https://github.com/llvm/llvm-project/pull/79236

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Pierre van Houtryve via cfe-commits
@@ -2378,6 +2456,221 @@ bool SIGfx12CacheControl::enableVolatileAndOrNonTemporal( return Changed; } +bool SIGfx6CacheControl ::handleNonAtomicForPreciseMemory( +MachineBasicBlock::iterator ) { + assert(MI->mayLoadOrStore()); + + MachineInstr = *MI; +

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Pierre van Houtryve via cfe-commits
@@ -2378,6 +2456,221 @@ bool SIGfx12CacheControl::enableVolatileAndOrNonTemporal( return Changed; } +bool SIGfx6CacheControl ::handleNonAtomicForPreciseMemory( +MachineBasicBlock::iterator ) { + assert(MI->mayLoadOrStore()); + + MachineInstr = *MI; +

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Pierre van Houtryve via cfe-commits
@@ -603,14 +626,69 @@ class SIGfx12CacheControl : public SIGfx11CacheControl { SIAtomicAddrSpace AddrSpace, SIMemOp Op, bool IsVolatile, bool IsNonTemporal) const

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Pierre van Houtryve via cfe-commits
@@ -603,14 +626,69 @@ class SIGfx12CacheControl : public SIGfx11CacheControl { SIAtomicAddrSpace AddrSpace, SIMemOp Op, bool IsVolatile, bool IsNonTemporal) const

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Pierre van Houtryve via cfe-commits
@@ -603,14 +626,69 @@ class SIGfx12CacheControl : public SIGfx11CacheControl { SIAtomicAddrSpace AddrSpace, SIMemOp Op, bool IsVolatile, bool IsNonTemporal) const

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Pierre van Houtryve via cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode", "Enable CU wavefront execution mode" >; +def FeaturePreciseMemory Pierre-vh wrote: I think you just need to add something like this in `AMDGPU.cpp` in `getAMDGPUTargetFeatures` ``` if

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh commented: I also agree with Jay, can't this go in InsertWaitCnt? Why does it have to go in SIMemoryLegalizer instead? If it has to stay here, fine, but is it possible to merge some code with SIInsertWaitCnt in a common helper somewhere?

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh edited https://github.com/llvm/llvm-project/pull/79236

[clang] [llvm] [AMDGPU] Replace '.' with '-' in generic target names (PR #81718)

2024-02-14 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh closed https://github.com/llvm/llvm-project/pull/81718

[clang] [Clang][CodeGen] Loose the cast check when emitting builtins (PR #81669)

2024-02-14 Thread Pierre van Houtryve via cfe-commits
Pierre-vh wrote: > > Sorry, I should have clearly mentioned that. Yes, it is for my followup > > change #80908. In #80908, we changed the type of LLVM builtin but kept the > > corresponding clang builtin unchanged to avoid breaking existing uses. > > Don't see how that could be related; you

[clang] [llvm] [AMDGPU] Replace '.' with '-' in generic target names (PR #81718)

2024-02-14 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh created https://github.com/llvm/llvm-project/pull/81718 The dot is too confusing for tools. Output temporaries would have '10.3-generic' so tools could parse it as an extension, device libs & the associated clang driver logic are also confused by the dot. After
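
The extension-parsing problem described in the message above can be reproduced with a small host-side sketch. This is an illustration under stated assumptions, not code from the PR: the helper names `sanitizeTargetName` and `apparentExtension` are hypothetical, and the file names used with them are made-up examples of an output temporary.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical helper: replace every '.' in a target name with '-'.
std::string sanitizeTargetName(std::string name) {
  assert(!name.empty() && "target name must not be empty");
  std::replace(name.begin(), name.end(), '.', '-');
  return name;
}

// What a generic path-handling tool would treat as the file extension:
// everything from the last '.' in the file name onward.
std::string apparentExtension(const std::string &fileName) {
  return std::filesystem::path(fileName).extension().string();
}
```

For a made-up temporary named `kernel-gfx10.3-generic`, `apparentExtension` reports `.3-generic`, which is the kind of misparse the message describes; after `sanitizeTargetName`, the same name contains no dot and the reported extension is empty.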

[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-12 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh closed https://github.com/llvm/llvm-project/pull/76955

[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-09 Thread Pierre van Houtryve via cfe-commits
Pierre-vh wrote:
> mad_mix
I added run lines to `mad-mix.ll` and it behaves as expected: no fma/mad_mix emitted https://github.com/llvm/llvm-project/pull/76955

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,698 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,698 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,701 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh commented: My comments are mostly about style, I haven't done a deep dive into the logic of the pass yet https://github.com/llvm/llvm-project/pull/81058

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-08 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh edited https://github.com/llvm/llvm-project/pull/81058

[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-07 Thread Pierre van Houtryve via cfe-commits
Pierre-vh wrote: @t-tye Can you please approve then? Otherwise the diff still shows a red "Changes requested" warning :) Thanks @arsenm Please also approve if there are no more comments https://github.com/llvm/llvm-project/pull/76955

[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-07 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76955 >From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Thu, 4 Jan 2024 14:48:05 +0100 Subject: [PATCH 1/6] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets These

[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-07 Thread Pierre van Houtryve via cfe-commits
@@ -520,6 +520,102 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following === === = = === === == +Generic processors allow execution of a

[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-07 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76955 >From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Thu, 4 Jan 2024 14:48:05 +0100 Subject: [PATCH 1/5] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets These

[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-07 Thread Pierre van Houtryve via cfe-commits
Pierre-vh wrote: For the MD changes, it's just to describe the version increment, nothing else. I think describing it is important as the V6 diff already updated the amdhsa.version. If amdhsa.version didn't need to change then I need to fix that first, and then we can remove the V6 MD section

[llvm] [clang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76955 >From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Thu, 4 Jan 2024 14:48:05 +0100 Subject: [PATCH 1/4] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets These

[llvm] [clang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76955 >From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Thu, 4 Jan 2024 14:48:05 +0100 Subject: [PATCH 1/3] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets These

[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76955 >From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Thu, 4 Jan 2024 14:48:05 +0100 Subject: [PATCH 1/2] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets These

[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-06 Thread Pierre van Houtryve via cfe-commits
Pierre-vh wrote: @arsenm do you have any concerns with this change? @t-tye is the documentation good? https://github.com/llvm/llvm-project/pull/76955

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-05 Thread Pierre van Houtryve via cfe-commits
@@ -605,12 +606,197 @@ class SIGfx12CacheControl : public SIGfx11CacheControl { bool IsNonTemporal) const override; }; +class SIPreciseMemorySupport { +protected: + const GCNSubtarget + const SIInstrInfo *TII = nullptr; + + IsaVersion

[llvm] [clang] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-05 Thread Pierre van Houtryve via cfe-commits
@@ -605,12 +606,197 @@ class SIGfx12CacheControl : public SIGfx11CacheControl { bool IsNonTemporal) const override; }; +class SIPreciseMemorySupport { +protected: + const GCNSubtarget + const SIInstrInfo *TII = nullptr; + + IsaVersion

[llvm] [clang] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-05 Thread Pierre van Houtryve via cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode", "Enable CU wavefront execution mode" >; +def FeaturePreciseMemory Pierre-vh wrote: Understood :) Can you remove the `amdgpu` prefix from the option? All target features are already

[llvm] [clang] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-05 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,362 @@ +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+amdgpu-precise-memory-op < %s | FileCheck %s -check-prefixes=GFX9 +; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+amdgpu-precise-memory-op < %s | FileCheck %s -check-prefixes=GFX90A +; RUN: llc -mtriple=amdgcn

[llvm] [clang] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-05 Thread Pierre van Houtryve via cfe-commits
@@ -605,12 +606,197 @@ class SIGfx12CacheControl : public SIGfx11CacheControl { bool IsNonTemporal) const override; }; +class SIPreciseMemorySupport { Pierre-vh wrote: Why does it need to be a separate class hierarchy?

[llvm] [clang] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-05 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh requested changes to this pull request. Once you have made changes, you can click the "Re-request review" icon next to the reviewers to put it back in the review queue :) https://github.com/llvm/llvm-project/pull/79236

[llvm] [clang] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-05 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh edited https://github.com/llvm/llvm-project/pull/79236

[llvm] [clang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-04 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76955 >From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Thu, 4 Jan 2024 14:48:05 +0100 Subject: [PATCH] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets These generic

[flang] [clang] [llvm] [lld] [AMDGPU] Introduce Code Object V6 (PR #76954)

2024-02-04 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh closed https://github.com/llvm/llvm-project/pull/76954

[clang] [lld] [llvm] [flang] [AMDGPU] Introduce Code Object V6 (PR #76954)

2024-02-02 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76954 >From a967fdae9a8557331d2a228f391f39f9e27e8943 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Thu, 4 Jan 2024 14:12:00 +0100 Subject: [PATCH 1/5] [AMDGPU] Introduce Code Object V6 Introduce Code Object V6 in

[llvm] [clang] [lld] [flang] [AMDGPU] Introduce Code Object V6 (PR #76954)

2024-02-02 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76954 >From a967fdae9a8557331d2a228f391f39f9e27e8943 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Thu, 4 Jan 2024 14:12:00 +0100 Subject: [PATCH 1/4] [AMDGPU] Introduce Code Object V6 Introduce Code Object V6 in

[llvm] [flang] [clang] [lld] [AMDGPU] Introduce Code Object V6 (PR #76954)

2024-02-01 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76954 >From a967fdae9a8557331d2a228f391f39f9e27e8943 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Thu, 4 Jan 2024 14:12:00 +0100 Subject: [PATCH 1/3] [AMDGPU] Introduce Code Object V6 Introduce Code Object V6 in

[clang] [flang] [llvm] [lld] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-01 Thread Pierre van Houtryve via cfe-commits
@@ -139,10 +139,10 @@ bool AMDGPURemoveIncompatibleFunctions::checkFunction(Function ) { const GCNSubtarget *ST = static_cast(TM->getSubtargetImpl(F)); - // Check the GPU isn't generic. Generic is used for testing only - // and we don't want this pass to interfere

[clang] [lld] [llvm] [flang] [AMDGPU] Introduce Code Object V6 (PR #76954)

2024-02-01 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76954 >From 7ad88453f5e89fd4643afa486e52a123138433f4 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Thu, 4 Jan 2024 14:12:00 +0100 Subject: [PATCH 1/2] [AMDGPU] Introduce Code Object V6 Introduce Code Object V6 in

[lld] [flang] [llvm] [clang] [AMDGPU] Introduce Code Object V6 (PR #76954)

2024-01-30 Thread Pierre van Houtryve via cfe-commits
@@ -44,8 +44,15 @@ constexpr uint32_t VersionMajorV5 = 1; /// HSA metadata minor version for code object V5. constexpr uint32_t VersionMinorV5 = 2; +/// HSA metadata major version for code object V6. +constexpr uint32_t VersionMajorV6 = 1; +/// HSA metadata minor version for

[lld] [flang] [clang] [llvm] [AMDGPU] Introduce Code Object V6 (PR #76954)

2024-01-28 Thread Pierre van Houtryve via cfe-commits
@@ -620,6 +620,15 @@ void ScalarBitSetTraits::bitset(IO , BCase(EF_AMDGPU_FEATURE_XNACK_V3); BCase(EF_AMDGPU_FEATURE_SRAMECC_V3); break; +case ELF::ELFABIVERSION_AMDGPU_HSA_V6: Pierre-vh wrote: elf-headers.test already covers it

[lld] [flang] [clang] [llvm] [AMDGPU] Introduce Code Object V6 (PR #76954)

2024-01-28 Thread Pierre van Houtryve via cfe-commits
@@ -44,8 +44,15 @@ constexpr uint32_t VersionMajorV5 = 1; /// HSA metadata minor version for code object V5. constexpr uint32_t VersionMinorV5 = 2; +/// HSA metadata major version for code object V6. +constexpr uint32_t VersionMajorV6 = 1; +/// HSA metadata minor version for

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-01-28 Thread Pierre van Houtryve via cfe-commits
@@ -17,13 +17,16 @@ #include "AMDGPUMachineModuleInfo.h" #include "GCNSubtarget.h" #include "MCTargetDesc/AMDGPUMCTargetDesc.h" +#include "Utils/AMDGPUBaseInfo.h" #include "llvm/ADT/BitmaskEnum.h" #include "llvm/CodeGen/MachineBasicBlock.h" #include

[llvm] [clang] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-01-28 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,199 @@ +; Testing the -amdgpu-precise-memory-op option Pierre-vh wrote: Please generate the test using `update_llc_test_checks`, much easier to update if/when things change. Also I think you don't need `-verify-machineinstrs`. It's expensive and

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-01-28 Thread Pierre van Houtryve via cfe-commits
@@ -2561,6 +2567,70 @@ bool SIMemoryLegalizer::expandAtomicCmpxchgOrRmw(const SIMemOpInfo , return Changed; } +bool SIMemoryLegalizer::GFX9InsertWaitcntForPreciseMem(MachineFunction ) { + const GCNSubtarget = MF.getSubtarget(); + const SIInstrInfo *TII =

[llvm] [clang] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-01-28 Thread Pierre van Houtryve via cfe-commits
@@ -0,0 +1,199 @@ +; Testing the -amdgpu-precise-memory-op option +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+amdgpu-precise-memory-op -verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX9 +; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+amdgpu-precise-memory-op

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-01-28 Thread Pierre van Houtryve via cfe-commits
@@ -641,6 +644,9 @@ class SIMemoryLegalizer final : public MachineFunctionPass { bool expandAtomicCmpxchgOrRmw(const SIMemOpInfo , MachineBasicBlock::iterator ); + bool GFX9InsertWaitcntForPreciseMem(MachineFunction );

[llvm] [clang] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-01-28 Thread Pierre van Houtryve via cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode", "Enable CU wavefront execution mode" >; +def FeaturePreciseMemory Pierre-vh wrote: I'm not a fan of using a feature for this, I think we should have a backend CL option instead. After all

[flang] [clang] [lld] [llvm] [AMDGPU] Introduce Code Object V6 (PR #76954)

2024-01-23 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76954 From b5a034bd71d6925ac287a9bf4c0f86f9e70bb9d1 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Thu, 4 Jan 2024 14:12:00 +0100 Subject: [PATCH] [AMDGPU] Introduce Code Object V6 Introduce Code Object V6 in

[lld] [llvm] [clang] [flang] [AMDGPU] Introduce Code Object V6 (PR #76954)

2024-01-22 Thread Pierre van Houtryve via cfe-commits
Pierre-vh wrote: ping https://github.com/llvm/llvm-project/pull/76954

[clang] [llvm] [lld] [flang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-18 Thread Pierre van Houtryve via cfe-commits
Pierre-vh wrote: I added a few more tests, I just didn't find how to test the flat-scratch stuff properly. Also, gfx904 is documented as not having absolute flat scratch, yet I don't see anything about that in the code (no related feature). I put gfx9-generic with flat scratch but I don't

[clang] [flang] [llvm] [lld] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-18 Thread Pierre van Houtryve via cfe-commits
@@ -4135,6 +4283,33 @@ Code object V5 metadata is the same as == == = +.. _amdgpu-amdhsa-code-object-metadata-v6: + +Code Object V6 Metadata + +.. warning:: + Code object

[clang] [lld] [llvm] [flang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-18 Thread Pierre van Houtryve via cfe-commits
@@ -520,6 +520,106 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following === === = = === === == +Generic processors also exist. They group

[clang] [lld] [llvm] [flang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-18 Thread Pierre van Houtryve via cfe-commits
@@ -253,6 +274,12 @@ AMDGPU::IsaVersion AMDGPU::getIsaVersion(StringRef GPU) { case GK_GFX1151: return {11, 5, 1}; case GK_GFX1200: return {12, 0, 0}; case GK_GFX1201: return {12, 0, 1}; + + // Generic targets use the earliest ISA version in their group.

[clang] [flang] [lld] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-18 Thread Pierre van Houtryve via cfe-commits
@@ -787,11 +788,15 @@ enum : unsigned { EF_AMDGPU_MACH_AMDGCN_GFX942= 0x04c, EF_AMDGPU_MACH_AMDGCN_RESERVED_0X4D = 0x04d, EF_AMDGPU_MACH_AMDGCN_GFX1201 = 0x04e, + EF_AMDGPU_MACH_AMDGCN_GFX9_GENERIC = 0x04f, + EF_AMDGPU_MACH_AMDGCN_GFX10_1_GENERIC =

[clang] [flang] [lld] [llvm] [AMDGPU] Introduce Code Object V6 (PR #76954)

2024-01-18 Thread Pierre van Houtryve via cfe-commits
https://github.com/Pierre-vh updated https://github.com/llvm/llvm-project/pull/76954 From 47d4f3ed4e27f2ce2b3b33c9b0ca4838b3011f22 Mon Sep 17 00:00:00 2001 From: pvanhout Date: Thu, 4 Jan 2024 14:12:00 +0100 Subject: [PATCH] [AMDGPU] Introduce Code Object V6 Introduce Code Object V6 in
