Pierre-vh wrote:
@arsenm Should we use `image` or `private`?
We could allow both in the frontend, and only use `private` as the canonical
MMRA.
https://github.com/llvm/llvm-project/pull/78572
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
@@ -678,6 +680,54 @@ class SIMemoryLegalizer final : public MachineFunctionPass
{
bool runOnMachineFunction(MachineFunction ) override;
};
+static std::array, 3> ASNames = {{
+{"global", SIAtomicAddrSpace::GLOBAL},
+{"local", SIAtomicAddrSpace::LDS},
+{"image",
@@ -4408,6 +4409,42 @@ Target-Specific Extensions
Clang supports some language features conditionally on some targets.
+AMDGPU Language Extensions
+--
+
+__builtin_amdgcn_fence
+^^
+
+``__builtin_amdgcn_fence`` emits a fence.
+
+*
@@ -18365,6 +18366,28 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned
BuiltinID,
return nullptr;
}
+void CodeGenFunction::AddAMDGCNFenceAddressSpaceMMRA(llvm::Instruction *Inst,
+ const CallExpr *E) {
+ constexpr
https://github.com/Pierre-vh edited
https://github.com/llvm/llvm-project/pull/78572
Pierre-vh wrote:
I changed it so it's one or more string arguments:
```
__builtin_amdgcn_masked_fence(__ATOMIC_SEQ_CST, "workgroup", "local", "global")
```
I'm now wondering if adding a new builtin is needed at all, or if it should
just be part of the original builtin? It's an additive change.
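To make the string-argument form concrete, here is a minimal sketch of how a list of address-space names could be folded into a single mask. It is not the code from the PR: the `AddrSpace` enum, its bit values, `kAddrSpaceNames` and `parseAddrSpaceMask` are all hypothetical names chosen for illustration, and the sync-scope argument (`"workgroup"` in the call above) is deliberately left out.

```cpp
#include <array>
#include <cstdint>
#include <initializer_list>
#include <optional>
#include <string_view>
#include <utility>

// Hypothetical bit values for illustration only; the real SIAtomicAddrSpace
// enumerators in the backend may be numbered differently.
enum class AddrSpace : std::uint32_t {
  Global = 1u << 0,
  Local = 1u << 1,
};

// Name table loosely modelled on a name-to-address-space lookup (assumed shape).
constexpr std::array<std::pair<std::string_view, AddrSpace>, 2>
    kAddrSpaceNames = {{
        {"global", AddrSpace::Global},
        {"local", AddrSpace::Local},
    }};

// Folds a list of address-space names into one bitmask.
// Returns std::nullopt if any name is not in the table.
inline std::optional<std::uint32_t>
parseAddrSpaceMask(std::initializer_list<std::string_view> Names) {
  std::uint32_t Mask = 0;
  for (std::string_view Name : Names) {
    bool Found = false;
    for (const auto &[Known, AS] : kAddrSpaceNames) {
      if (Known == Name) {
        Mask |= static_cast<std::uint32_t>(AS);
        Found = true;
        break;
      }
    }
    if (!Found)
      return std::nullopt; // Unknown name: let the caller diagnose it.
  }
  return Mask;
}
```

With this sketch, `parseAddrSpaceMask({"local", "global"})` yields both bits set, while an unrecognized name yields an empty optional so the caller can emit a diagnostic rather than silently dropping it.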
@@ -69,6 +69,7 @@ BUILTIN(__builtin_amdgcn_iglp_opt, "vIi", "n")
BUILTIN(__builtin_amdgcn_s_dcache_inv, "v", "n")
BUILTIN(__builtin_amdgcn_buffer_wbinvl1, "v", "n")
BUILTIN(__builtin_amdgcn_fence, "vUicC*", "n")
+BUILTIN(__builtin_amdgcn_masked_fence, "vUiUicC*", "n")
@@ -18319,6 +18320,26 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned
BuiltinID,
return nullptr;
}
+void CodeGenFunction::AddAMDGCNAddressSpaceMMRA(llvm::Instruction *Inst,
+llvm::Value *ASMask) {
+ constexpr const
https://github.com/Pierre-vh edited
https://github.com/llvm/llvm-project/pull/78572
https://github.com/Pierre-vh closed
https://github.com/llvm/llvm-project/pull/83558
Pierre-vh wrote:
> This was the original behavior of my patch, but I reverted it because it
> broke all the HIP headers that were unintentionally relying on this. Has that
> been resolved?
Was an issue opened for that? How many headers are affected?
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/83558
>From 3730631ac58425f559f4bc3cfe3da89e6367c1c5 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Fri, 1 Mar 2024 12:43:55 +0100
Subject: [PATCH 1/2] [clang][AMDGPU] Don't define feature macros on host code
Those
https://github.com/Pierre-vh created
https://github.com/llvm/llvm-project/pull/83558
Those macros are unreliable because our features are mostly uninitialized at
that stage.
Fixes SWDEV-447308
>From 3730631ac58425f559f4bc3cfe3da89e6367c1c5 Mon Sep 17
@@ -2326,6 +2326,20 @@ bool
SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction ,
}
#endif
+if (ST->isPreciseMemoryEnabled()) {
+ AMDGPU::Waitcnt Wait;
+ if (WCG == )
+Wait = AMDGPU::Waitcnt(0, 0, 0, 0);
Pierre-vh wrote:
I was
@@ -2326,6 +2326,20 @@ bool
SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction ,
}
#endif
+if (ST->isPreciseMemoryEnabled()) {
+ AMDGPU::Waitcnt Wait;
+ if (WCG == )
Pierre-vh wrote:
Use `ST->hasExtendedWaitCounts()` instead of
@@ -2594,12 +2594,10 @@ bool SIMemoryLegalizer::expandAtomicCmpxchgOrRmw(const
SIMemOpInfo ,
MOI.getOrdering() == AtomicOrdering::SequentiallyConsistent ||
MOI.getFailureOrdering() == AtomicOrdering::Acquire ||
MOI.getFailureOrdering() ==
https://github.com/Pierre-vh approved this pull request.
LGTM, but wait for @t-tye or @jayfoad to approve as well
https://github.com/llvm/llvm-project/pull/79236
Pierre-vh wrote:
> Thanks for the comments @arsenm @yxsamliu @b-sumner.
>
> By approaching a similar solution, do you mean MMRAs (#78569) ?
>
> If so, should I rebase/adapt my patch to the MMRA PR? Or will this PR be
> redundant and needs closing?
>
> @yxsamliu These concise names look good
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode",
"Enable CU wavefront execution mode"
>;
+def FeaturePreciseMemory
Pierre-vh wrote:
Just remove `m_amdgpu_Features_Group` from your option's `SimpleMFlag`, follow
the same pattern as
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode",
"Enable CU wavefront execution mode"
>;
+def FeaturePreciseMemory
Pierre-vh wrote:
It's only called once per run by the driver, yes
We already do this for wavefrontsize64, and pretty much
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode",
"Enable CU wavefront execution mode"
>;
+def FeaturePreciseMemory
Pierre-vh wrote:
The extra overhead is just 3 lines in `clang/lib/Driver/ToolChains/AMDGPU.cpp`,
it's negligible.
We
@@ -355,6 +356,18 @@ class SICacheControl {
MachineBasicBlock::iterator ) const {
return false;
}
+
+public:
+ // The following is for supporting precise memory mode. When the option
+ // amdgpu-precise-memory is enabled, an s_waitcnt
https://github.com/Pierre-vh edited
https://github.com/llvm/llvm-project/pull/79236
https://github.com/Pierre-vh requested changes to this pull request.
Did you try to move this to SIInsertWaitCnt, as suggested?
https://github.com/llvm/llvm-project/pull/79236
@@ -2378,6 +2456,221 @@ bool
SIGfx12CacheControl::enableVolatileAndOrNonTemporal(
return Changed;
}
+bool SIGfx6CacheControl ::handleNonAtomicForPreciseMemory(
+MachineBasicBlock::iterator ) {
+ assert(MI->mayLoadOrStore());
+
+ MachineInstr = *MI;
+
@@ -603,14 +626,69 @@ class SIGfx12CacheControl : public SIGfx11CacheControl {
SIAtomicAddrSpace AddrSpace, SIMemOp Op,
bool IsVolatile,
bool IsNonTemporal) const
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode",
"Enable CU wavefront execution mode"
>;
+def FeaturePreciseMemory
Pierre-vh wrote:
I think you just need to add something like this in `AMDGPU.cpp` in
`getAMDGPUTargetFeatures`
```
if
https://github.com/Pierre-vh commented:
I also agree with Jay, can't this go in InsertWaitCnt? Why does it have to go
in SIMemoryLegalizer instead?
If it has to stay here, fine, but is it possible to merge some code with
SIInsertWaitCnt in a common helper somewhere?
https://github.com/Pierre-vh edited
https://github.com/llvm/llvm-project/pull/79236
https://github.com/Pierre-vh closed
https://github.com/llvm/llvm-project/pull/81718
Pierre-vh wrote:
> > Sorry, I should have clearly mentioned that. Yes, it is for my followup
> > change #80908. In #80908, we changed the type of LLVM builtin but kept the
> > corresponding clang builtin unchanged to avoid breaking existing uses.
>
> Don't see how that could be related; you
https://github.com/Pierre-vh created
https://github.com/llvm/llvm-project/pull/81718
The dot is too confusing for tools. Output temporaries would have
'10.3-generic' in their names, so tools could parse it as an extension. Device
libs and the associated clang driver logic are also confused by the dot.
After
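The extension problem described above is easy to reproduce with generic path handling. The sketch below uses `std::filesystem::path::extension()`, which returns everything from the last dot in the file name. The file names and the `naiveExtension` helper are made up for illustration, and the dash-separated spelling is only an assumed alternative, not necessarily the name the PR settled on.

```cpp
#include <filesystem>
#include <string>

// Returns what a generic tool would treat as the extension of `Name`,
// i.e. the part of the file name starting at its last dot (or an empty
// string if the name contains no dot).
inline std::string naiveExtension(const std::string &Name) {
  return std::filesystem::path(Name).extension().string();
}
```

For a hypothetical temporary named `kernel-gfx10.3-generic`, this returns `.3-generic`, so the tail of the target name is mistaken for a file extension. A dash-separated spelling such as `kernel-gfx10-3-generic` has no dot and yields an empty extension.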
https://github.com/Pierre-vh closed
https://github.com/llvm/llvm-project/pull/76955
Pierre-vh wrote:
> mad_mix
I added run lines to `mad-mix.ll` and it behaves as expected: no fma/mad_mix
emitted
https://github.com/llvm/llvm-project/pull/76955
@@ -0,0 +1,698 @@
+//===-- ExpandVariadicsPass.cpp *- C++ -*-=//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier:
@@ -0,0 +1,701 @@
+//===-- ExpandVariadicsPass.cpp *- C++ -*-=//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier:
https://github.com/Pierre-vh commented:
My comments are mostly about style, I haven't done a deep dive into the logic
of the pass yet
https://github.com/llvm/llvm-project/pull/81058
https://github.com/Pierre-vh edited
https://github.com/llvm/llvm-project/pull/81058
Pierre-vh wrote:
@t-tye Can you please approve then? Otherwise the diff still shows a red
"Changes requested" warning :) Thanks
@arsenm Please also approve if there are no more comments
https://github.com/llvm/llvm-project/pull/76955
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/76955
>From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:48:05 +0100
Subject: [PATCH 1/6] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets
These
@@ -520,6 +520,102 @@ Every processor supports every OS ABI (see
:ref:`amdgpu-os`) with the following
=== === = =
=== === ==
+Generic processors allow execution of a
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/76955
>From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:48:05 +0100
Subject: [PATCH 1/5] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets
These
Pierre-vh wrote:
For the MD changes, it's just to describe the version increment, nothing else.
I think describing it is important, as the V6 diff already updated
amdhsa.version.
If amdhsa.version didn't need to change, then I need to fix that first, and then
we can remove the V6 MD section.
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/76955
>From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:48:05 +0100
Subject: [PATCH 1/4] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets
These
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/76955
>From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:48:05 +0100
Subject: [PATCH 1/3] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets
These
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/76955
>From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:48:05 +0100
Subject: [PATCH 1/2] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets
These
Pierre-vh wrote:
@arsenm do you have any concerns with this change?
@t-tye is the documentation good?
https://github.com/llvm/llvm-project/pull/76955
@@ -605,12 +606,197 @@ class SIGfx12CacheControl : public SIGfx11CacheControl {
bool IsNonTemporal) const override;
};
+class SIPreciseMemorySupport {
+protected:
+ const GCNSubtarget
+ const SIInstrInfo *TII = nullptr;
+
+ IsaVersion
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode",
"Enable CU wavefront execution mode"
>;
+def FeaturePreciseMemory
Pierre-vh wrote:
Understood :)
Can you remove the `amdgpu` prefix from the option? All target features are
already
@@ -0,0 +1,362 @@
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+amdgpu-precise-memory-op < %s
| FileCheck %s -check-prefixes=GFX9
+; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+amdgpu-precise-memory-op < %s
| FileCheck %s -check-prefixes=GFX90A
+; RUN: llc -mtriple=amdgcn
@@ -605,12 +606,197 @@ class SIGfx12CacheControl : public SIGfx11CacheControl {
bool IsNonTemporal) const override;
};
+class SIPreciseMemorySupport {
Pierre-vh wrote:
Why does it need to be a separate class hierarchy?
https://github.com/Pierre-vh requested changes to this pull request.
Once you have made changes, you can click the "Re-request review" icon next to
the reviewers to put the PR back in the review queue :)
https://github.com/llvm/llvm-project/pull/79236
https://github.com/Pierre-vh edited
https://github.com/llvm/llvm-project/pull/79236
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/76955
>From 616dda8bc9e000e4243ddb8f6b7f4b04f956a620 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:48:05 +0100
Subject: [PATCH] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets
These generic
https://github.com/Pierre-vh closed
https://github.com/llvm/llvm-project/pull/76954
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/76954
>From a967fdae9a8557331d2a228f391f39f9e27e8943 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:12:00 +0100
Subject: [PATCH 1/5] [AMDGPU] Introduce Code Object V6
Introduce Code Object V6 in
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/76954
>From a967fdae9a8557331d2a228f391f39f9e27e8943 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:12:00 +0100
Subject: [PATCH 1/4] [AMDGPU] Introduce Code Object V6
Introduce Code Object V6 in
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/76954
>From a967fdae9a8557331d2a228f391f39f9e27e8943 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:12:00 +0100
Subject: [PATCH 1/3] [AMDGPU] Introduce Code Object V6
Introduce Code Object V6 in
@@ -139,10 +139,10 @@ bool
AMDGPURemoveIncompatibleFunctions::checkFunction(Function ) {
const GCNSubtarget *ST =
static_cast(TM->getSubtargetImpl(F));
- // Check the GPU isn't generic. Generic is used for testing only
- // and we don't want this pass to interfere
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/76954
>From 7ad88453f5e89fd4643afa486e52a123138433f4 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:12:00 +0100
Subject: [PATCH 1/2] [AMDGPU] Introduce Code Object V6
Introduce Code Object V6 in
@@ -44,8 +44,15 @@ constexpr uint32_t VersionMajorV5 = 1;
/// HSA metadata minor version for code object V5.
constexpr uint32_t VersionMinorV5 = 2;
+/// HSA metadata major version for code object V6.
+constexpr uint32_t VersionMajorV6 = 1;
+/// HSA metadata minor version for
@@ -620,6 +620,15 @@ void ScalarBitSetTraits::bitset(IO ,
BCase(EF_AMDGPU_FEATURE_XNACK_V3);
BCase(EF_AMDGPU_FEATURE_SRAMECC_V3);
break;
+case ELF::ELFABIVERSION_AMDGPU_HSA_V6:
Pierre-vh wrote:
elf-headers.test already covers it
@@ -17,13 +17,16 @@
#include "AMDGPUMachineModuleInfo.h"
#include "GCNSubtarget.h"
#include "MCTargetDesc/AMDGPUMCTargetDesc.h"
+#include "Utils/AMDGPUBaseInfo.h"
#include "llvm/ADT/BitmaskEnum.h"
#include "llvm/CodeGen/MachineBasicBlock.h"
#include
@@ -0,0 +1,199 @@
+; Testing the -amdgpu-precise-memory-op option
Pierre-vh wrote:
Please generate the test using `update_llc_test_checks`, much easier to update
if/when things change.
Also I think you don't need `-verify-machineinstrs`. It's expensive and
@@ -2561,6 +2567,70 @@ bool SIMemoryLegalizer::expandAtomicCmpxchgOrRmw(const
SIMemOpInfo ,
return Changed;
}
+bool SIMemoryLegalizer::GFX9InsertWaitcntForPreciseMem(MachineFunction ) {
+ const GCNSubtarget = MF.getSubtarget();
+ const SIInstrInfo *TII =
@@ -0,0 +1,199 @@
+; Testing the -amdgpu-precise-memory-op option
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+amdgpu-precise-memory-op
-verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX9
+; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+amdgpu-precise-memory-op
@@ -641,6 +644,9 @@ class SIMemoryLegalizer final : public MachineFunctionPass {
bool expandAtomicCmpxchgOrRmw(const SIMemOpInfo ,
MachineBasicBlock::iterator );
+ bool GFX9InsertWaitcntForPreciseMem(MachineFunction );
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode",
"Enable CU wavefront execution mode"
>;
+def FeaturePreciseMemory
Pierre-vh wrote:
I'm not a fan of using a feature for this, I think we should have a backend CL
option instead. After all
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/76954
>From b5a034bd71d6925ac287a9bf4c0f86f9e70bb9d1 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:12:00 +0100
Subject: [PATCH] [AMDGPU] Introduce Code Object V6
Introduce Code Object V6 in
Pierre-vh wrote:
ping
https://github.com/llvm/llvm-project/pull/76954
Pierre-vh wrote:
I added a few more tests; I just couldn't find a way to test the flat-scratch
stuff properly.
Also, gfx904 is documented as not having absolute flat scratch, yet I don't see
anything about that in the code (no related feature). I put gfx9-generic with
flat scratch but I don't
@@ -4135,6 +4283,33 @@ Code object V5 metadata is the same as
== == =
+.. _amdgpu-amdhsa-code-object-metadata-v6:
+
+Code Object V6 Metadata
+
+.. warning::
+ Code object
@@ -520,6 +520,106 @@ Every processor supports every OS ABI (see
:ref:`amdgpu-os`) with the following
=== === = =
=== === ==
+Generic processors also exist. They group
@@ -253,6 +274,12 @@ AMDGPU::IsaVersion AMDGPU::getIsaVersion(StringRef GPU) {
case GK_GFX1151: return {11, 5, 1};
case GK_GFX1200: return {12, 0, 0};
case GK_GFX1201: return {12, 0, 1};
+
+ // Generic targets use the earliest ISA version in their group.
@@ -787,11 +788,15 @@ enum : unsigned {
EF_AMDGPU_MACH_AMDGCN_GFX942= 0x04c,
EF_AMDGPU_MACH_AMDGCN_RESERVED_0X4D = 0x04d,
EF_AMDGPU_MACH_AMDGCN_GFX1201 = 0x04e,
+ EF_AMDGPU_MACH_AMDGCN_GFX9_GENERIC = 0x04f,
+ EF_AMDGPU_MACH_AMDGCN_GFX10_1_GENERIC =
https://github.com/Pierre-vh updated
https://github.com/llvm/llvm-project/pull/76954
>From 47d4f3ed4e27f2ce2b3b33c9b0ca4838b3011f22 Mon Sep 17 00:00:00 2001
From: pvanhout
Date: Thu, 4 Jan 2024 14:12:00 +0100
Subject: [PATCH] [AMDGPU] Introduce Code Object V6
Introduce Code Object V6 in