[llvm-branch-commits] [BOLT] Support perf2bolt-N in the driver (PR #111072)
https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/111072 ___ llvm-branch-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] Backport "[ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined" (PR #112136)
https://github.com/MaskRay approved this pull request.
https://github.com/llvm/llvm-project/pull/112136
[llvm-branch-commits] [lld] Backport "[ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined" (PR #112136)
https://github.com/DianQK created
https://github.com/llvm/llvm-project/pull/112136
Backport #111945.
(cherry picked from commit 1c6688ae3449da9c8fee1e1c12c892223496fb4c)
>From f8cab50362e2b4f4523818e844fca0c622339985 Mon Sep 17 00:00:00 2001
From: Fangrui Song
Date: Fri, 11 Oct 2024 08:47:07 -0700
Subject: [PATCH] [ELF] Make shouldAddProvideSym return values consistent when
demoted to Undefined
Case: `PROVIDE(f1 = bar);` when both `f1` and `bar` are in separate
sections that would be discarded by GC.
Due to `demoteDefined`, `shouldAddProvideSym(f1)` may initially return
false (while `f1` is Defined) and later return true (after `f1` has been
demoted to Undefined).
```
addScriptReferencedSymbolsToSymTable
shouldAddProvideSym(f1): false
// the RHS (bar) is not added to `referencedSymbols` and may be GCed
declareSymbols
shouldAddProvideSym(f1): false
markLive
demoteSymbolsAndComputeIsPreemptible
// demoted f1 to Undefined
processSymbolAssignments
addSymbol
shouldAddProvideSym(f1): true
```
The inconsistency can cause `cmd->expression()` in `addSymbol` to be
evaluated, leading to `symbol not found: bar` errors (since `bar` in the
RHS is not in `referencedSymbols` and is GCed) (#111478).
Fix this by adding a `sym->isUsedInRegularObj` condition, making
`shouldAddProvideSym(f1)` values consistent. In addition, we need a
`sym->exportDynamic` condition to keep provide-shared.s working.
Fixes: ebb326a51fec37b5a47e5702e8ea157cd4f835cd
Pull Request: https://github.com/llvm/llvm-project/pull/111945
(cherry picked from commit 1c6688ae3449da9c8fee1e1c12c892223496fb4c)
---
lld/ELF/LinkerScript.cpp| 9 +-
lld/test/ELF/linkerscript/provide-defined.s | 36 +
2 files changed, 44 insertions(+), 1 deletion(-)
create mode 100644 lld/test/ELF/linkerscript/provide-defined.s
diff --git a/lld/ELF/LinkerScript.cpp b/lld/ELF/LinkerScript.cpp
index 055fa21d44ca6e..d95c5573935ec4 100644
--- a/lld/ELF/LinkerScript.cpp
+++ b/lld/ELF/LinkerScript.cpp
@@ -1718,6 +1718,13 @@ void LinkerScript::addScriptReferencedSymbolsToSymTable() {
}
bool LinkerScript::shouldAddProvideSym(StringRef symName) {
+ // This function is called before and after garbage collection. To prevent
+ // undefined references from the RHS, the result of this function for a
+ // symbol must be the same for each call. We use isUsedInRegularObj to not
+ // change the return value of a demoted symbol. The exportDynamic condition,
+ // while not so accurate, allows PROVIDE to define a symbol referenced by a
+ // DSO.
Symbol *sym = symtab.find(symName);
- return sym && !sym->isDefined() && !sym->isCommon();
+ return sym && !sym->isDefined() && !sym->isCommon() &&
+ (sym->isUsedInRegularObj || sym->exportDynamic);
}
diff --git a/lld/test/ELF/linkerscript/provide-defined.s b/lld/test/ELF/linkerscript/provide-defined.s
new file mode 100644
index 00..1d44bef3d4068d
--- /dev/null
+++ b/lld/test/ELF/linkerscript/provide-defined.s
@@ -0,0 +1,36 @@
+# REQUIRES: x86
+## Test the GC behavior when the PROVIDE symbol is defined by a relocatable file.
+
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=x86_64 a.s -o a.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64 b.s -o b.o
+# RUN: ld.lld -T a.t --gc-sections a.o b.o -o a
+# RUN: llvm-readelf -s a | FileCheck %s
+
+# CHECK: 1: {{.*}} 0 NOTYPE GLOBAL DEFAULT 1 _start
+# CHECK-NEXT:2: {{.*}} 0 NOTYPE GLOBAL DEFAULT 2 f3
+# CHECK-NOT: {{.}}
+
+#--- a.s
+.global _start, f1, f2, f3, bar
+_start:
+ call f3
+
+.section .text.f1,"ax"; f1:
+.section .text.f2,"ax"; f2: # referenced by another relocatable file
+.section .text.f3,"ax"; f3: # live
+.section .text.bar,"ax"; bar:
+
+.comm comm,4,4
+
+#--- b.s
+ call f2
+
+#--- a.t
+SECTIONS {
+ . = . + SIZEOF_HEADERS;
+ PROVIDE(f1 = bar+1);
+ PROVIDE(f2 = bar+2);
+ PROVIDE(f3 = bar+3);
+ PROVIDE(f4 = comm+4);
+}
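The inconsistency described in the commit message can be reproduced in a small stand-alone model. The sketch below is not lld code: `ToySymbol`, `shouldAddProvideSymOld`, `shouldAddProvideSymNew`, and `demoteToUndefined` are hypothetical names, and the assumption that demotion also clears the "used in regular object" flag is my reading of the patch comment, not something shown in this diff.

```cpp
// Toy stand-in for lld's Symbol; it holds only the state the predicate reads.
struct ToySymbol {
  bool Defined = false;          // models sym->isDefined()
  bool Common = false;           // models sym->isCommon()
  bool UsedInRegularObj = false; // models sym->isUsedInRegularObj
  bool ExportDynamic = false;    // models sym->exportDynamic
};

// Predicate as it was before the patch: depends only on the symbol kind.
bool shouldAddProvideSymOld(const ToySymbol *Sym) {
  return Sym && !Sym->Defined && !Sym->Common;
}

// Predicate after the patch: additionally requires a surviving reference.
bool shouldAddProvideSymNew(const ToySymbol *Sym) {
  return Sym && !Sym->Defined && !Sym->Common &&
         (Sym->UsedInRegularObj || Sym->ExportDynamic);
}

// Models demotion of a Defined symbol whose section was discarded by GC.
// Assumption: demotion also clears the "used in regular object" flag.
void demoteToUndefined(ToySymbol &Sym) {
  Sym.Defined = false;
  Sym.UsedInRegularObj = false;
}
```

For a symbol such as `f1` that starts out Defined, the old predicate flips from false to true across `demoteToUndefined`, while the new one stays false both times, which is the consistency the patch is after.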
[llvm-branch-commits] [lld] Backport "[ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined" (PR #112136)
llvmbot wrote:
@llvm/pr-subscribers-lld
Author: DianQK (DianQK)
Changes
Backport #111945.
(cherry picked from commit 1c6688ae3449da9c8fee1e1c12c892223496fb4c)
---
Full diff: https://github.com/llvm/llvm-project/pull/112136.diff
2 Files Affected:
- (modified) lld/ELF/LinkerScript.cpp (+8-1)
- (added) lld/test/ELF/linkerscript/provide-defined.s (+36)
https://github.com/llvm/llvm-project/pull/112136
[llvm-branch-commits] [lld] Backport "[ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined" (PR #112136)
https://github.com/DianQK milestoned
https://github.com/llvm/llvm-project/pull/112136
[llvm-branch-commits] [BOLT] Support perf2bolt-N in the driver (PR #111072)
https://github.com/maksfb approved this pull request.
LGTM. Please add the description of the problem this PR fixes and link any related issue(s).
https://github.com/llvm/llvm-project/pull/111072
[llvm-branch-commits] [llvm] [NewPM][CodeGen] Port LiveRegMatrix to NPM (PR #109938)
https://github.com/optimisan updated
https://github.com/llvm/llvm-project/pull/109938
>From d4cc049c53df27919103625417730595fc2183d7 Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Tue, 24 Sep 2024 09:07:04 +
Subject: [PATCH 1/4] [NewPM][CodeGen] Port LiveRegMatrix to NPM
---
llvm/include/llvm/CodeGen/LiveRegMatrix.h | 50 ---
llvm/include/llvm/InitializePasses.h | 2 +-
.../llvm/Passes/MachinePassRegistry.def | 4 +-
llvm/lib/CodeGen/LiveRegMatrix.cpp| 38 ++
llvm/lib/CodeGen/RegAllocBasic.cpp| 8 +--
llvm/lib/CodeGen/RegAllocGreedy.cpp | 8 +--
llvm/lib/Passes/PassBuilder.cpp | 1 +
llvm/lib/Target/AMDGPU/GCNNSAReassign.cpp | 6 +--
.../Target/AMDGPU/SIPreAllocateWWMRegs.cpp| 6 +--
9 files changed, 88 insertions(+), 35 deletions(-)
diff --git a/llvm/include/llvm/CodeGen/LiveRegMatrix.h b/llvm/include/llvm/CodeGen/LiveRegMatrix.h
index 2b32308c7c075e..c024ca9c1dc38d 100644
--- a/llvm/include/llvm/CodeGen/LiveRegMatrix.h
+++ b/llvm/include/llvm/CodeGen/LiveRegMatrix.h
@@ -37,7 +37,9 @@ class MachineFunction;
class TargetRegisterInfo;
class VirtRegMap;
-class LiveRegMatrix : public MachineFunctionPass {
+class LiveRegMatrix {
+ friend class LiveRegMatrixWrapperPass;
+ friend class LiveRegMatrixAnalysis;
const TargetRegisterInfo *TRI = nullptr;
LiveIntervals *LIS = nullptr;
VirtRegMap *VRM = nullptr;
@@ -57,15 +59,21 @@ class LiveRegMatrix : public MachineFunctionPass {
unsigned RegMaskVirtReg = 0;
BitVector RegMaskUsable;
- // MachineFunctionPass boilerplate.
- void getAnalysisUsage(AnalysisUsage &) const override;
- bool runOnMachineFunction(MachineFunction &) override;
- void releaseMemory() override;
+ LiveRegMatrix() = default;
+ void releaseMemory();
public:
- static char ID;
-
- LiveRegMatrix();
+ LiveRegMatrix(LiveRegMatrix &&Other)
+ : TRI(Other.TRI), LIS(Other.LIS), VRM(Other.VRM), UserTag(Other.UserTag),
+Matrix(std::move(Other.Matrix)), Queries(std::move(Other.Queries)),
+RegMaskTag(Other.RegMaskTag), RegMaskVirtReg(Other.RegMaskVirtReg),
+RegMaskUsable(std::move(Other.RegMaskUsable)) {
+Other.TRI = nullptr;
+Other.LIS = nullptr;
+Other.VRM = nullptr;
+ }
+
+ void init(MachineFunction &MF, LiveIntervals *LIS, VirtRegMap *VRM);
//======//
// High-level interface.
@@ -159,6 +167,32 @@ class LiveRegMatrix : public MachineFunctionPass {
Register getOneVReg(unsigned PhysReg) const;
};
+class LiveRegMatrixWrapperPass : public MachineFunctionPass {
+ LiveRegMatrix LRM;
+
+public:
+ static char ID;
+
+ LiveRegMatrixWrapperPass() : MachineFunctionPass(ID) {}
+
+ LiveRegMatrix &getLRM() { return LRM; }
+ const LiveRegMatrix &getLRM() const { return LRM; }
+
+ void getAnalysisUsage(AnalysisUsage &AU) const override;
+ bool runOnMachineFunction(MachineFunction &MF) override;
+ void releaseMemory() override;
+};
+
+class LiveRegMatrixAnalysis : public AnalysisInfoMixin {
+ friend AnalysisInfoMixin;
+ static AnalysisKey Key;
+
+public:
+ using Result = LiveRegMatrix;
+
+ LiveRegMatrix run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM);
+};
+
} // end namespace llvm
#endif // LLVM_CODEGEN_LIVEREGMATRIX_H
diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index d89a5538b46975..3fee8c40a6607e 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -156,7 +156,7 @@ void initializeLiveDebugValuesPass(PassRegistry &);
void initializeLiveDebugVariablesPass(PassRegistry &);
void initializeLiveIntervalsWrapperPassPass(PassRegistry &);
void initializeLiveRangeShrinkPass(PassRegistry &);
-void initializeLiveRegMatrixPass(PassRegistry &);
+void initializeLiveRegMatrixWrapperPassPass(PassRegistry &);
void initializeLiveStacksPass(PassRegistry &);
void initializeLiveVariablesWrapperPassPass(PassRegistry &);
void initializeLoadStoreOptPass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def
index bdc56ca03f392a..4497c1fce0db69 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -97,6 +97,7 @@ LOOP_PASS("loop-term-fold", LoopTermFoldPass())
// preferably fix the scavenger to not depend on them).
MACHINE_FUNCTION_ANALYSIS("live-intervals", LiveIntervalsAnalysis())
MACHINE_FUNCTION_ANALYSIS("live-vars", LiveVariablesAnalysis())
+MACHINE_FUNCTION_ANALYSIS("live-reg-matrix", LiveRegMatrixAnalysis())
MACHINE_FUNCTION_ANALYSIS("machine-block-freq", MachineBlockFrequencyAnalysis())
MACHINE_FUNCTION_ANALYSIS("machine-branch-prob", MachineBranchProbabilityAnalysis())
@@ -122,8 +123,7 @@ MACHINE_FUNCTION_ANALYSIS("virtregmap", VirtRegMapAnalysis())
// MachineRegionInf
[llvm-branch-commits] [llvm] [NewPM][AMDGPU] Port SIPreAllocateWWMRegs to NPM (PR #109939)
https://github.com/optimisan updated
https://github.com/llvm/llvm-project/pull/109939
>From af1a1f15867edef93e69c43037a19ab69e8ec2e3 Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Tue, 24 Sep 2024 11:41:18 +
Subject: [PATCH 1/2] [NewPM][AMDGPU] Port SIPreAllocateWWMRegs to NPM
---
llvm/lib/Target/AMDGPU/AMDGPU.h | 6 +-
llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def | 1 +
.../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 7 ++-
.../Target/AMDGPU/SIPreAllocateWWMRegs.cpp| 60 ---
llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.h | 25
.../AMDGPU/si-pre-allocate-wwm-regs.mir | 20 +++
6 files changed, 92 insertions(+), 27 deletions(-)
create mode 100644 llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.h
diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h
index 342d55e828bca5..95d0ad0f9dc96a 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.h
@@ -49,7 +49,7 @@ FunctionPass *createSIFixSGPRCopiesLegacyPass();
FunctionPass *createLowerWWMCopiesPass();
FunctionPass *createSIMemoryLegalizerPass();
FunctionPass *createSIInsertWaitcntsPass();
-FunctionPass *createSIPreAllocateWWMRegsPass();
+FunctionPass *createSIPreAllocateWWMRegsLegacyPass();
FunctionPass *createSIFormMemoryClausesPass();
FunctionPass *createSIPostRABundlerPass();
@@ -212,8 +212,8 @@ extern char &SILateBranchLoweringPassID;
void initializeSIOptimizeExecMaskingPass(PassRegistry &);
extern char &SIOptimizeExecMaskingID;
-void initializeSIPreAllocateWWMRegsPass(PassRegistry &);
-extern char &SIPreAllocateWWMRegsID;
+void initializeSIPreAllocateWWMRegsLegacyPass(PassRegistry &);
+extern char &SIPreAllocateWWMRegsLegacyID;
void initializeAMDGPUImageIntrinsicOptimizerPass(PassRegistry &);
extern char &AMDGPUImageIntrinsicOptimizerID;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
index 0ebf34c901c142..174a90f0aa419d 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
@@ -102,5 +102,6 @@ MACHINE_FUNCTION_PASS("gcn-dpp-combine", GCNDPPCombinePass())
MACHINE_FUNCTION_PASS("si-load-store-opt", SILoadStoreOptimizerPass())
MACHINE_FUNCTION_PASS("si-lower-sgpr-spills", SILowerSGPRSpillsPass())
MACHINE_FUNCTION_PASS("si-peephole-sdwa", SIPeepholeSDWAPass())
+MACHINE_FUNCTION_PASS("si-pre-allocate-wwm-regs", SIPreAllocateWWMRegsPass())
MACHINE_FUNCTION_PASS("si-shrink-instructions", SIShrinkInstructionsPass())
#undef MACHINE_FUNCTION_PASS
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 23ee0c3e896eb3..f367b5fbea45af 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -41,6 +41,7 @@
#include "SIMachineFunctionInfo.h"
#include "SIMachineScheduler.h"
#include "SIPeepholeSDWA.h"
+#include "SIPreAllocateWWMRegs.h"
#include "SIShrinkInstructions.h"
#include "TargetInfo/AMDGPUTargetInfo.h"
#include "Utils/AMDGPUBaseInfo.h"
@@ -508,7 +509,7 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAMDGPUTarget() {
initializeSILateBranchLoweringPass(*PR);
initializeSIMemoryLegalizerPass(*PR);
initializeSIOptimizeExecMaskingPass(*PR);
- initializeSIPreAllocateWWMRegsPass(*PR);
+ initializeSIPreAllocateWWMRegsLegacyPass(*PR);
initializeSIFormMemoryClausesPass(*PR);
initializeSIPostRABundlerPass(*PR);
initializeGCNCreateVOPDPass(*PR);
@@ -1506,7 +1507,7 @@ bool GCNPassConfig::addRegAssignAndRewriteFast() {
addPass(&SILowerSGPRSpillsLegacyID);
// To Allocate wwm registers used in whole quad mode operations (for shaders).
- addPass(&SIPreAllocateWWMRegsID);
+ addPass(&SIPreAllocateWWMRegsLegacyID);
// For allocating other wwm register operands.
addPass(createWWMRegAllocPass(false));
@@ -1543,7 +1544,7 @@ bool GCNPassConfig::addRegAssignAndRewriteOptimized() {
addPass(&SILowerSGPRSpillsLegacyID);
// To Allocate wwm registers used in whole quad mode operations (for shaders).
- addPass(&SIPreAllocateWWMRegsID);
+ addPass(&SIPreAllocateWWMRegsLegacyID);
// For allocating other whole wave mode registers.
addPass(createWWMRegAllocPass(true));
diff --git a/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp b/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
index 07303e2aa726c5..f9109c01c8085b 100644
--- a/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
+++ b/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
@@ -11,6 +11,7 @@
//
//===--===//
+#include "SIPreAllocateWWMRegs.h"
#include "AMDGPU.h"
#include "GCNSubtarget.h"
#include "MCTargetDesc/AMDGPUMCTargetDesc.h"
@@ -34,7 +35,7 @@ static cl::opt
namespace {
-class SIPreAllocateWWMRegs : public MachineFunctionPass {
+class SIPreAllocateWWMRegs {
private:
const SIInstrInfo *TII;
const SIRegisterInfo *TRI;
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
https://github.com/optimisan updated
https://github.com/llvm/llvm-project/pull/109963
>From 2cefaf6d479b6c7ae6bc8a2267f8e4fee274923c Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Wed, 25 Sep 2024 11:21:04 +
Subject: [PATCH 1/2] [AMDGPU] Add tests for SIPreAllocateWWMRegs
---
.../AMDGPU/si-pre-allocate-wwm-regs.mir | 26 +++
.../si-pre-allocate-wwm-sgpr-spills.mir | 21 +++
2 files changed, 47 insertions(+)
create mode 100644 llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
create mode 100644 llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir
diff --git a/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
new file mode 100644
index 00..f2db299f575f5e
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
@@ -0,0 +1,26 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+
+---
+
+name: pre_allocate_wwm_regs_strict
+tracksRegLiveness: true
+body: |
+ bb.0:
+liveins: $sgpr1
+; CHECK-LABEL: name: pre_allocate_wwm_regs_strict
+; CHECK: liveins: $sgpr1
+; CHECK-NEXT: {{ $}}
+; CHECK-NEXT: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+; CHECK-NEXT: renamable $sgpr4_sgpr5 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
+; CHECK-NEXT: $vgpr0 = V_MOV_B32_e32 0, implicit $exec
+; CHECK-NEXT: dead $vgpr0 = V_MOV_B32_dpp $vgpr0, [[DEF]], 323, 12, 15, 0, implicit $exec
+; CHECK-NEXT: $exec = EXIT_STRICT_WWM killed renamable $sgpr4_sgpr5
+; CHECK-NEXT: dead [[COPY:%[0-9]+]]:vgpr_32 = COPY [[DEF]]
+%0:vgpr_32 = IMPLICIT_DEF
+renamable $sgpr4_sgpr5 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
+%24:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+%25:vgpr_32 = V_MOV_B32_dpp %24:vgpr_32(tied-def 0), %0:vgpr_32, 323, 12, 15, 0, implicit $exec
+$exec = EXIT_STRICT_WWM killed renamable $sgpr4_sgpr5
+%2:vgpr_32 = COPY %0:vgpr_32
+...
diff --git a/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir
new file mode 100644
index 00..f0efe74878d831
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir
@@ -0,0 +1,21 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+
+---
+
+name: pre_allocate_wwm_spill_to_vgpr
+tracksRegLiveness: true
+body: |
+ bb.0:
+liveins: $sgpr1
+; CHECK-LABEL: name: pre_allocate_wwm_spill_to_vgpr
+; CHECK: liveins: $sgpr1
+; CHECK-NEXT: {{ $}}
+; CHECK-NEXT: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+; CHECK-NEXT: dead $vgpr0 = SI_SPILL_S32_TO_VGPR $sgpr1, 0, [[DEF]]
+; CHECK-NEXT: dead [[COPY:%[0-9]+]]:vgpr_32 = COPY [[DEF]]
+%0:vgpr_32 = IMPLICIT_DEF
+%23:vgpr_32 = SI_SPILL_S32_TO_VGPR $sgpr1, 0, %0:vgpr_32
+%2:vgpr_32 = COPY %0:vgpr_32
+...
+
>From 9bddae336227b80ba45be7d7f16ddc4f49fd0a15 Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Mon, 7 Oct 2024 09:13:04 +
Subject: [PATCH 2/2] Keep tests in one file
---
.../AMDGPU/si-pre-allocate-wwm-regs.mir | 24 ---
.../si-pre-allocate-wwm-sgpr-spills.mir | 21
2 files changed, 21 insertions(+), 24 deletions(-)
delete mode 100644 llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir
diff --git a/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
index f2db299f575f5e..74a221084dce24 100644
--- a/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
+++ b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
@@ -1,5 +1,6 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
---
@@ -19,8 +20,25 @@ body: |
; CHECK-NEXT: dead [[COPY:%[0-9]+]]:vgpr_32 = COPY [[DEF]]
%0:vgpr_32 = IMPLICIT_DEF
renamable $sgpr4_sgpr5 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
-%24:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
-%25:vgpr_32 = V_MOV_B32_dpp %24:vgpr_32(tied-def 0), %0:vgpr_32, 323, 12, 15, 0, implicit $exec
+%1:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+%2:vgpr_32 = V_MOV_B32_dpp %1, %0, 323, 12, 15, 0, implicit $exec
$exec = EXIT_STRICT_
[llvm-branch-commits] [llvm] [CodeGen] LiveIntervalUnions::Array Implement move constructor (PR #111357)
https://github.com/optimisan updated
https://github.com/llvm/llvm-project/pull/111357
>From dbc51871aab3d4b5d7d64ef78f2df7833359b17f Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Mon, 7 Oct 2024 08:42:24 +
Subject: [PATCH] [CodeGen] LiveIntervalUnions::Array Implement move
constructor
---
llvm/include/llvm/CodeGen/LiveIntervalUnion.h | 7 +++
1 file changed, 7 insertions(+)
diff --git a/llvm/include/llvm/CodeGen/LiveIntervalUnion.h b/llvm/include/llvm/CodeGen/LiveIntervalUnion.h
index 81003455da4241..cc0f2a45bb182c 100644
--- a/llvm/include/llvm/CodeGen/LiveIntervalUnion.h
+++ b/llvm/include/llvm/CodeGen/LiveIntervalUnion.h
@@ -176,6 +176,13 @@ class LiveIntervalUnion {
Array() = default;
~Array() { clear(); }
+Array(Array &&Other) : Size(Other.Size), LIUs(Other.LIUs) {
+ Other.Size = 0;
+ Other.LIUs = nullptr;
+}
+
+Array(const Array &) = delete;
+
// Initialize the array to have Size entries.
// Reuse an existing allocation if the size matches.
void init(LiveIntervalUnion::Allocator&, unsigned Size);
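The move constructor in this patch follows the usual pattern for a class that owns a raw allocation and frees it in its destructor: take over the pointer, then reset the source so that its destructor has nothing left to release. Below is a minimal stand-alone sketch of that pattern. `OwnedArray` and its `int` payload are hypothetical and only mirror the shape of `LiveIntervalUnion::Array`, not its real contents.

```cpp
#include <cstdlib>
#include <utility>

// Hypothetical owner of a heap buffer, shaped like the class in the patch.
class OwnedArray {
  unsigned Size = 0;
  int *Elems = nullptr;

public:
  OwnedArray() = default;
  explicit OwnedArray(unsigned N)
      : Size(N), Elems(static_cast<int *>(std::calloc(N, sizeof(int)))) {}
  ~OwnedArray() { clear(); }

  // Steal the buffer, then null out the source so its destructor is a no-op.
  OwnedArray(OwnedArray &&Other) : Size(Other.Size), Elems(Other.Elems) {
    Other.Size = 0;
    Other.Elems = nullptr;
  }

  // Copying would lead to a double free, so it is disabled.
  OwnedArray(const OwnedArray &) = delete;

  void clear() {
    std::free(Elems); // free(nullptr) is a no-op
    Elems = nullptr;
    Size = 0;
  }

  unsigned size() const { return Size; }
  bool empty() const { return Elems == nullptr; }
};
```

Without the two resets in the move constructor, both objects would free the same buffer when they go out of scope.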
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
arsenm wrote:
Move the -mcpu together with -mtriple
https://github.com/llvm/llvm-project/pull/109963
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---
arsenm wrote:
Add fixme to check the MachineFunctionInfo reserved register information
https://github.com/llvm/llvm-project/pull/109963
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/109963
[llvm-branch-commits] [llvm] [NewPM][AMDGPU] Port SIPreAllocateWWMRegs to NPM (PR #109939)
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/109939
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---
arsenm wrote:
Yes
https://github.com/llvm/llvm-project/pull/109963
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---
optimisan wrote:
Is it for WWM reserved registers?
https://github.com/llvm/llvm-project/pull/109963
[llvm-branch-commits] [llvm] [AMDGPU] Skip non-first termintors when forcing emit zero flag (PR #112116)
https://github.com/arsenm edited
https://github.com/llvm/llvm-project/pull/112116
[llvm-branch-commits] [llvm] [AMDGPU] Skip non-first termintors when forcing emit zero flag (PR #112116)
@@ -1825,7 +1836,9 @@ bool SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI,
// Verify that the wait is actually needed.
ScoreBrackets.simplifyWaitcnt(Wait);
- if (ForceEmitZeroFlag)
+ // When forcing emit, we need to skip non-first terminators of a MBB because
+ // that would break the terminators of the MBB.
+ if (ForceEmitZeroFlag && !checkIfMBBNonFirstTerminator(MI))
arsenm wrote:
You're scanning the terminators for every instruction. Can you adjust the outer iterator logic to skip the terminators in the first place?
https://github.com/llvm/llvm-project/pull/112116
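One way to read the review suggestion is to let the loop that walks the block remember whether it has already passed a terminator, so that no per-instruction rescan is needed. The sketch below illustrates that idea on a toy block. `SketchInstr` and `markForcedWaits` are hypothetical names, and this is not the SIInsertWaitcnts code or the change that was eventually made.

```cpp
#include <vector>

// Hypothetical instruction record; real code would use MachineInstr.
struct SketchInstr {
  bool IsTerminator = false;
  bool ForcedWait = false; // set when a forced wait would be emitted before it
};

// Walks the block once. Every non-terminator and the first terminator get a
// forced wait; later terminators are skipped so the terminator sequence stays
// contiguous. Returns the number of instructions that were marked.
unsigned markForcedWaits(std::vector<SketchInstr> &Block) {
  unsigned Count = 0;
  bool SeenTerminator = false;
  for (SketchInstr &I : Block) {
    if (I.IsTerminator && SeenTerminator)
      continue; // non-first terminator: do not insert anything before it
    SeenTerminator |= I.IsTerminator;
    I.ForcedWait = true;
    ++Count;
  }
  return Count;
}
```

The state lives in the outer loop, so the cost per instruction is constant regardless of how many terminators the block has.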
[llvm-branch-commits] [llvm] [AMDGPU] Skip non-first termintors when forcing emit zero flag (PR #112116)
@@ -1600,6 +1600,17 @@ static bool callWaitsOnFunctionReturn(const MachineInstr &MI) {
return true;
}
+/// \returns true if \p MI is not the first terminator of its associated MBB.
+static bool checkIfMBBNonFirstTerminator(const MachineInstr &MI) {
+ const auto &MBB = MI.getParent();
+ if (MBB->getFirstTerminator() == MI)
+return false;
+ for (const auto &I : MBB->terminators())
+if (&I == &MI)
+ return true;
arsenm wrote:
This iterator logic is clumsy (you're effectively using getFirstTerminator
twice)
https://github.com/llvm/llvm-project/pull/112116
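The helper quoted above first compares against the first terminator and then walks the terminator range again. A single pass that tracks whether an earlier terminator has been seen gives the same answer for a well-formed block. The following is a sketch on a toy block model; `ToyInstr`, `ToyBlock`, and `isNonFirstTerminator` are hypothetical names and do not use the MachineBasicBlock API.

```cpp
#include <vector>

// Hypothetical instruction record and block; terminators sit at the end.
struct ToyInstr {
  bool IsTerminator = false;
};
using ToyBlock = std::vector<ToyInstr>;

// Returns true if MI is a terminator of Block that is preceded by another
// terminator. MI is identified by address, as in the quoted helper.
bool isNonFirstTerminator(const ToyBlock &Block, const ToyInstr &MI) {
  bool SeenTerminator = false;
  for (const ToyInstr &I : Block) {
    if (&I == &MI)
      return I.IsTerminator && SeenTerminator;
    SeenTerminator |= I.IsTerminator;
  }
  return false; // MI does not belong to this block
}
```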
[llvm-branch-commits] [llvm] Update correct dependency (PR #109937)
https://github.com/optimisan updated
https://github.com/llvm/llvm-project/pull/109937
>From ca685074a7f8bfc75e40dd8172ce9e731e991f4d Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Tue, 24 Sep 2024 06:35:43 +
Subject: [PATCH] Update correct dependency
Replace unused analysis (VirtRegMap) dependency with the used one (SlotIndexes)
---
llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp b/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
index 4afefa3d9b245c..d8697aa2ffe1cd 100644
--- a/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
+++ b/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
@@ -95,8 +95,8 @@ char SILowerSGPRSpillsLegacy::ID = 0;
INITIALIZE_PASS_BEGIN(SILowerSGPRSpillsLegacy, DEBUG_TYPE,
"SI lower SGPR spill instructions", false, false)
INITIALIZE_PASS_DEPENDENCY(LiveIntervalsWrapperPass)
-INITIALIZE_PASS_DEPENDENCY(VirtRegMapWrapperLegacy)
INITIALIZE_PASS_DEPENDENCY(MachineDominatorTreeWrapperPass)
+INITIALIZE_PASS_DEPENDENCY(SlotIndexesWrapperPass)
INITIALIZE_PASS_END(SILowerSGPRSpillsLegacy, DEBUG_TYPE,
"SI lower SGPR spill instructions", false, false)
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---
arsenm wrote:
Then you need to write manual checks for it. Update_mir_test_checks doesn't currently support the function level properties
https://github.com/llvm/llvm-project/pull/109963
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---
optimisan wrote:
But we have that already(?)
```
wwmReservedRegs:
  - '$vgpr0'
```
https://github.com/llvm/llvm-project/pull/109963
