[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-05 Thread Kristof Beyls via llvm-branch-commits


@@ -68,6 +156,57 @@ struct SLSHardeningInserter : 
ThunkInserter {
 
 } // end anonymous namespace
 
+const ThunkKind ThunkKind::BR = {ThunkBR, "", false, false, AArch64::BR};
+const ThunkKind ThunkKind::BRAA = {ThunkBRAA, "aa_", true, true, 
AArch64::BRAA};
+const ThunkKind ThunkKind::BRAB = {ThunkBRAB, "ab_", true, true, 
AArch64::BRAB};
+const ThunkKind ThunkKind::BRAAZ = {ThunkBRAAZ, "aaz_", false, true,
+AArch64::BRAAZ};
+const ThunkKind ThunkKind::BRABZ = {ThunkBRABZ, "abz_", false, true,
+AArch64::BRABZ};

kbeyls wrote:

very minor nitpick: These probably can go directly after the definition of 
`struct ThunkKind`?
If so, it's easier to understand what all the "false" and "true"s mean here 
without having to scroll a lot to where `struct ThunkKind` is defined.

https://github.com/llvm/llvm-project/pull/97605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objdump] -r: support CREL (PR #97382)

2024-07-05 Thread James Henderson via llvm-branch-commits


@@ -1117,9 +1155,11 @@ void ELFObjectFile::getRelocationTypeName(
 template 
 Expected
 ELFObjectFile::getRelocationAddend(DataRefImpl Rel) const {
-  if (getRelSection(Rel)->sh_type != ELF::SHT_RELA)
-return createError("Section is not SHT_RELA");
-  return (int64_t)getRela(Rel)->r_addend;
+  if (getRelSection(Rel)->sh_type == ELF::SHT_RELA)
+return (int64_t)getRela(Rel)->r_addend;
+  if (getRelSection(Rel)->sh_type == ELF::SHT_CREL)
+return (int64_t)getCrel(Rel).r_addend;
+  return createError("Section is not SHT_RELA");

jh7370 wrote:

I'm not sure this error quite makes sense anymore. Probably needs to say 
something about addends.

https://github.com/llvm/llvm-project/pull/97382
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [CodeGen] Add dump() to MachineTraceMetrics.h (PR #97799)

2024-07-05 Thread Pengcheng Wang via llvm-branch-commits

https://github.com/wangpc-pp created 
https://github.com/llvm/llvm-project/pull/97799

To enhance debugging.



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-05 Thread Pengcheng Wang via llvm-branch-commits

https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/80124

>From e3fb1fe7bdd4b7c24f9361c4d14dd1206fc8c067 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Sun, 18 Feb 2024 11:12:16 +0800
Subject: [PATCH 1/2] Move after addIRPasses

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fdf1c023fff87..7a26e1956424c 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -450,15 +450,15 @@ void RISCVPassConfig::addIRPasses() {
 if (EnableLoopDataPrefetch)
   addPass(createLoopDataPrefetchPass());
 
-if (EnableSelectOpt && getOptLevel() == CodeGenOptLevel::Aggressive)
-  addPass(createSelectOptimizePass());
-
 addPass(createRISCVGatherScatterLoweringPass());
 addPass(createInterleavedAccessPass());
 addPass(createRISCVCodeGenPreparePass());
   }
 
   TargetPassConfig::addIRPasses();
+
+  if (getOptLevel() == CodeGenOptLevel::Aggressive && EnableSelectOpt)
+addPass(createSelectOptimizePass());
 }
 
 bool RISCVPassConfig::addPreISel() {

>From 5d5398596dc30c47c67572ec20137fb3f9434940 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Wed, 21 Feb 2024 21:21:28 +0800
Subject: [PATCH 2/2] Fix test

Created using spr 1.3.4
---
 llvm/test/CodeGen/RISCV/O3-pipeline.ll | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll 
b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 62c1af52e6c20..8b52e3fe7b2f1 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -34,15 +34,6 @@
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Scalar Evolution Analysis
 ; CHECK-NEXT:   Loop Data Prefetch
-; CHECK-NEXT:   Post-Dominator Tree Construction
-; CHECK-NEXT:   Branch Probability Analysis
-; CHECK-NEXT:   Block Frequency Analysis
-; CHECK-NEXT:   Lazy Branch Probability Analysis
-; CHECK-NEXT:   Lazy Block Frequency Analysis
-; CHECK-NEXT:   Optimization Remark Emitter
-; CHECK-NEXT:   Optimize selects
-; CHECK-NEXT:   Dominator Tree Construction
-; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   RISC-V gather/scatter lowering
 ; CHECK-NEXT:   Interleaved Access Pass
 ; CHECK-NEXT:   RISC-V CodeGenPrepare
@@ -77,6 +68,15 @@
 ; CHECK-NEXT:   Expand reduction intrinsics
 ; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   TLS Variable Hoist
+; CHECK-NEXT:   Post-Dominator Tree Construction
+; CHECK-NEXT:   Branch Probability Analysis
+; CHECK-NEXT:   Block Frequency Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Optimization Remark Emitter
+; CHECK-NEXT:   Optimize selects
+; CHECK-NEXT:   Dominator Tree Construction
+; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   CodeGen Prepare
 ; CHECK-NEXT:   Dominator Tree Construction
 ; CHECK-NEXT:   Exception handling preparation

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-05 Thread Pengcheng Wang via llvm-branch-commits

https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/80124

>From e3fb1fe7bdd4b7c24f9361c4d14dd1206fc8c067 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Sun, 18 Feb 2024 11:12:16 +0800
Subject: [PATCH 1/2] Move after addIRPasses

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fdf1c023fff878..7a26e1956424cb 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -450,15 +450,15 @@ void RISCVPassConfig::addIRPasses() {
 if (EnableLoopDataPrefetch)
   addPass(createLoopDataPrefetchPass());
 
-if (EnableSelectOpt && getOptLevel() == CodeGenOptLevel::Aggressive)
-  addPass(createSelectOptimizePass());
-
 addPass(createRISCVGatherScatterLoweringPass());
 addPass(createInterleavedAccessPass());
 addPass(createRISCVCodeGenPreparePass());
   }
 
   TargetPassConfig::addIRPasses();
+
+  if (getOptLevel() == CodeGenOptLevel::Aggressive && EnableSelectOpt)
+addPass(createSelectOptimizePass());
 }
 
 bool RISCVPassConfig::addPreISel() {

>From 5d5398596dc30c47c67572ec20137fb3f9434940 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Wed, 21 Feb 2024 21:21:28 +0800
Subject: [PATCH 2/2] Fix test

Created using spr 1.3.4
---
 llvm/test/CodeGen/RISCV/O3-pipeline.ll | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll 
b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 62c1af52e6c20e..8b52e3fe7b2f15 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -34,15 +34,6 @@
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Scalar Evolution Analysis
 ; CHECK-NEXT:   Loop Data Prefetch
-; CHECK-NEXT:   Post-Dominator Tree Construction
-; CHECK-NEXT:   Branch Probability Analysis
-; CHECK-NEXT:   Block Frequency Analysis
-; CHECK-NEXT:   Lazy Branch Probability Analysis
-; CHECK-NEXT:   Lazy Block Frequency Analysis
-; CHECK-NEXT:   Optimization Remark Emitter
-; CHECK-NEXT:   Optimize selects
-; CHECK-NEXT:   Dominator Tree Construction
-; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   RISC-V gather/scatter lowering
 ; CHECK-NEXT:   Interleaved Access Pass
 ; CHECK-NEXT:   RISC-V CodeGenPrepare
@@ -77,6 +68,15 @@
 ; CHECK-NEXT:   Expand reduction intrinsics
 ; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   TLS Variable Hoist
+; CHECK-NEXT:   Post-Dominator Tree Construction
+; CHECK-NEXT:   Branch Probability Analysis
+; CHECK-NEXT:   Block Frequency Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Optimization Remark Emitter
+; CHECK-NEXT:   Optimize selects
+; CHECK-NEXT:   Dominator Tree Construction
+; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   CodeGen Prepare
 ; CHECK-NEXT:   Dominator Tree Construction
 ; CHECK-NEXT:   Exception handling preparation

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)

2024-07-05 Thread Chuanqi Xu via llvm-branch-commits

ChuanqiXu9 wrote:

Sorry for bothering, I tried a new manner to update the patch but it shows I 
should better : (

https://github.com/llvm/llvm-project/pull/83237
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [libc] [lld] [lldb] [llvm] [Serialization] Introduce OnDiskHashTable for specializations (PR #83233)

2024-07-05 Thread Chuanqi Xu via llvm-branch-commits

https://github.com/ChuanqiXu9 updated 
https://github.com/llvm/llvm-project/pull/83233

error: too big or took too long to generate
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 82a6d15 - Revert "Revert "[symbolizer] Empty string is not an error" (#94424)"

2024-07-04 Thread via llvm-branch-commits

Author: Serge Pavlov
Date: 2024-07-05T09:10:03+07:00
New Revision: 82a6d1572ff4de1491ea58eb167967350eace9fa

URL: 
https://github.com/llvm/llvm-project/commit/82a6d1572ff4de1491ea58eb167967350eace9fa
DIFF: 
https://github.com/llvm/llvm-project/commit/82a6d1572ff4de1491ea58eb167967350eace9fa.diff

LOG: Revert "Revert "[symbolizer] Empty string is not an error" (#94424)"

This reverts commit c0bb16eaf7a6c16edadfd05ba4168fa536c227e2.

Added: 


Modified: 
llvm/test/tools/llvm-symbolizer/get-input-file.test
llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp

Removed: 




diff  --git a/llvm/test/tools/llvm-symbolizer/get-input-file.test 
b/llvm/test/tools/llvm-symbolizer/get-input-file.test
index 8c21816591c81..50eb051968718 100644
--- a/llvm/test/tools/llvm-symbolizer/get-input-file.test
+++ b/llvm/test/tools/llvm-symbolizer/get-input-file.test
@@ -1,9 +1,9 @@
 # If binary input file is not specified, llvm-symbolizer assumes it is the 
first
 # item in the command.
 
-# No input items at all, complain about missing input file.
+# No input items at all. Report an unknown line, but do not produce any output 
on stderr.
 RUN: echo | llvm-symbolizer 2>%t.1.err | FileCheck %s --check-prefix=NOSOURCE
-RUN: FileCheck --input-file=%t.1.err --check-prefix=NOFILE %s
+RUN: FileCheck --input-file=%t.1.err --implicit-check-not={{.}} --allow-empty 
%s
 
 # Only one input item, complain about missing addresses.
 RUN: llvm-symbolizer "foo" 2>%t.2.err | FileCheck %s --check-prefix=NOSOURCE
@@ -32,8 +32,6 @@ RUN: FileCheck --input-file=%t.7.err --check-prefix=BAD-QUOTE 
%s
 NOSOURCE:  ??
 NOSOURCE-NEXT: ??:0:0
 
-NOFILE: error: no input filename has been specified
-
 NOADDR: error: 'foo': no module offset has been specified
 
 NOTFOUND:  error: 'foo': [[MSG]]

diff  --git a/llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp 
b/llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
index b98bdbc388faf..6d7953f3109a5 100644
--- a/llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
+++ b/llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
@@ -337,6 +337,14 @@ static void symbolizeInput(const opt::InputArgList ,
   object::BuildID BuildID(IncomingBuildID.begin(), IncomingBuildID.end());
   uint64_t Offset = 0;
   StringRef Symbol;
+
+  // An empty input string may be used to check if the process is alive and
+  // responding to input. Do not emit a message on stderr in this case but
+  // respond on stdout.
+  if (InputString.empty()) {
+printUnknownLineInfo(ModuleName, Printer);
+return;
+  }
   if (Error E = parseCommand(Args.getLastArgValue(OPT_obj_EQ), IsAddr2Line,
  StringRef(InputString), Cmd, ModuleName, BuildID,
  Symbol, Offset)) {



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-04 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/97605

>From 84e7eb3c36b99ac7f673d9a7ad0c88c469f7f45d Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 1 Jul 2024 20:13:54 +0300
Subject: [PATCH 1/2] [AArch64][PAC] Support BLRA* instructions in SLS
 Hardening pass

Make SLS Hardening pass handle BLRA* instructions the same way it
handles BLR. The thunk names have the form

__llvm_slsblr_thunk_xNfor BLR thunks
__llvm_slsblr_thunk_(aaz|abz)_xN  for BLRAAZ and BLRABZ thunks
__llvm_slsblr_thunk_(aa|ab)_xN_xM for BLRAA and BLRAB thunks

Now there are about 1800 possible thunk names, so do not rely on linear
thunk function's name lookup and parse the name instead.
---
 .../Target/AArch64/AArch64SLSHardening.cpp| 326 --
 .../speculation-hardening-sls-blra.mir| 210 +++
 2 files changed, 432 insertions(+), 104 deletions(-)
 create mode 100644 llvm/test/CodeGen/AArch64/speculation-hardening-sls-blra.mir

diff --git a/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp 
b/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
index 00ba31b3e500d..4ae2e48af337f 100644
--- a/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
+++ b/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
@@ -13,6 +13,7 @@
 
 #include "AArch64InstrInfo.h"
 #include "AArch64Subtarget.h"
+#include "llvm/ADT/StringSwitch.h"
 #include "llvm/CodeGen/IndirectThunks.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/CodeGen/MachineFunction.h"
@@ -23,6 +24,7 @@
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Target/TargetMachine.h"
 #include 
 
@@ -32,17 +34,103 @@ using namespace llvm;
 
 #define AARCH64_SLS_HARDENING_NAME "AArch64 sls hardening pass"
 
-static const char SLSBLRNamePrefix[] = "__llvm_slsblr_thunk_";
+// Common name prefix of all thunks generated by this pass.
+//
+// The generic form is
+// __llvm_slsblr_thunk_xNfor BLR thunks
+// __llvm_slsblr_thunk_(aaz|abz)_xN  for BLRAAZ and BLRABZ thunks
+// __llvm_slsblr_thunk_(aa|ab)_xN_xM for BLRAA and BLRAB thunks
+static constexpr StringRef CommonNamePrefix = "__llvm_slsblr_thunk_";
 
 namespace {
 
-// Set of inserted thunks: bitmask with bits corresponding to
-// indexes in SLSBLRThunks array.
-typedef uint32_t ThunksSet;
+struct ThunkKind {
+  enum ThunkKindId {
+ThunkBR,
+ThunkBRAA,
+ThunkBRAB,
+ThunkBRAAZ,
+ThunkBRABZ,
+  };
+
+  ThunkKindId Id;
+  StringRef NameInfix;
+  bool HasXmOperand;
+  bool NeedsPAuth;
+
+  // Opcode to perform indirect jump from inside the thunk.
+  unsigned BROpcode;
+
+  static const ThunkKind BR;
+  static const ThunkKind BRAA;
+  static const ThunkKind BRAB;
+  static const ThunkKind BRAAZ;
+  static const ThunkKind BRABZ;
+};
+
+// Set of inserted thunks.
+class ThunksSet {
+public:
+  static constexpr unsigned NumXRegisters = 32;
+
+  // Given Xn register, returns n.
+  static unsigned indexOfXReg(Register Xn);
+  // Given n, returns Xn register.
+  static Register xRegByIndex(unsigned N);
+
+  ThunksSet |=(const ThunksSet ) {
+BLRThunks |= Other.BLRThunks;
+BLRAAZThunks |= Other.BLRAAZThunks;
+BLRABZThunks |= Other.BLRABZThunks;
+for (unsigned I = 0; I < NumXRegisters; ++I)
+  BLRAAThunks[I] |= Other.BLRAAThunks[I];
+for (unsigned I = 0; I < NumXRegisters; ++I)
+  BLRABThunks[I] |= Other.BLRABThunks[I];
+
+return *this;
+  }
+
+  bool get(ThunkKind::ThunkKindId Kind, Register Xn, Register Xm) {
+uint32_t XnBit = 1u << indexOfXReg(Xn);
+return getBitmask(Kind, Xm) & XnBit;
+  }
+
+  void set(ThunkKind::ThunkKindId Kind, Register Xn, Register Xm) {
+uint32_t XnBit = 1u << indexOfXReg(Xn);
+getBitmask(Kind, Xm) |= XnBit;
+  }
+
+private:
+  // Bitmasks representing operands used, with n-th bit corresponding to Xn
+  // register operand. If the instruction has a second operand (Xm), an array
+  // of bitmasks is used, indexed by m.
+  // Indexes corresponding to the forbidden x16, x17 and x30 registers are
+  // always unset, for simplicity there are no holes.
+  uint32_t BLRThunks = 0;
+  uint32_t BLRAAZThunks = 0;
+  uint32_t BLRABZThunks = 0;
+  uint32_t BLRAAThunks[NumXRegisters] = {};
+  uint32_t BLRABThunks[NumXRegisters] = {};
+
+  uint32_t (ThunkKind::ThunkKindId Kind, Register Xm) {
+switch (Kind) {
+case ThunkKind::ThunkBR:
+  return BLRThunks;
+case ThunkKind::ThunkBRAAZ:
+  return BLRAAZThunks;
+case ThunkKind::ThunkBRABZ:
+  return BLRABZThunks;
+case ThunkKind::ThunkBRAA:
+  return BLRAAThunks[indexOfXReg(Xm)];
+case ThunkKind::ThunkBRAB:
+  return BLRABThunks[indexOfXReg(Xm)];
+}
+  }
+};
 
 struct SLSHardeningInserter : ThunkInserter {
 public:
-  const char *getThunkPrefix() { return SLSBLRNamePrefix; }
+  const char *getThunkPrefix() { return CommonNamePrefix.data(); }
   bool mayUseThunk(const 

[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-04 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/97605

>From 84e7eb3c36b99ac7f673d9a7ad0c88c469f7f45d Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 1 Jul 2024 20:13:54 +0300
Subject: [PATCH] [AArch64][PAC] Support BLRA* instructions in SLS Hardening
 pass

Make SLS Hardening pass handle BLRA* instructions the same way it
handles BLR. The thunk names have the form

__llvm_slsblr_thunk_xNfor BLR thunks
__llvm_slsblr_thunk_(aaz|abz)_xN  for BLRAAZ and BLRABZ thunks
__llvm_slsblr_thunk_(aa|ab)_xN_xM for BLRAA and BLRAB thunks

Now there are about 1800 possible thunk names, so do not rely on linear
thunk function's name lookup and parse the name instead.
---
 .../Target/AArch64/AArch64SLSHardening.cpp| 326 --
 .../speculation-hardening-sls-blra.mir| 210 +++
 2 files changed, 432 insertions(+), 104 deletions(-)
 create mode 100644 llvm/test/CodeGen/AArch64/speculation-hardening-sls-blra.mir

diff --git a/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp 
b/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
index 00ba31b3e500d..4ae2e48af337f 100644
--- a/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
+++ b/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
@@ -13,6 +13,7 @@
 
 #include "AArch64InstrInfo.h"
 #include "AArch64Subtarget.h"
+#include "llvm/ADT/StringSwitch.h"
 #include "llvm/CodeGen/IndirectThunks.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/CodeGen/MachineFunction.h"
@@ -23,6 +24,7 @@
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Target/TargetMachine.h"
 #include 
 
@@ -32,17 +34,103 @@ using namespace llvm;
 
 #define AARCH64_SLS_HARDENING_NAME "AArch64 sls hardening pass"
 
-static const char SLSBLRNamePrefix[] = "__llvm_slsblr_thunk_";
+// Common name prefix of all thunks generated by this pass.
+//
+// The generic form is
+// __llvm_slsblr_thunk_xNfor BLR thunks
+// __llvm_slsblr_thunk_(aaz|abz)_xN  for BLRAAZ and BLRABZ thunks
+// __llvm_slsblr_thunk_(aa|ab)_xN_xM for BLRAA and BLRAB thunks
+static constexpr StringRef CommonNamePrefix = "__llvm_slsblr_thunk_";
 
 namespace {
 
-// Set of inserted thunks: bitmask with bits corresponding to
-// indexes in SLSBLRThunks array.
-typedef uint32_t ThunksSet;
+struct ThunkKind {
+  enum ThunkKindId {
+ThunkBR,
+ThunkBRAA,
+ThunkBRAB,
+ThunkBRAAZ,
+ThunkBRABZ,
+  };
+
+  ThunkKindId Id;
+  StringRef NameInfix;
+  bool HasXmOperand;
+  bool NeedsPAuth;
+
+  // Opcode to perform indirect jump from inside the thunk.
+  unsigned BROpcode;
+
+  static const ThunkKind BR;
+  static const ThunkKind BRAA;
+  static const ThunkKind BRAB;
+  static const ThunkKind BRAAZ;
+  static const ThunkKind BRABZ;
+};
+
+// Set of inserted thunks.
+class ThunksSet {
+public:
+  static constexpr unsigned NumXRegisters = 32;
+
+  // Given Xn register, returns n.
+  static unsigned indexOfXReg(Register Xn);
+  // Given n, returns Xn register.
+  static Register xRegByIndex(unsigned N);
+
+  ThunksSet |=(const ThunksSet ) {
+BLRThunks |= Other.BLRThunks;
+BLRAAZThunks |= Other.BLRAAZThunks;
+BLRABZThunks |= Other.BLRABZThunks;
+for (unsigned I = 0; I < NumXRegisters; ++I)
+  BLRAAThunks[I] |= Other.BLRAAThunks[I];
+for (unsigned I = 0; I < NumXRegisters; ++I)
+  BLRABThunks[I] |= Other.BLRABThunks[I];
+
+return *this;
+  }
+
+  bool get(ThunkKind::ThunkKindId Kind, Register Xn, Register Xm) {
+uint32_t XnBit = 1u << indexOfXReg(Xn);
+return getBitmask(Kind, Xm) & XnBit;
+  }
+
+  void set(ThunkKind::ThunkKindId Kind, Register Xn, Register Xm) {
+uint32_t XnBit = 1u << indexOfXReg(Xn);
+getBitmask(Kind, Xm) |= XnBit;
+  }
+
+private:
+  // Bitmasks representing operands used, with n-th bit corresponding to Xn
+  // register operand. If the instruction has a second operand (Xm), an array
+  // of bitmasks is used, indexed by m.
+  // Indexes corresponding to the forbidden x16, x17 and x30 registers are
+  // always unset, for simplicity there are no holes.
+  uint32_t BLRThunks = 0;
+  uint32_t BLRAAZThunks = 0;
+  uint32_t BLRABZThunks = 0;
+  uint32_t BLRAAThunks[NumXRegisters] = {};
+  uint32_t BLRABThunks[NumXRegisters] = {};
+
+  uint32_t (ThunkKind::ThunkKindId Kind, Register Xm) {
+switch (Kind) {
+case ThunkKind::ThunkBR:
+  return BLRThunks;
+case ThunkKind::ThunkBRAAZ:
+  return BLRAAZThunks;
+case ThunkKind::ThunkBRABZ:
+  return BLRABZThunks;
+case ThunkKind::ThunkBRAA:
+  return BLRAAThunks[indexOfXReg(Xm)];
+case ThunkKind::ThunkBRAB:
+  return BLRABThunks[indexOfXReg(Xm)];
+}
+  }
+};
 
 struct SLSHardeningInserter : ThunkInserter {
 public:
-  const char *getThunkPrefix() { return SLSBLRNamePrefix; }
+  const char *getThunkPrefix() { return CommonNamePrefix.data(); }
   bool mayUseThunk(const MachineFunction ) 

[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-04 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/97605

>From f49c32c8465e9e68d7345fa82ae1294cc2faf0e7 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 1 Jul 2024 20:13:54 +0300
Subject: [PATCH] [AArch64][PAC] Support BLRA* instructions in SLS Hardening
 pass

Make SLS Hardening pass handle BLRA* instructions the same way it
handles BLR. The thunk names have the form

__llvm_slsblr_thunk_xNfor BLR thunks
__llvm_slsblr_thunk_(aaz|abz)_xN  for BLRAAZ and BLRABZ thunks
__llvm_slsblr_thunk_(aa|ab)_xN_xM for BLRAA and BLRAB thunks

Now there are about 1800 possible thunk names, so do not rely on linear
thunk function's name lookup and parse the name instead.
---
 .../Target/AArch64/AArch64SLSHardening.cpp| 326 --
 .../speculation-hardening-sls-blra.mir| 210 +++
 2 files changed, 432 insertions(+), 104 deletions(-)
 create mode 100644 llvm/test/CodeGen/AArch64/speculation-hardening-sls-blra.mir

diff --git a/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp 
b/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
index 24f023a3d70e7..4ae2e48af337f 100644
--- a/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
+++ b/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
@@ -13,6 +13,7 @@
 
 #include "AArch64InstrInfo.h"
 #include "AArch64Subtarget.h"
+#include "llvm/ADT/StringSwitch.h"
 #include "llvm/CodeGen/IndirectThunks.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/CodeGen/MachineFunction.h"
@@ -23,6 +24,7 @@
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Target/TargetMachine.h"
 #include 
 
@@ -32,17 +34,103 @@ using namespace llvm;
 
 #define AARCH64_SLS_HARDENING_NAME "AArch64 sls hardening pass"
 
-static const char SLSBLRNamePrefix[] = "__llvm_slsblr_thunk_";
+// Common name prefix of all thunks generated by this pass.
+//
+// The generic form is
+// __llvm_slsblr_thunk_xNfor BLR thunks
+// __llvm_slsblr_thunk_(aaz|abz)_xN  for BLRAAZ and BLRABZ thunks
+// __llvm_slsblr_thunk_(aa|ab)_xN_xM for BLRAA and BLRAB thunks
+static constexpr StringRef CommonNamePrefix = "__llvm_slsblr_thunk_";
 
 namespace {
 
-// Set of inserted thunks: bitmask with bits corresponding to
-// indexes in SLSBLRThunks array.
-typedef uint32_t ThunksSet;
+struct ThunkKind {
+  enum ThunkKindId {
+ThunkBR,
+ThunkBRAA,
+ThunkBRAB,
+ThunkBRAAZ,
+ThunkBRABZ,
+  };
+
+  ThunkKindId Id;
+  StringRef NameInfix;
+  bool HasXmOperand;
+  bool NeedsPAuth;
+
+  // Opcode to perform indirect jump from inside the thunk.
+  unsigned BROpcode;
+
+  static const ThunkKind BR;
+  static const ThunkKind BRAA;
+  static const ThunkKind BRAB;
+  static const ThunkKind BRAAZ;
+  static const ThunkKind BRABZ;
+};
+
+// Set of inserted thunks.
+class ThunksSet {
+public:
+  static constexpr unsigned NumXRegisters = 32;
+
+  // Given Xn register, returns n.
+  static unsigned indexOfXReg(Register Xn);
+  // Given n, returns Xn register.
+  static Register xRegByIndex(unsigned N);
+
+  ThunksSet |=(const ThunksSet ) {
+BLRThunks |= Other.BLRThunks;
+BLRAAZThunks |= Other.BLRAAZThunks;
+BLRABZThunks |= Other.BLRABZThunks;
+for (unsigned I = 0; I < NumXRegisters; ++I)
+  BLRAAThunks[I] |= Other.BLRAAThunks[I];
+for (unsigned I = 0; I < NumXRegisters; ++I)
+  BLRABThunks[I] |= Other.BLRABThunks[I];
+
+return *this;
+  }
+
+  bool get(ThunkKind::ThunkKindId Kind, Register Xn, Register Xm) {
+uint32_t XnBit = 1u << indexOfXReg(Xn);
+return getBitmask(Kind, Xm) & XnBit;
+  }
+
+  void set(ThunkKind::ThunkKindId Kind, Register Xn, Register Xm) {
+uint32_t XnBit = 1u << indexOfXReg(Xn);
+getBitmask(Kind, Xm) |= XnBit;
+  }
+
+private:
+  // Bitmasks representing operands used, with n-th bit corresponding to Xn
+  // register operand. If the instruction has a second operand (Xm), an array
+  // of bitmasks is used, indexed by m.
+  // Indexes corresponding to the forbidden x16, x17 and x30 registers are
+  // always unset, for simplicity there are no holes.
+  uint32_t BLRThunks = 0;
+  uint32_t BLRAAZThunks = 0;
+  uint32_t BLRABZThunks = 0;
+  uint32_t BLRAAThunks[NumXRegisters] = {};
+  uint32_t BLRABThunks[NumXRegisters] = {};
+
+  uint32_t (ThunkKind::ThunkKindId Kind, Register Xm) {
+switch (Kind) {
+case ThunkKind::ThunkBR:
+  return BLRThunks;
+case ThunkKind::ThunkBRAAZ:
+  return BLRAAZThunks;
+case ThunkKind::ThunkBRABZ:
+  return BLRABZThunks;
+case ThunkKind::ThunkBRAA:
+  return BLRAAThunks[indexOfXReg(Xm)];
+case ThunkKind::ThunkBRAB:
+  return BLRABThunks[indexOfXReg(Xm)];
+}
+  }
+};
 
 struct SLSHardeningInserter : ThunkInserter {
 public:
-  const char *getThunkPrefix() { return SLSBLRNamePrefix; }
+  const char *getThunkPrefix() { return CommonNamePrefix.data(); }
   bool mayUseThunk(const MachineFunction ) 

[llvm-branch-commits] [flang] [mlir] [Flang][OpenMP] Add lowering support for DO SIMD (PR #97718)

2024-07-04 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-flang-openmp

@llvm/pr-subscribers-mlir-openmp

Author: Sergio Afonso (skatrak)


Changes

This patch adds support for lowering 'DO SIMD' constructs to MLIR. SIMD 
information is now stored in an `omp.simd` loop wrapper, which is currently 
ignored by the OpenMP dialect to LLVM IR translation stage.

The end result is that runtime behavior of compiled 'DO SIMD' constructs does 
not change after this patch, so 'DO SIMD' still runs like 'DO' (i.e. SIMD width 
= 1). However, all of the required information is now present in the resulting 
MLIR representation.

To avoid confusion, the previous wsloop-simd.f90 lit test is renamed to 
wsloop-schedule.f90 and a new wsloop-simd.f90 test is created to check the 
addition of SIMD clauses to the `omp.simd` operation produced when a 'DO SIMD' 
construct is lowered to MLIR.

---
Full diff: https://github.com/llvm/llvm-project/pull/97718.diff


10 Files Affected:

- (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+36-13) 
- (removed) flang/test/Lower/OpenMP/Todo/omp-do-simd-aligned.f90 (-16) 
- (modified) flang/test/Lower/OpenMP/Todo/omp-do-simd-linear.f90 (+1-1) 
- (removed) flang/test/Lower/OpenMP/Todo/omp-do-simd-safelen.f90 (-14) 
- (removed) flang/test/Lower/OpenMP/Todo/omp-do-simd-simdlen.f90 (-14) 
- (modified) flang/test/Lower/OpenMP/if-clause.f90 (+31) 
- (modified) flang/test/Lower/OpenMP/loop-compound.f90 (+3) 
- (added) flang/test/Lower/OpenMP/wsloop-schedule.f90 (+37) 
- (modified) flang/test/Lower/OpenMP/wsloop-simd.f90 (+42-32) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+3) 


``diff
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp 
b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 1830e31349cfb..9c23888a87173 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2060,19 +2060,42 @@ static void genCompositeDoSimd(lower::AbstractConverter 
,
const ConstructQueue ,
ConstructQueue::iterator item,
DataSharingProcessor ) {
-  ClauseProcessor cp(converter, semaCtx, item->clauses);
-  cp.processTODO(loc,
-   llvm::omp::OMPD_do_simd);
-  // TODO: Add support for vectorization - add vectorization hints inside loop
-  // body.
-  // OpenMP standard does not specify the length of vector instructions.
-  // Currently we safely assume that for !$omp do simd pragma the SIMD length
-  // is equal to 1 (i.e. we generate standard workshare loop).
-  // When support for vectorization is enabled, then we need to add handling of
-  // if clause. Currently if clause can be skipped because we always assume
-  // SIMD length = 1.
-  genStandaloneDo(converter, symTable, semaCtx, eval, loc, queue, item, dsp);
+  lower::StatementContext stmtCtx;
+
+  // Clause processing.
+  mlir::omp::WsloopClauseOps wsloopClauseOps;
+  llvm::SmallVector wsloopReductionSyms;
+  llvm::SmallVector wsloopReductionTypes;
+  genWsloopClauses(converter, semaCtx, stmtCtx, item->clauses, loc,
+   wsloopClauseOps, wsloopReductionTypes, wsloopReductionSyms);
+
+  mlir::omp::SimdClauseOps simdClauseOps;
+  genSimdClauses(converter, semaCtx, item->clauses, loc, simdClauseOps);
+
+  mlir::omp::LoopNestClauseOps loopNestClauseOps;
+  llvm::SmallVector iv;
+  genLoopNestClauses(converter, semaCtx, eval, item->clauses, loc,
+ loopNestClauseOps, iv);
+
+  // Operation creation.
+  auto wsloopOp =
+  genWsloopWrapperOp(converter, semaCtx, eval, loc, wsloopClauseOps,
+ wsloopReductionSyms, wsloopReductionTypes);
+
+  auto simdOp = genSimdWrapperOp(converter, semaCtx, eval, loc, simdClauseOps);
+
+  // Construct wrapper entry block list and associated symbols. It is important
+  // that the symbol and block argument order match, so that the symbol-value
+  // bindings created are correct.
+  // TODO: Add omp.wsloop private and omp.simd private and reduction args.
+  auto wrapperArgs = llvm::to_vector(llvm::concat(
+  wsloopOp.getRegion().getArguments(), simdOp.getRegion().getArguments()));
+
+  assert(wsloopReductionSyms.size() == wrapperArgs.size() &&
+ "Number of symbols and wrapper block arguments must match");
+  genLoopNestOp(converter, symTable, semaCtx, eval, loc, queue, item,
+loopNestClauseOps, iv, wsloopReductionSyms, wrapperArgs,
+llvm::omp::Directive::OMPD_do_simd, dsp);
 }
 
 static void genCompositeTaskloopSimd(
diff --git a/flang/test/Lower/OpenMP/Todo/omp-do-simd-aligned.f90 
b/flang/test/Lower/OpenMP/Todo/omp-do-simd-aligned.f90
deleted file mode 100644
index b62c54182442a..0
--- a/flang/test/Lower/OpenMP/Todo/omp-do-simd-aligned.f90
+++ /dev/null
@@ -1,16 +0,0 @@
-! This test checks lowering of OpenMP do simd aligned() pragma
-
-! RUN: %not_todo_cmd bbc -emit-fir -fopenmp -o - %s 2>&1 | FileCheck %s
-! RUN: %not_todo_cmd 

[llvm-branch-commits] [flang] [mlir] [Flang][OpenMP] Add lowering support for DO SIMD (PR #97718)

2024-07-04 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak created 
https://github.com/llvm/llvm-project/pull/97718

This patch adds support for lowering 'DO SIMD' constructs to MLIR. SIMD 
information is now stored in an `omp.simd` loop wrapper, which is currently 
ignored by the OpenMP dialect to LLVM IR translation stage.

The end result is that runtime behavior of compiled 'DO SIMD' constructs does 
not change after this patch, so 'DO SIMD' still runs like 'DO' (i.e. SIMD width 
= 1). However, all of the required information is now present in the resulting 
MLIR representation.

To avoid confusion, the previous wsloop-simd.f90 lit test is renamed to 
wsloop-schedule.f90 and a new wsloop-simd.f90 test is created to check the 
addition of SIMD clauses to the `omp.simd` operation produced when a 'DO SIMD' 
construct is lowered to MLIR.

>From cca42d31ed43dee5521605f50f3fed380d034f70 Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Thu, 4 Jul 2024 12:56:43 +0100
Subject: [PATCH] [Flang][OpenMP] Add lowering support for DO SIMD

This patch adds support for lowering 'DO SIMD' constructs to MLIR. SIMD
information is now stored in an `omp.simd` loop wrapper, which is currently
ignored by the OpenMP dialect to LLVM IR translation stage.

The end result is that runtime behavior of compiled 'DO SIMD' constructs does
not change after this patch, so 'DO SIMD' still runs like 'DO' (i.e. SIMD width
= 1). However, all of the required information is now present in the resulting
MLIR representation.

To avoid confusion, the previous wsloop-simd.f90 lit test is renamed to
wsloop-schedule.f90 and a new wsloop-simd.f90 test is created to check the
addition of SIMD clauses to the `omp.simd` operation produced when a 'DO SIMD'
construct is lowered to MLIR.
---
 flang/lib/Lower/OpenMP/OpenMP.cpp | 49 
 .../Lower/OpenMP/Todo/omp-do-simd-aligned.f90 | 16 
 .../Lower/OpenMP/Todo/omp-do-simd-linear.f90  |  2 +-
 .../Lower/OpenMP/Todo/omp-do-simd-safelen.f90 | 14 
 .../Lower/OpenMP/Todo/omp-do-simd-simdlen.f90 | 14 
 flang/test/Lower/OpenMP/if-clause.f90 | 31 
 flang/test/Lower/OpenMP/loop-compound.f90 |  3 +
 flang/test/Lower/OpenMP/wsloop-schedule.f90   | 37 ++
 flang/test/Lower/OpenMP/wsloop-simd.f90   | 74 +++
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  |  3 +
 10 files changed, 153 insertions(+), 90 deletions(-)
 delete mode 100644 flang/test/Lower/OpenMP/Todo/omp-do-simd-aligned.f90
 delete mode 100644 flang/test/Lower/OpenMP/Todo/omp-do-simd-safelen.f90
 delete mode 100644 flang/test/Lower/OpenMP/Todo/omp-do-simd-simdlen.f90
 create mode 100644 flang/test/Lower/OpenMP/wsloop-schedule.f90

diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp 
b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 1830e31349cfb..9c23888a87173 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2060,19 +2060,42 @@ static void genCompositeDoSimd(lower::AbstractConverter 
,
const ConstructQueue ,
ConstructQueue::iterator item,
DataSharingProcessor ) {
-  ClauseProcessor cp(converter, semaCtx, item->clauses);
-  cp.processTODO(loc,
-   llvm::omp::OMPD_do_simd);
-  // TODO: Add support for vectorization - add vectorization hints inside loop
-  // body.
-  // OpenMP standard does not specify the length of vector instructions.
-  // Currently we safely assume that for !$omp do simd pragma the SIMD length
-  // is equal to 1 (i.e. we generate standard workshare loop).
-  // When support for vectorization is enabled, then we need to add handling of
-  // if clause. Currently if clause can be skipped because we always assume
-  // SIMD length = 1.
-  genStandaloneDo(converter, symTable, semaCtx, eval, loc, queue, item, dsp);
+  lower::StatementContext stmtCtx;
+
+  // Clause processing.
+  mlir::omp::WsloopClauseOps wsloopClauseOps;
+  llvm::SmallVector wsloopReductionSyms;
+  llvm::SmallVector wsloopReductionTypes;
+  genWsloopClauses(converter, semaCtx, stmtCtx, item->clauses, loc,
+   wsloopClauseOps, wsloopReductionTypes, wsloopReductionSyms);
+
+  mlir::omp::SimdClauseOps simdClauseOps;
+  genSimdClauses(converter, semaCtx, item->clauses, loc, simdClauseOps);
+
+  mlir::omp::LoopNestClauseOps loopNestClauseOps;
+  llvm::SmallVector iv;
+  genLoopNestClauses(converter, semaCtx, eval, item->clauses, loc,
+ loopNestClauseOps, iv);
+
+  // Operation creation.
+  auto wsloopOp =
+  genWsloopWrapperOp(converter, semaCtx, eval, loc, wsloopClauseOps,
+ wsloopReductionSyms, wsloopReductionTypes);
+
+  auto simdOp = genSimdWrapperOp(converter, semaCtx, eval, loc, simdClauseOps);
+
+  // Construct wrapper entry block list and associated symbols. It is important
+  // that the symbol and block argument order match, so that the symbol-value
+  // bindings created are correct.
+ 

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-07-04 Thread Christudasan Devadasan via llvm-branch-commits


@@ -183,10 +183,10 @@ define <2 x half> @local_atomic_fadd_v2f16_rtn(ptr 
addrspace(3) %ptr, <2 x half>
 define amdgpu_kernel void @local_atomic_fadd_v2bf16_noret(ptr addrspace(3) 
%ptr, <2 x i16> %data) {
 ; GFX940-LABEL: local_atomic_fadd_v2bf16_noret:
 ; GFX940:   ; %bb.0:
-; GFX940-NEXT:s_load_dwordx2 s[0:1], s[0:1], 0x24
+; GFX940-NEXT:s_load_dwordx2 s[2:3], s[0:1], 0x24

cdevadas wrote:

Opened https://github.com/llvm/llvm-project/issues/97715.

https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] NFC: Share DataSharingProcessor creation logic for all loop directives (PR #97565)

2024-07-04 Thread Michael Klemm via llvm-branch-commits

https://github.com/mjklemm approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/97565
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] NFC: Remove unused argument for omp.target lowering (PR #97564)

2024-07-04 Thread Michael Klemm via llvm-branch-commits

https://github.com/mjklemm approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/97564
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Prevent allocas from being inserted into loop wrappers (PR #97563)

2024-07-04 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah approved this pull request.

Code changes look good.

I would prefer to see a lit test for this code path. It is good enough if this 
is used by a test added later in your PR stack. Otherwise, please could you add 
a lit test which hits this condition so that we can catch if this gets broken 
by any changes in the future.

https://github.com/llvm/llvm-project/pull/97563
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-07-04 Thread Matt Arsenault via llvm-branch-commits


@@ -183,10 +183,10 @@ define <2 x half> @local_atomic_fadd_v2f16_rtn(ptr 
addrspace(3) %ptr, <2 x half>
 define amdgpu_kernel void @local_atomic_fadd_v2bf16_noret(ptr addrspace(3) 
%ptr, <2 x i16> %data) {
 ; GFX940-LABEL: local_atomic_fadd_v2bf16_noret:
 ; GFX940:   ; %bb.0:
-; GFX940-NEXT:s_load_dwordx2 s[0:1], s[0:1], 0x24
+; GFX940-NEXT:s_load_dwordx2 s[2:3], s[0:1], 0x24

arsenm wrote:

Can you open an issue for this 

https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Refactor loop-related lowering for composite support (PR #97566)

2024-07-04 Thread Tom Eccles via llvm-branch-commits


@@ -1492,7 +1466,7 @@ genParallelOp(lower::AbstractConverter , 
lower::SymMap ,
 firOpBuilder.createBlock(, /*insertPt=*/{}, allRegionArgTypes,
  allRegionArgLocs);
 
-llvm::SmallVector allSymbols = reductionSyms;
+llvm::SmallVector allSymbols(reductionSyms);

tblah wrote:

ultra-nit, feel free to ignore

Personally I don't like using `()` here because it can parse (to humans and 
computers) like a function declaration. 
```suggestion
llvm::SmallVector allSymbols{reductionSyms};
```

https://github.com/llvm/llvm-project/pull/97566
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Refactor loop-related lowering for composite support (PR #97566)

2024-07-04 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah commented:

This looks good overall. Do you expect there to be a lot more wrapper 
operations in the near future? If so, they look like there is a common pattern 
that could be further abstracted. Something like

```c++
template 
static OP
genWrapperOp(..., llvm::ArrayRef extraBlockArgTypes) {
  fir::FirOpBuilder  = ...;

  auto op = firOpBuilder.create(loc, clauseOps);

  llvm::SmallVector blockArgLocs(extraBlockArgTypes.size(), 
loc);
  firOpBuilder.createBlock(...)
  firOpBuilder.setInsertionPoint(...)

  return op;
}
```

https://github.com/llvm/llvm-project/pull/97566
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Refactor loop-related lowering for composite support (PR #97566)

2024-07-04 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/97566
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Refactor loop-related lowering for composite support (PR #97566)

2024-07-04 Thread Tom Eccles via llvm-branch-commits


@@ -518,8 +518,8 @@ struct OpWithBodyGenInfo {
   }
 
   OpWithBodyGenInfo &
-  setReductions(llvm::SmallVectorImpl *value1,
-llvm::SmallVectorImpl *value2) {
+  setReductions(llvm::ArrayRef *value1,
+llvm::ArrayRef *value2) {

tblah wrote:

Can we pass these by value instead of as pointers now?

https://github.com/llvm/llvm-project/pull/97566
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-04 Thread Pengcheng Wang via llvm-branch-commits

wangpc-pp wrote:

https://github.com/llvm/llvm-project/pull/97708 is splitted out for adding 
`FeaturePredictableSelectIsExpensive`.

https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] NFC: Share DataSharingProcessor creation logic for all loop directives (PR #97565)

2024-07-04 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah approved this pull request.

LGTM, thanks!

https://github.com/llvm/llvm-project/pull/97565
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] AMDGPU: Add subtarget feature for memory atomic fadd f64 (PR #96444)

2024-07-04 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/96444

>From 4590c05051bbf57f0b269b2561aa1a7c74f06fbc Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Sun, 23 Jun 2024 17:07:53 +0200
Subject: [PATCH] AMDGPU: Add subtarget feature for memory atomic fadd f64

---
 llvm/lib/Target/AMDGPU/AMDGPU.td   | 21 ++---
 llvm/lib/Target/AMDGPU/BUFInstructions.td  | 10 ++
 llvm/lib/Target/AMDGPU/FLATInstructions.td |  6 +++---
 llvm/lib/Target/AMDGPU/GCNSubtarget.h  | 10 +++---
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp  |  2 +-
 5 files changed, 31 insertions(+), 18 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index bea233bfb27bd..94e8e77b3c052 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -788,6 +788,13 @@ def FeatureFlatAtomicFaddF32Inst
   "Has flat_atomic_add_f32 instruction"
 >;
 
+def FeatureFlatBufferGlobalAtomicFaddF64Inst
+  : SubtargetFeature<"flat-buffer-global-fadd-f64-inst",
+  "HasFlatBufferGlobalAtomicFaddF64Inst",
+  "true",
+  "Has flat, buffer, and global instructions for f64 atomic fadd"
+>;
+
 def FeatureMemoryAtomicFAddF32DenormalSupport
   : SubtargetFeature<"memory-atomic-fadd-f32-denormal-support",
   "HasMemoryAtomicFaddF32DenormalSupport",
@@ -1390,7 +1397,8 @@ def FeatureISAVersion9_0_A : FeatureSet<
  FeatureBackOffBarrier,
  FeatureKernargPreload,
  FeatureAtomicFMinFMaxF64GlobalInsts,
- FeatureAtomicFMinFMaxF64FlatInsts
+ FeatureAtomicFMinFMaxF64FlatInsts,
+ FeatureFlatBufferGlobalAtomicFaddF64Inst
  ])>;
 
 def FeatureISAVersion9_0_C : FeatureSet<
@@ -1435,7 +1443,8 @@ def FeatureISAVersion9_4_Common : FeatureSet<
FeatureAtomicFMinFMaxF64GlobalInsts,
FeatureAtomicFMinFMaxF64FlatInsts,
FeatureAgentScopeFineGrainedRemoteMemoryAtomics,
-   FeatureMemoryAtomicFAddF32DenormalSupport
+   FeatureMemoryAtomicFAddF32DenormalSupport,
+   FeatureFlatBufferGlobalAtomicFaddF64Inst
]>;
 
 def FeatureISAVersion9_4_0 : FeatureSet<
@@ -1932,11 +1941,9 @@ def isGFX12Plus :
 def HasFlatAddressSpace : Predicate<"Subtarget->hasFlatAddressSpace()">,
   AssemblerPredicate<(all_of FeatureFlatAddressSpace)>;
 
-
-def HasBufferFlatGlobalAtomicsF64 : // FIXME: Rename to show it's only for fadd
-  Predicate<"Subtarget->hasBufferFlatGlobalAtomicsF64()">,
-  // FIXME: This is too coarse, and working around using pseudo's predicates 
on real instruction.
-  AssemblerPredicate<(any_of FeatureGFX90AInsts, FeatureGFX10Insts, 
FeatureSouthernIslands, FeatureSeaIslands)>;
+def HasFlatBufferGlobalAtomicFaddF64Inst :
+  Predicate<"Subtarget->hasFlatBufferGlobalAtomicFaddF64Inst()">,
+  AssemblerPredicate<(any_of FeatureFlatBufferGlobalAtomicFaddF64Inst)>;
 
 def HasAtomicFMinFMaxF32GlobalInsts :
   Predicate<"Subtarget->hasAtomicFMinFMaxF32GlobalInsts()">,
diff --git a/llvm/lib/Target/AMDGPU/BUFInstructions.td 
b/llvm/lib/Target/AMDGPU/BUFInstructions.td
index 3b8d94b744000..a904c8483dbf5 100644
--- a/llvm/lib/Target/AMDGPU/BUFInstructions.td
+++ b/llvm/lib/Target/AMDGPU/BUFInstructions.td
@@ -1312,14 +1312,16 @@ let SubtargetPredicate = isGFX90APlus in {
   }
 } // End SubtargetPredicate = isGFX90APlus
 
-let SubtargetPredicate = HasBufferFlatGlobalAtomicsF64 in {
+let SubtargetPredicate = HasFlatBufferGlobalAtomicFaddF64Inst in {
   defm BUFFER_ATOMIC_ADD_F64 : MUBUF_Pseudo_Atomics<"buffer_atomic_add_f64", 
VReg_64, f64>;
+} // End SubtargetPredicate = HasFlatBufferGlobalAtomicFaddF64Inst
 
+let SubtargetPredicate = HasAtomicFMinFMaxF64GlobalInsts in {
   // Note the names can be buffer_atomic_fmin_x2/buffer_atomic_fmax_x2
   // depending on some subtargets.
   defm BUFFER_ATOMIC_MIN_F64 : MUBUF_Pseudo_Atomics<"buffer_atomic_min_f64", 
VReg_64, f64>;
   defm BUFFER_ATOMIC_MAX_F64 : MUBUF_Pseudo_Atomics<"buffer_atomic_max_f64", 
VReg_64, f64>;
-} // End SubtargetPredicate = HasBufferFlatGlobalAtomicsF64
+}
 
 def BUFFER_INV : MUBUF_Invalidate<"buffer_inv"> {
   let SubtargetPredicate = isGFX940Plus;
@@ -1836,9 +1838,9 @@ let SubtargetPredicate = 
HasAtomicBufferGlobalPkAddF16Insts in {
   defm : SIBufferAtomicPat<"SIbuffer_atomic_fadd", v2f16, 
"BUFFER_ATOMIC_PK_ADD_F16", ["ret"]>;
 } // End SubtargetPredicate = HasAtomicBufferGlobalPkAddF16Insts
 
-let SubtargetPredicate = HasBufferFlatGlobalAtomicsF64 in {
+let SubtargetPredicate = HasFlatBufferGlobalAtomicFaddF64Inst in {
   defm : SIBufferAtomicPat<"SIbuffer_atomic_fadd", f64, 
"BUFFER_ATOMIC_ADD_F64">;
-} // End SubtargetPredicate = HasBufferFlatGlobalAtomicsF64
+} // End SubtargetPredicate = HasFlatBufferGlobalAtomicFaddF64Inst
 
 let SubtargetPredicate = HasAtomicFMinFMaxF64GlobalInsts in {
   defm : SIBufferAtomicPat<"SIbuffer_atomic_fmin", f64, 
"BUFFER_ATOMIC_MIN_F64">;
diff --git a/llvm/lib/Target/AMDGPU/FLATInstructions.td 
b/llvm/lib/Target/AMDGPU/FLATInstructions.td
index 4bf8f20269a15..16dc019ede810 100644
--- 

[llvm-branch-commits] [llvm] AMDGPU: Add subtarget feature for global atomic fadd denormal support (PR #96443)

2024-07-04 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/96443

>From b29740ba988e2286a7edc67b5a96c5dce0e600a6 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Sun, 23 Jun 2024 16:44:08 +0200
Subject: [PATCH 1/3] AMDGPU: Add subtarget feature for global atomic fadd
 denormal support

Not sure what the behavior for gfx90a is. The SPG says it always flushes.
The instruction documentation says it does not.
---
 llvm/lib/Target/AMDGPU/AMDGPU.td  | 14 --
 llvm/lib/Target/AMDGPU/GCNSubtarget.h |  7 +++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 3f35db8883716..51c077598df74 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -788,6 +788,13 @@ def FeatureFlatAtomicFaddF32Inst
   "Has flat_atomic_add_f32 instruction"
 >;
 
+def FeatureMemoryAtomicFaddF32DenormalSupport
+  : SubtargetFeature<"memory-atomic-fadd-f32-denormal-support",
+  "HasAtomicMemoryAtomicFaddF32DenormalSupport",
+  "true",
+  "global/flat/buffer atomic fadd for float supports denormal handling"
+>;
+
 def FeatureAgentScopeFineGrainedRemoteMemoryAtomics
   : SubtargetFeature<"agent-scope-fine-grained-remote-memory-atomics",
   "HasAgentScopeFineGrainedRemoteMemoryAtomics",
@@ -1427,7 +1434,8 @@ def FeatureISAVersion9_4_Common : FeatureSet<
FeatureKernargPreload,
FeatureAtomicFMinFMaxF64GlobalInsts,
FeatureAtomicFMinFMaxF64FlatInsts,
-   FeatureAgentScopeFineGrainedRemoteMemoryAtomics
+   FeatureAgentScopeFineGrainedRemoteMemoryAtomics,
+   FeatureMemoryAtomicFaddF32DenormalSupport
]>;
 
 def FeatureISAVersion9_4_0 : FeatureSet<
@@ -1631,7 +1639,9 @@ def FeatureISAVersion12 : FeatureSet<
FeatureScalarDwordx3Loads,
FeatureDPPSrc1SGPR,
FeatureMaxHardClauseLength32,
-   Feature1_5xVGPRs]>;
+   Feature1_5xVGPRs,
+   FeatureMemoryAtomicFaddF32DenormalSupport]>;
+   ]>;
 
 def FeatureISAVersion12_Generic: FeatureSet<
   !listconcat(FeatureISAVersion12.Features,
diff --git a/llvm/lib/Target/AMDGPU/GCNSubtarget.h 
b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
index 9e2a316a9ed28..db0b2b67a0388 100644
--- a/llvm/lib/Target/AMDGPU/GCNSubtarget.h
+++ b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
@@ -167,6 +167,7 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
   bool HasAtomicFlatPkAdd16Insts = false;
   bool HasAtomicFaddRtnInsts = false;
   bool HasAtomicFaddNoRtnInsts = false;
+  bool HasAtomicMemoryAtomicFaddF32DenormalSupport = false;
   bool HasAtomicBufferGlobalPkAddF16NoRtnInsts = false;
   bool HasAtomicBufferGlobalPkAddF16Insts = false;
   bool HasAtomicCSubNoRtnInsts = false;
@@ -872,6 +873,12 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
 
   bool hasFlatAtomicFaddF32Inst() const { return HasFlatAtomicFaddF32Inst; }
 
+  /// \return true if the target's flat, global, and buffer atomic fadd for
+  /// float supports denormal handling.
+  bool hasMemoryAtomicFaddF32DenormalSupport() const {
+return HasAtomicMemoryAtomicFaddF32DenormalSupport;
+  }
+
   /// \return true if atomic operations targeting fine-grained memory work
   /// correctly at device scope, in allocations in host or peer PCIe device
   /// memory.

>From 84819ef58eb12bd3606fbf250b516793dbf36add Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Mon, 24 Jun 2024 12:10:37 +0200
Subject: [PATCH 2/3] Add to gfx11.

RDNA 3 manual says "Floating-point addition handles NAN/INF/denorm"
thought I'm not sure I trust it.
---
 llvm/lib/Target/AMDGPU/AMDGPU.td | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 51c077598df74..370992eb81ff3 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -1547,7 +1547,8 @@ def FeatureISAVersion11_Common : FeatureSet<
FeatureFlatAtomicFaddF32Inst,
FeatureImageInsts,
FeaturePackedTID,
-   FeatureVcmpxPermlaneHazard]>;
+   FeatureVcmpxPermlaneHazard,
+   FeatureMemoryAtomicFaddF32DenormalSupport]>;
 
 // There are few workarounds that need to be
 // added to all targets. This pessimizes codegen
@@ -1640,7 +1641,7 @@ def FeatureISAVersion12 : FeatureSet<
FeatureDPPSrc1SGPR,
FeatureMaxHardClauseLength32,
Feature1_5xVGPRs,
-   FeatureMemoryAtomicFaddF32DenormalSupport]>;
+   FeatureMemoryAtomicFaddF32DenormalSupport
]>;
 
 def FeatureISAVersion12_Generic: FeatureSet<

>From 5ef29a5699514b2c3956d1f44b8a0393b7c87004 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Wed, 26 Jun 2024 11:30:51 +0200
Subject: [PATCH 3/3] Rename

---
 llvm/lib/Target/AMDGPU/AMDGPU.td  | 10 +-
 llvm/lib/Target/AMDGPU/GCNSubtarget.h |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 370992eb81ff3..bea233bfb27bd 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ 

[llvm-branch-commits] [llvm] AMDGPU: Add subtarget feature for global atomic fadd denormal support (PR #96443)

2024-07-04 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/96443

>From b633b82d58ecc55bb88eaefd87d5b3030799f2c0 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Sun, 23 Jun 2024 16:44:08 +0200
Subject: [PATCH 1/3] AMDGPU: Add subtarget feature for global atomic fadd
 denormal support

Not sure what the behavior for gfx90a is. The SPG says it always flushes.
The instruction documentation says it does not.
---
 llvm/lib/Target/AMDGPU/AMDGPU.td  | 14 --
 llvm/lib/Target/AMDGPU/GCNSubtarget.h |  7 +++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 3f35db8883716d..51c077598df749 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -788,6 +788,13 @@ def FeatureFlatAtomicFaddF32Inst
   "Has flat_atomic_add_f32 instruction"
 >;
 
+def FeatureMemoryAtomicFaddF32DenormalSupport
+  : SubtargetFeature<"memory-atomic-fadd-f32-denormal-support",
+  "HasAtomicMemoryAtomicFaddF32DenormalSupport",
+  "true",
+  "global/flat/buffer atomic fadd for float supports denormal handling"
+>;
+
 def FeatureAgentScopeFineGrainedRemoteMemoryAtomics
   : SubtargetFeature<"agent-scope-fine-grained-remote-memory-atomics",
   "HasAgentScopeFineGrainedRemoteMemoryAtomics",
@@ -1427,7 +1434,8 @@ def FeatureISAVersion9_4_Common : FeatureSet<
FeatureKernargPreload,
FeatureAtomicFMinFMaxF64GlobalInsts,
FeatureAtomicFMinFMaxF64FlatInsts,
-   FeatureAgentScopeFineGrainedRemoteMemoryAtomics
+   FeatureAgentScopeFineGrainedRemoteMemoryAtomics,
+   FeatureMemoryAtomicFaddF32DenormalSupport
]>;
 
 def FeatureISAVersion9_4_0 : FeatureSet<
@@ -1631,7 +1639,9 @@ def FeatureISAVersion12 : FeatureSet<
FeatureScalarDwordx3Loads,
FeatureDPPSrc1SGPR,
FeatureMaxHardClauseLength32,
-   Feature1_5xVGPRs]>;
+   Feature1_5xVGPRs,
+   FeatureMemoryAtomicFaddF32DenormalSupport]>;
+   ]>;
 
 def FeatureISAVersion12_Generic: FeatureSet<
   !listconcat(FeatureISAVersion12.Features,
diff --git a/llvm/lib/Target/AMDGPU/GCNSubtarget.h 
b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
index 9e2a316a9ed28d..db0b2b67a03884 100644
--- a/llvm/lib/Target/AMDGPU/GCNSubtarget.h
+++ b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
@@ -167,6 +167,7 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
   bool HasAtomicFlatPkAdd16Insts = false;
   bool HasAtomicFaddRtnInsts = false;
   bool HasAtomicFaddNoRtnInsts = false;
+  bool HasAtomicMemoryAtomicFaddF32DenormalSupport = false;
   bool HasAtomicBufferGlobalPkAddF16NoRtnInsts = false;
   bool HasAtomicBufferGlobalPkAddF16Insts = false;
   bool HasAtomicCSubNoRtnInsts = false;
@@ -872,6 +873,12 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
 
   bool hasFlatAtomicFaddF32Inst() const { return HasFlatAtomicFaddF32Inst; }
 
+  /// \return true if the target's flat, global, and buffer atomic fadd for
+  /// float supports denormal handling.
+  bool hasMemoryAtomicFaddF32DenormalSupport() const {
+return HasAtomicMemoryAtomicFaddF32DenormalSupport;
+  }
+
   /// \return true if atomic operations targeting fine-grained memory work
   /// correctly at device scope, in allocations in host or peer PCIe device
   /// memory.

>From 55ee6de3f2ac99d981ed57c3f933cc18e9a738a2 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Mon, 24 Jun 2024 12:10:37 +0200
Subject: [PATCH 2/3] Add to gfx11.

RDNA 3 manual says "Floating-point addition handles NAN/INF/denorm"
thought I'm not sure I trust it.
---
 llvm/lib/Target/AMDGPU/AMDGPU.td | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 51c077598df749..370992eb81ff33 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -1547,7 +1547,8 @@ def FeatureISAVersion11_Common : FeatureSet<
FeatureFlatAtomicFaddF32Inst,
FeatureImageInsts,
FeaturePackedTID,
-   FeatureVcmpxPermlaneHazard]>;
+   FeatureVcmpxPermlaneHazard,
+   FeatureMemoryAtomicFaddF32DenormalSupport]>;
 
 // There are few workarounds that need to be
 // added to all targets. This pessimizes codegen
@@ -1640,7 +1641,7 @@ def FeatureISAVersion12 : FeatureSet<
FeatureDPPSrc1SGPR,
FeatureMaxHardClauseLength32,
Feature1_5xVGPRs,
-   FeatureMemoryAtomicFaddF32DenormalSupport]>;
+   FeatureMemoryAtomicFaddF32DenormalSupport
]>;
 
 def FeatureISAVersion12_Generic: FeatureSet<

>From 573e7bcdedc1fd56575f580d0fafd9e1ad9dbc96 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Wed, 26 Jun 2024 11:30:51 +0200
Subject: [PATCH 3/3] Rename

---
 llvm/lib/Target/AMDGPU/AMDGPU.td  | 10 +-
 llvm/lib/Target/AMDGPU/GCNSubtarget.h |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 370992eb81ff33..bea233bfb27bd6 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ 

[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-04 Thread Pengcheng Wang via llvm-branch-commits

wangpc-pp wrote:

Ping.
I'd like to push this forward because we don't take branch probabilities into 
consideration now.
Example: https://godbolt.org/z/doGhYadKM
We should use branches instead of selects in this case and this patch (the 
enabling of SelectOpt) will optimize this.
`clang -O3 -march=rv64gc_zba_zbb_zbc_zbs_zicond -Xclang -target-feature -Xclang 
+enable-select-opt -Xclang -target-feature -Xclang 
+predictable-select-expensive`
```
New Select group with
%. = select i1 %cmp, i32 5, i32 13, !prof !9
Analyzing select group containing   %. = select i1 %cmp, i32 5, i32 13, !prof !9
Converted to branch because of highly predictable branch.
```
```asm
func0:  # @func0
li  a2, 5
mul a1, a0, a0
bge a2, a0, .LBB0_2
addwa0, a1, a2
ret
.LBB0_2:# %select.false
li  a2, 13
addwa0, a1, a2
ret
```

https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-04 Thread Pengcheng Wang via llvm-branch-commits

https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-04 Thread Pengcheng Wang via llvm-branch-commits

https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/80124

>From e3fb1fe7bdd4b7c24f9361c4d14dd1206fc8c067 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Sun, 18 Feb 2024 11:12:16 +0800
Subject: [PATCH 1/2] Move after addIRPasses

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fdf1c023fff87..7a26e1956424c 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -450,15 +450,15 @@ void RISCVPassConfig::addIRPasses() {
 if (EnableLoopDataPrefetch)
   addPass(createLoopDataPrefetchPass());
 
-if (EnableSelectOpt && getOptLevel() == CodeGenOptLevel::Aggressive)
-  addPass(createSelectOptimizePass());
-
 addPass(createRISCVGatherScatterLoweringPass());
 addPass(createInterleavedAccessPass());
 addPass(createRISCVCodeGenPreparePass());
   }
 
   TargetPassConfig::addIRPasses();
+
+  if (getOptLevel() == CodeGenOptLevel::Aggressive && EnableSelectOpt)
+addPass(createSelectOptimizePass());
 }
 
 bool RISCVPassConfig::addPreISel() {

>From 5d5398596dc30c47c67572ec20137fb3f9434940 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Wed, 21 Feb 2024 21:21:28 +0800
Subject: [PATCH 2/2] Fix test

Created using spr 1.3.4
---
 llvm/test/CodeGen/RISCV/O3-pipeline.ll | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll 
b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 62c1af52e6c20..8b52e3fe7b2f1 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -34,15 +34,6 @@
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Scalar Evolution Analysis
 ; CHECK-NEXT:   Loop Data Prefetch
-; CHECK-NEXT:   Post-Dominator Tree Construction
-; CHECK-NEXT:   Branch Probability Analysis
-; CHECK-NEXT:   Block Frequency Analysis
-; CHECK-NEXT:   Lazy Branch Probability Analysis
-; CHECK-NEXT:   Lazy Block Frequency Analysis
-; CHECK-NEXT:   Optimization Remark Emitter
-; CHECK-NEXT:   Optimize selects
-; CHECK-NEXT:   Dominator Tree Construction
-; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   RISC-V gather/scatter lowering
 ; CHECK-NEXT:   Interleaved Access Pass
 ; CHECK-NEXT:   RISC-V CodeGenPrepare
@@ -77,6 +68,15 @@
 ; CHECK-NEXT:   Expand reduction intrinsics
 ; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   TLS Variable Hoist
+; CHECK-NEXT:   Post-Dominator Tree Construction
+; CHECK-NEXT:   Branch Probability Analysis
+; CHECK-NEXT:   Block Frequency Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Optimization Remark Emitter
+; CHECK-NEXT:   Optimize selects
+; CHECK-NEXT:   Dominator Tree Construction
+; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   CodeGen Prepare
 ; CHECK-NEXT:   Dominator Tree Construction
 ; CHECK-NEXT:   Exception handling preparation

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-04 Thread Pengcheng Wang via llvm-branch-commits

https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/80124

>From e3fb1fe7bdd4b7c24f9361c4d14dd1206fc8c067 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Sun, 18 Feb 2024 11:12:16 +0800
Subject: [PATCH 1/2] Move after addIRPasses

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fdf1c023fff87..7a26e1956424c 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -450,15 +450,15 @@ void RISCVPassConfig::addIRPasses() {
 if (EnableLoopDataPrefetch)
   addPass(createLoopDataPrefetchPass());
 
-if (EnableSelectOpt && getOptLevel() == CodeGenOptLevel::Aggressive)
-  addPass(createSelectOptimizePass());
-
 addPass(createRISCVGatherScatterLoweringPass());
 addPass(createInterleavedAccessPass());
 addPass(createRISCVCodeGenPreparePass());
   }
 
   TargetPassConfig::addIRPasses();
+
+  if (getOptLevel() == CodeGenOptLevel::Aggressive && EnableSelectOpt)
+addPass(createSelectOptimizePass());
 }
 
 bool RISCVPassConfig::addPreISel() {

>From 5d5398596dc30c47c67572ec20137fb3f9434940 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Wed, 21 Feb 2024 21:21:28 +0800
Subject: [PATCH 2/2] Fix test

Created using spr 1.3.4
---
 llvm/test/CodeGen/RISCV/O3-pipeline.ll | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll 
b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 62c1af52e6c20..8b52e3fe7b2f1 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -34,15 +34,6 @@
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Scalar Evolution Analysis
 ; CHECK-NEXT:   Loop Data Prefetch
-; CHECK-NEXT:   Post-Dominator Tree Construction
-; CHECK-NEXT:   Branch Probability Analysis
-; CHECK-NEXT:   Block Frequency Analysis
-; CHECK-NEXT:   Lazy Branch Probability Analysis
-; CHECK-NEXT:   Lazy Block Frequency Analysis
-; CHECK-NEXT:   Optimization Remark Emitter
-; CHECK-NEXT:   Optimize selects
-; CHECK-NEXT:   Dominator Tree Construction
-; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   RISC-V gather/scatter lowering
 ; CHECK-NEXT:   Interleaved Access Pass
 ; CHECK-NEXT:   RISC-V CodeGenPrepare
@@ -77,6 +68,15 @@
 ; CHECK-NEXT:   Expand reduction intrinsics
 ; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   TLS Variable Hoist
+; CHECK-NEXT:   Post-Dominator Tree Construction
+; CHECK-NEXT:   Branch Probability Analysis
+; CHECK-NEXT:   Block Frequency Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Optimization Remark Emitter
+; CHECK-NEXT:   Optimize selects
+; CHECK-NEXT:   Dominator Tree Construction
+; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   CodeGen Prepare
 ; CHECK-NEXT:   Dominator Tree Construction
 ; CHECK-NEXT:   Exception handling preparation

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] [RISCV][lld] Support merging RISC-V Atomics ABI attributes (PR #97347)

2024-07-03 Thread Fangrui Song via llvm-branch-commits

MaskRay wrote:

> [RISCV][lld]  ...

I usually omit `[RISCV]` when the title already contains `RISC-V` or `RISCV`...

https://github.com/llvm/llvm-project/pull/97347
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] [RISCV][lld] Support merging RISC-V Atomics ABI attributes (PR #97347)

2024-07-03 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay approved this pull request.


https://github.com/llvm/llvm-project/pull/97347
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] [llvm] Reapply "[llvm][RISCV] Enable trailing fences for seq-cst stores by default (#87376)" (PR #90267)

2024-07-03 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay approved this pull request.


https://github.com/llvm/llvm-project/pull/90267
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] 68b8f5f - Revert "[MLIR][Vector] Generalize DropUnitDimFromElementwiseOps to non leadin…"

2024-07-03 Thread via llvm-branch-commits

Author: Han-Chung Wang
Date: 2024-07-03T16:02:17-07:00
New Revision: 68b8f5f684395f5057731f1dc67d27493d7660fa

URL: 
https://github.com/llvm/llvm-project/commit/68b8f5f684395f5057731f1dc67d27493d7660fa
DIFF: 
https://github.com/llvm/llvm-project/commit/68b8f5f684395f5057731f1dc67d27493d7660fa.diff

LOG: Revert "[MLIR][Vector] Generalize DropUnitDimFromElementwiseOps to non 
leadin…"

This reverts commit 2c06fb899966b49ff0fe4adf55fceb7d1941fbca.

Added: 


Modified: 
mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
mlir/test/Dialect/Vector/vector-transfer-flatten.mlir

Removed: 




diff  --git a/mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp 
b/mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
index c7d3022eff4d3..da5954b70a2ec 100644
--- a/mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
@@ -1622,27 +1622,7 @@ struct ChainedReduction final : 
OpRewritePattern {
   }
 };
 
-// Scalable unit dimensions are not supported. Folding such dimensions would
-// require "shifting" the scalable flag onto some other fixed-width dim (e.g.
-// vector<[1]x4xf32> -> vector<[4]xf32>). This could be implemented in the
-// future.
-static VectorType dropNonScalableUnitDimFromType(VectorType inVecTy) {
-  auto inVecShape = inVecTy.getShape();
-  SmallVector newShape;
-  SmallVector newScalableDims;
-  for (auto [dim, isScalable] :
-   llvm::zip_equal(inVecShape, inVecTy.getScalableDims())) {
-if (dim == 1 && !isScalable)
-  continue;
-
-newShape.push_back(dim);
-newScalableDims.push_back(isScalable);
-  }
-
-  return VectorType::get(newShape, inVecTy.getElementType(), newScalableDims);
-}
-
-/// For vectors with at least an unit dim, replaces:
+/// For vectors with either leading or trailing unit dim, replaces:
 ///   elementwise(a, b)
 /// with:
 ///   sc_a = shape_cast(a)
@@ -1654,16 +1634,20 @@ static VectorType 
dropNonScalableUnitDimFromType(VectorType inVecTy) {
 /// required to be rank > 1.
 ///
 /// Ex:
+/// ```
 ///  %mul = arith.mulf %B_row, %A_row : vector<1x[4]xf32>
 ///  %cast = vector.shape_cast %mul : vector<1x[4]xf32> to vector<[4]xf32>
+/// ```
 ///
 /// gets converted to:
 ///
+/// ```
 ///  %B_row_sc = vector.shape_cast %B_row : vector<1x[4]xf32> to 
vector<[4]xf32>
 ///  %A_row_sc = vector.shape_cast %A_row : vector<1x[4]xf32> to 
vector<[4]xf32>
 ///  %mul = arith.mulf %B_row_sc, %A_row_sc : vector<[4]xf32>
 ///  %cast_new = vector.shape_cast %mul : vector<[4]xf32> to vector<1x[4]xf32>
 ///  %cast = vector.shape_cast %cast_new : vector<1x[4]xf32> to vector<[4]xf32>
+/// ```
 ///
 /// Patterns for folding shape_casts should instantly eliminate `%cast_new` and
 /// `%cast`.
@@ -1683,29 +1667,42 @@ struct DropUnitDimFromElementwiseOps final
 // guaranteed to have identical shapes (with some exceptions such as
 // `arith.select`) and it suffices to only check one of them.
 auto sourceVectorType = dyn_cast(op->getOperand(0).getType());
-if (!sourceVectorType || sourceVectorType.getRank() < 2)
+if (!sourceVectorType)
+  return failure();
+if (sourceVectorType.getRank() < 2)
+  return failure();
+
+bool hasTrailingDimUnitFixed =
+((sourceVectorType.getShape().back() == 1) &&
+ (!sourceVectorType.getScalableDims().back()));
+bool hasLeadingDimUnitFixed =
+((sourceVectorType.getShape().front() == 1) &&
+ (!sourceVectorType.getScalableDims().front()));
+if (!hasLeadingDimUnitFixed && !hasTrailingDimUnitFixed)
   return failure();
 
+// Drop leading/trailing unit dim by applying vector.shape_cast to all
+// operands
+int64_t dim = hasLeadingDimUnitFixed ? 0 : sourceVectorType.getRank() - 1;
+
 SmallVector newOperands;
 auto loc = op->getLoc();
 for (auto operand : op->getOperands()) {
   auto opVectorType = cast(operand.getType());
-  auto newVType = dropNonScalableUnitDimFromType(opVectorType);
-  if (newVType == opVectorType)
-return rewriter.notifyMatchFailure(op, "No unit dimension to remove.");
-
+  VectorType newVType = VectorType::Builder(opVectorType).dropDim(dim);
   auto opSC = rewriter.create(loc, newVType, operand);
   newOperands.push_back(opSC);
 }
 
 VectorType newResultVectorType =
-dropNonScalableUnitDimFromType(resultVectorType);
-// Create an updated elementwise Op without unit dim.
+VectorType::Builder(resultVectorType).dropDim(dim);
+// Create an updated elementwise Op without leading/trailing unit dim
 Operation *elementwiseOp =
 rewriter.create(loc, op->getName().getIdentifier(), newOperands,
 newResultVectorType, op->getAttrs());
 
-// Restore the unit dim by applying vector.shape_cast to the result.
+// Restore the leading/trailing unit dim by applying vector.shape_cast
+// 

[llvm-branch-commits] [llvm] AMDGPU: Add subtarget feature for memory atomic fadd f64 (PR #96444)

2024-07-03 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/96444

>From 308e31175185edc0d1aba78653b137c6a6f53a0e Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Sun, 23 Jun 2024 17:07:53 +0200
Subject: [PATCH] AMDGPU: Add subtarget feature for memory atomic fadd f64

---
 llvm/lib/Target/AMDGPU/AMDGPU.td   | 21 ++---
 llvm/lib/Target/AMDGPU/BUFInstructions.td  | 10 ++
 llvm/lib/Target/AMDGPU/FLATInstructions.td |  6 +++---
 llvm/lib/Target/AMDGPU/GCNSubtarget.h  | 10 +++---
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp  |  2 +-
 5 files changed, 31 insertions(+), 18 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index bea233bfb27bd..94e8e77b3c052 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -788,6 +788,13 @@ def FeatureFlatAtomicFaddF32Inst
   "Has flat_atomic_add_f32 instruction"
 >;
 
+def FeatureFlatBufferGlobalAtomicFaddF64Inst
+  : SubtargetFeature<"flat-buffer-global-fadd-f64-inst",
+  "HasFlatBufferGlobalAtomicFaddF64Inst",
+  "true",
+  "Has flat, buffer, and global instructions for f64 atomic fadd"
+>;
+
 def FeatureMemoryAtomicFAddF32DenormalSupport
   : SubtargetFeature<"memory-atomic-fadd-f32-denormal-support",
   "HasMemoryAtomicFaddF32DenormalSupport",
@@ -1390,7 +1397,8 @@ def FeatureISAVersion9_0_A : FeatureSet<
  FeatureBackOffBarrier,
  FeatureKernargPreload,
  FeatureAtomicFMinFMaxF64GlobalInsts,
- FeatureAtomicFMinFMaxF64FlatInsts
+ FeatureAtomicFMinFMaxF64FlatInsts,
+ FeatureFlatBufferGlobalAtomicFaddF64Inst
  ])>;
 
 def FeatureISAVersion9_0_C : FeatureSet<
@@ -1435,7 +1443,8 @@ def FeatureISAVersion9_4_Common : FeatureSet<
FeatureAtomicFMinFMaxF64GlobalInsts,
FeatureAtomicFMinFMaxF64FlatInsts,
FeatureAgentScopeFineGrainedRemoteMemoryAtomics,
-   FeatureMemoryAtomicFAddF32DenormalSupport
+   FeatureMemoryAtomicFAddF32DenormalSupport,
+   FeatureFlatBufferGlobalAtomicFaddF64Inst
]>;
 
 def FeatureISAVersion9_4_0 : FeatureSet<
@@ -1932,11 +1941,9 @@ def isGFX12Plus :
 def HasFlatAddressSpace : Predicate<"Subtarget->hasFlatAddressSpace()">,
   AssemblerPredicate<(all_of FeatureFlatAddressSpace)>;
 
-
-def HasBufferFlatGlobalAtomicsF64 : // FIXME: Rename to show it's only for fadd
-  Predicate<"Subtarget->hasBufferFlatGlobalAtomicsF64()">,
-  // FIXME: This is too coarse, and working around using pseudo's predicates 
on real instruction.
-  AssemblerPredicate<(any_of FeatureGFX90AInsts, FeatureGFX10Insts, 
FeatureSouthernIslands, FeatureSeaIslands)>;
+def HasFlatBufferGlobalAtomicFaddF64Inst :
+  Predicate<"Subtarget->hasFlatBufferGlobalAtomicFaddF64Inst()">,
+  AssemblerPredicate<(any_of FeatureFlatBufferGlobalAtomicFaddF64Inst)>;
 
 def HasAtomicFMinFMaxF32GlobalInsts :
   Predicate<"Subtarget->hasAtomicFMinFMaxF32GlobalInsts()">,
diff --git a/llvm/lib/Target/AMDGPU/BUFInstructions.td 
b/llvm/lib/Target/AMDGPU/BUFInstructions.td
index 3b8d94b744000..a904c8483dbf5 100644
--- a/llvm/lib/Target/AMDGPU/BUFInstructions.td
+++ b/llvm/lib/Target/AMDGPU/BUFInstructions.td
@@ -1312,14 +1312,16 @@ let SubtargetPredicate = isGFX90APlus in {
   }
 } // End SubtargetPredicate = isGFX90APlus
 
-let SubtargetPredicate = HasBufferFlatGlobalAtomicsF64 in {
+let SubtargetPredicate = HasFlatBufferGlobalAtomicFaddF64Inst in {
   defm BUFFER_ATOMIC_ADD_F64 : MUBUF_Pseudo_Atomics<"buffer_atomic_add_f64", 
VReg_64, f64>;
+} // End SubtargetPredicate = HasFlatBufferGlobalAtomicFaddF64Inst
 
+let SubtargetPredicate = HasAtomicFMinFMaxF64GlobalInsts in {
   // Note the names can be buffer_atomic_fmin_x2/buffer_atomic_fmax_x2
   // depending on some subtargets.
   defm BUFFER_ATOMIC_MIN_F64 : MUBUF_Pseudo_Atomics<"buffer_atomic_min_f64", 
VReg_64, f64>;
   defm BUFFER_ATOMIC_MAX_F64 : MUBUF_Pseudo_Atomics<"buffer_atomic_max_f64", 
VReg_64, f64>;
-} // End SubtargetPredicate = HasBufferFlatGlobalAtomicsF64
+}
 
 def BUFFER_INV : MUBUF_Invalidate<"buffer_inv"> {
   let SubtargetPredicate = isGFX940Plus;
@@ -1836,9 +1838,9 @@ let SubtargetPredicate = 
HasAtomicBufferGlobalPkAddF16Insts in {
   defm : SIBufferAtomicPat<"SIbuffer_atomic_fadd", v2f16, 
"BUFFER_ATOMIC_PK_ADD_F16", ["ret"]>;
 } // End SubtargetPredicate = HasAtomicBufferGlobalPkAddF16Insts
 
-let SubtargetPredicate = HasBufferFlatGlobalAtomicsF64 in {
+let SubtargetPredicate = HasFlatBufferGlobalAtomicFaddF64Inst in {
   defm : SIBufferAtomicPat<"SIbuffer_atomic_fadd", f64, 
"BUFFER_ATOMIC_ADD_F64">;
-} // End SubtargetPredicate = HasBufferFlatGlobalAtomicsF64
+} // End SubtargetPredicate = HasFlatBufferGlobalAtomicFaddF64Inst
 
 let SubtargetPredicate = HasAtomicFMinFMaxF64GlobalInsts in {
   defm : SIBufferAtomicPat<"SIbuffer_atomic_fmin", f64, 
"BUFFER_ATOMIC_MIN_F64">;
diff --git a/llvm/lib/Target/AMDGPU/FLATInstructions.td 
b/llvm/lib/Target/AMDGPU/FLATInstructions.td
index 4bf8f20269a15..16dc019ede810 100644
--- 

[llvm-branch-commits] [llvm] AMDGPU: Add subtarget feature for global atomic fadd denormal support (PR #96443)

2024-07-03 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/96443

>From 637bb436aa8472c2380364e573219c2a7524fdb1 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Sun, 23 Jun 2024 16:44:08 +0200
Subject: [PATCH 1/3] AMDGPU: Add subtarget feature for global atomic fadd
 denormal support

Not sure what the behavior for gfx90a is. The SPG says it always flushes.
The instruction documentation says it does not.
---
 llvm/lib/Target/AMDGPU/AMDGPU.td  | 14 --
 llvm/lib/Target/AMDGPU/GCNSubtarget.h |  7 +++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 3f35db8883716..51c077598df74 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -788,6 +788,13 @@ def FeatureFlatAtomicFaddF32Inst
   "Has flat_atomic_add_f32 instruction"
 >;
 
+def FeatureMemoryAtomicFaddF32DenormalSupport
+  : SubtargetFeature<"memory-atomic-fadd-f32-denormal-support",
+  "HasAtomicMemoryAtomicFaddF32DenormalSupport",
+  "true",
+  "global/flat/buffer atomic fadd for float supports denormal handling"
+>;
+
 def FeatureAgentScopeFineGrainedRemoteMemoryAtomics
   : SubtargetFeature<"agent-scope-fine-grained-remote-memory-atomics",
   "HasAgentScopeFineGrainedRemoteMemoryAtomics",
@@ -1427,7 +1434,8 @@ def FeatureISAVersion9_4_Common : FeatureSet<
FeatureKernargPreload,
FeatureAtomicFMinFMaxF64GlobalInsts,
FeatureAtomicFMinFMaxF64FlatInsts,
-   FeatureAgentScopeFineGrainedRemoteMemoryAtomics
+   FeatureAgentScopeFineGrainedRemoteMemoryAtomics,
+   FeatureMemoryAtomicFaddF32DenormalSupport
]>;
 
 def FeatureISAVersion9_4_0 : FeatureSet<
@@ -1631,7 +1639,9 @@ def FeatureISAVersion12 : FeatureSet<
FeatureScalarDwordx3Loads,
FeatureDPPSrc1SGPR,
FeatureMaxHardClauseLength32,
-   Feature1_5xVGPRs]>;
+   Feature1_5xVGPRs,
+   FeatureMemoryAtomicFaddF32DenormalSupport]>;
+   ]>;
 
 def FeatureISAVersion12_Generic: FeatureSet<
   !listconcat(FeatureISAVersion12.Features,
diff --git a/llvm/lib/Target/AMDGPU/GCNSubtarget.h 
b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
index 9e2a316a9ed28..db0b2b67a0388 100644
--- a/llvm/lib/Target/AMDGPU/GCNSubtarget.h
+++ b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
@@ -167,6 +167,7 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
   bool HasAtomicFlatPkAdd16Insts = false;
   bool HasAtomicFaddRtnInsts = false;
   bool HasAtomicFaddNoRtnInsts = false;
+  bool HasAtomicMemoryAtomicFaddF32DenormalSupport = false;
   bool HasAtomicBufferGlobalPkAddF16NoRtnInsts = false;
   bool HasAtomicBufferGlobalPkAddF16Insts = false;
   bool HasAtomicCSubNoRtnInsts = false;
@@ -872,6 +873,12 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
 
   bool hasFlatAtomicFaddF32Inst() const { return HasFlatAtomicFaddF32Inst; }
 
+  /// \return true if the target's flat, global, and buffer atomic fadd for
+  /// float supports denormal handling.
+  bool hasMemoryAtomicFaddF32DenormalSupport() const {
+return HasAtomicMemoryAtomicFaddF32DenormalSupport;
+  }
+
   /// \return true if atomic operations targeting fine-grained memory work
   /// correctly at device scope, in allocations in host or peer PCIe device
   /// memory.

>From d954785fffda502d8325cca1ffb6a0adc15dc54a Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Mon, 24 Jun 2024 12:10:37 +0200
Subject: [PATCH 2/3] Add to gfx11.

RDNA 3 manual says "Floating-point addition handles NAN/INF/denorm"
thought I'm not sure I trust it.
---
 llvm/lib/Target/AMDGPU/AMDGPU.td | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 51c077598df74..370992eb81ff3 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -1547,7 +1547,8 @@ def FeatureISAVersion11_Common : FeatureSet<
FeatureFlatAtomicFaddF32Inst,
FeatureImageInsts,
FeaturePackedTID,
-   FeatureVcmpxPermlaneHazard]>;
+   FeatureVcmpxPermlaneHazard,
+   FeatureMemoryAtomicFaddF32DenormalSupport]>;
 
 // There are few workarounds that need to be
 // added to all targets. This pessimizes codegen
@@ -1640,7 +1641,7 @@ def FeatureISAVersion12 : FeatureSet<
FeatureDPPSrc1SGPR,
FeatureMaxHardClauseLength32,
Feature1_5xVGPRs,
-   FeatureMemoryAtomicFaddF32DenormalSupport]>;
+   FeatureMemoryAtomicFaddF32DenormalSupport
]>;
 
 def FeatureISAVersion12_Generic: FeatureSet<

>From deebca23726296fff2892f9c780e3049db64749a Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Wed, 26 Jun 2024 11:30:51 +0200
Subject: [PATCH 3/3] Rename

---
 llvm/lib/Target/AMDGPU/AMDGPU.td  | 10 +-
 llvm/lib/Target/AMDGPU/GCNSubtarget.h |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 370992eb81ff3..bea233bfb27bd 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ 

[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Maksim Panchenko via llvm-branch-commits

maksfb wrote:

Could you please reword the summary and add an example where the new matching 
technique helps.

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Maksim Panchenko via llvm-branch-commits


@@ -456,6 +435,39 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
   ++MatchedWithLTOCommonName;
 }
   }
+  return MatchedWithLTOCommonName;
+}
+
+Error YAMLProfileReader::readProfile(BinaryContext ) {
+  if (opts::Verbosity >= 1) {
+outs() << "BOLT-INFO: YAML profile with hash: ";
+switch (YamlBP.Header.HashFunction) {
+case HashFunction::StdHash:
+  outs() << "std::hash\n";
+  break;
+case HashFunction::XXH3:
+  outs() << "xxh3\n";
+  break;
+}
+  }
+  YamlProfileToFunction.resize(YamlBP.Functions.size() + 1);
+
+  // Computes hash for binary functions.
+  if (opts::MatchProfileWithFunctionHash) {
+for (auto &[_, BF] : BC.getBinaryFunctions()) {
+  BF.computeHash(YamlBP.Header.IsDFSOrder, YamlBP.Header.HashFunction);
+}
+  } else if (!opts::IgnoreHash) {
+for (BinaryFunction *BF : ProfileBFs) {
+  if (!BF)
+continue;
+  BF->computeHash(YamlBP.Header.IsDFSOrder, YamlBP.Header.HashFunction);
+}
+  }
+
+  size_t MatchedWithExactName = matchWithExactName();
+  size_t MatchedWithHash = matchWithHash(BC);
+  size_t MatchedWithLTOCommonName = matchWithLTOCommonName();

maksfb wrote:

nit: make them `const`.

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [llvm-objcopy] Support CREL (PR #97521)

2024-07-03 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay updated 
https://github.com/llvm/llvm-project/pull/97521

>From 9bedda3fa950fbb418a53945f6e36da9a7582e3b Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Wed, 3 Jul 2024 11:45:26 -0700
Subject: [PATCH] fix header

Created using spr 1.3.5-bogner
---
 llvm/include/llvm/ADT/bit.h| 1 -
 llvm/include/llvm/MC/MCELFExtras.h | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/include/llvm/ADT/bit.h b/llvm/include/llvm/ADT/bit.h
index 1c8bd46648256..c42b5e686bdc9 100644
--- a/llvm/include/llvm/ADT/bit.h
+++ b/llvm/include/llvm/ADT/bit.h
@@ -14,7 +14,6 @@
 #ifndef LLVM_ADT_BIT_H
 #define LLVM_ADT_BIT_H
 
-#include "llvm/ADT/bit.h"
 #include "llvm/Support/Compiler.h"
 #include 
 #include 
diff --git a/llvm/include/llvm/MC/MCELFExtras.h 
b/llvm/include/llvm/MC/MCELFExtras.h
index 0f0c10edca2cf..498d477fbedc4 100644
--- a/llvm/include/llvm/MC/MCELFExtras.h
+++ b/llvm/include/llvm/MC/MCELFExtras.h
@@ -10,6 +10,7 @@
 #define LLVM_MC_MCELFEXTRAS_H
 
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/bit.h"
 #include "llvm/BinaryFormat/ELF.h"
 #include "llvm/Support/LEB128.h"
 #include "llvm/Support/raw_ostream.h"

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objcopy] Support CREL (PR #97521)

2024-07-03 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay updated 
https://github.com/llvm/llvm-project/pull/97521


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objcopy] Support CREL (PR #97521)

2024-07-03 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay updated 
https://github.com/llvm/llvm-project/pull/97521


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objcopy] Support CREL (PR #97521)

2024-07-03 Thread Fangrui Song via llvm-branch-commits

MaskRay wrote:

> Not that the patch is especially long/complicated, but could be split into 
> the refactor/move of the MC function, then the new usage, if you like (usual 
> reasons - smaller patches are easier to root cause, functionality can be 
> reverted without thrashing the refactored code (or refactored code can be 
> reverted if issues are found in that before the usage goes in), etc)

The body of `encodeCrel` is a simple move. Even with a signature change, the 
two parts (extract and adapt assembler + support llvm-objcopy) could still be 
considered separate.
However, some reviewers might prefer seeing both parts together for a better 
understanding of the extracted API.

Based on the comments from jh7370 and smithp35, the extraction seems reasonable.
**How about I landing the extraction part separately after receiving official 
feedback?
I will then rebase this llvm-objcopy patch.**

(I maintain patches in a stack and ensure the final one 
https://github.com/MaskRay/llvm-project/commits/demo-crel/ passes a local 
integration test.
There is some process inconvenience given that the llvm-objdump PR also 
modifies Object and has been approved yet...)


https://github.com/llvm/llvm-project/pull/97521
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung updated 
https://github.com/llvm/llvm-project/pull/95884

>From 34652b2eebc62218c50a23509ce99937385c30e6 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:42:00 -0700
Subject: [PATCH 1/8] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 73 --
 1 file changed, 56 insertions(+), 17 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 66cabc236f4b2..c9f6d88f0b13a 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -424,36 +424,75 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
 
   // Uses name similarity to match functions that were not matched by name.
   uint64_t MatchedWithDemangledName = 0;
-  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
-
-std::unordered_map NameToBinaryFunction;
-NameToBinaryFunction.reserve(BC.getBinaryFunctions().size());
 
-for (auto &[_, BF] : BC.getBinaryFunctions()) {
+  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
+auto DemangleName = [&](const char* String) {
   int Status = 0;
-  char *DemangledName = abi::__cxa_demangle(BF.getOneName().str().c_str(),
+  char *DemangledName = abi::__cxa_demangle(String,
 nullptr, nullptr, );
-  if (Status == 0)
-NameToBinaryFunction[std::string(DemangledName)] = 
+  return Status == 0 ? new std::string(DemangledName) : nullptr;
+};
+
+auto DeriveNameSpace = [&](std::string DemangledName) {
+  size_t LParen = std::string(DemangledName).find("(");
+  std::string FunctionName = std::string(DemangledName).substr(0, LParen);
+  size_t ScopeResolutionOperator = std::string(FunctionName).rfind("::");
+  return ScopeResolutionOperator == std::string::npos ? std::string("") : 
std::string(DemangledName).substr(0, ScopeResolutionOperator);
+};
+
+std::unordered_map> 
NamespaceToBFs;
+NamespaceToBFs.reserve(BC.getBinaryFunctions().size());
+
+for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+  std::string* DemangledName = 
DemangleName(BF->getOneName().str().c_str());
+  if (!DemangledName)
+continue;
+  std::string Namespace = DeriveNameSpace(*DemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
+NamespaceToBFs[Namespace] = {BF};
+  else
+It->second.push_back(BF);
 }
 
 for (auto YamlBF : YamlBP.Functions) {
   if (YamlBF.Used)
 continue;
-  int Status = 0;
-  char *DemangledName =
-  abi::__cxa_demangle(YamlBF.Name.c_str(), nullptr, nullptr, );
-  if (Status != 0)
+  std::string* YamlBFDemangledName = DemangleName(YamlBF.Name.c_str());
+  if (!YamlBFDemangledName)
 continue;
-  auto It = NameToBinaryFunction.find(DemangledName);
-  if (It == NameToBinaryFunction.end())
+  std::string Namespace = DeriveNameSpace(*YamlBFDemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
 continue;
-  BinaryFunction *BF = It->second;
-  matchProfileToFunction(YamlBF, *BF);
-  ++MatchedWithDemangledName;
+  std::vector BFs = It->second;
+
+  unsigned MinEditDistance = UINT_MAX;
+  BinaryFunction *ClosestNameBF = nullptr;
+
+  for (BinaryFunction *BF : BFs) {
+if (ProfiledFunctions.count(BF))
+  continue;
+std::string *BFDemangledName = 
DemangleName(BF->getOneName().str().c_str());
+if (!BFDemangledName)
+  continue;
+unsigned BFEditDistance = 
StringRef(*BFDemangledName).edit_distance(*YamlBFDemangledName);
+if (BFEditDistance < MinEditDistance) {
+  MinEditDistance = BFEditDistance;
+  ClosestNameBF = BF;
+}
+  }
+
+  if (ClosestNameBF &&
+MinEditDistance < opts::NameSimilarityFunctionMatchingThreshold) {
+matchProfileToFunction(YamlBF, *ClosestNameBF);
+++MatchedWithDemangledName;
+  }
 }
   }
 
+  outs() << MatchedWithDemangledName  << ": functions matched by name 
similarity\n";
+
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 if (!YamlBF.Used && opts::Verbosity >= 1)
   errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name

>From 2d23bbd6b9ce4f0786ae8ceb39b1b008b4ca9c4d Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:45:27 -0700
Subject: [PATCH 2/8] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index c9f6d88f0b13a..cf4a5393df8f4 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -491,8 +491,6 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
  

[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung updated 
https://github.com/llvm/llvm-project/pull/95884

>From 34652b2eebc62218c50a23509ce99937385c30e6 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:42:00 -0700
Subject: [PATCH 1/8] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 73 --
 1 file changed, 56 insertions(+), 17 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 66cabc236f4b2..c9f6d88f0b13a 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -424,36 +424,75 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
 
   // Uses name similarity to match functions that were not matched by name.
   uint64_t MatchedWithDemangledName = 0;
-  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
-
-std::unordered_map NameToBinaryFunction;
-NameToBinaryFunction.reserve(BC.getBinaryFunctions().size());
 
-for (auto &[_, BF] : BC.getBinaryFunctions()) {
+  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
+auto DemangleName = [&](const char* String) {
   int Status = 0;
-  char *DemangledName = abi::__cxa_demangle(BF.getOneName().str().c_str(),
+  char *DemangledName = abi::__cxa_demangle(String,
 nullptr, nullptr, );
-  if (Status == 0)
-NameToBinaryFunction[std::string(DemangledName)] = 
+  return Status == 0 ? new std::string(DemangledName) : nullptr;
+};
+
+auto DeriveNameSpace = [&](std::string DemangledName) {
+  size_t LParen = std::string(DemangledName).find("(");
+  std::string FunctionName = std::string(DemangledName).substr(0, LParen);
+  size_t ScopeResolutionOperator = std::string(FunctionName).rfind("::");
+  return ScopeResolutionOperator == std::string::npos ? std::string("") : 
std::string(DemangledName).substr(0, ScopeResolutionOperator);
+};
+
+std::unordered_map> 
NamespaceToBFs;
+NamespaceToBFs.reserve(BC.getBinaryFunctions().size());
+
+for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+  std::string* DemangledName = 
DemangleName(BF->getOneName().str().c_str());
+  if (!DemangledName)
+continue;
+  std::string Namespace = DeriveNameSpace(*DemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
+NamespaceToBFs[Namespace] = {BF};
+  else
+It->second.push_back(BF);
 }
 
 for (auto YamlBF : YamlBP.Functions) {
   if (YamlBF.Used)
 continue;
-  int Status = 0;
-  char *DemangledName =
-  abi::__cxa_demangle(YamlBF.Name.c_str(), nullptr, nullptr, );
-  if (Status != 0)
+  std::string* YamlBFDemangledName = DemangleName(YamlBF.Name.c_str());
+  if (!YamlBFDemangledName)
 continue;
-  auto It = NameToBinaryFunction.find(DemangledName);
-  if (It == NameToBinaryFunction.end())
+  std::string Namespace = DeriveNameSpace(*YamlBFDemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
 continue;
-  BinaryFunction *BF = It->second;
-  matchProfileToFunction(YamlBF, *BF);
-  ++MatchedWithDemangledName;
+  std::vector BFs = It->second;
+
+  unsigned MinEditDistance = UINT_MAX;
+  BinaryFunction *ClosestNameBF = nullptr;
+
+  for (BinaryFunction *BF : BFs) {
+if (ProfiledFunctions.count(BF))
+  continue;
+std::string *BFDemangledName = 
DemangleName(BF->getOneName().str().c_str());
+if (!BFDemangledName)
+  continue;
+unsigned BFEditDistance = 
StringRef(*BFDemangledName).edit_distance(*YamlBFDemangledName);
+if (BFEditDistance < MinEditDistance) {
+  MinEditDistance = BFEditDistance;
+  ClosestNameBF = BF;
+}
+  }
+
+  if (ClosestNameBF &&
+MinEditDistance < opts::NameSimilarityFunctionMatchingThreshold) {
+matchProfileToFunction(YamlBF, *ClosestNameBF);
+++MatchedWithDemangledName;
+  }
 }
   }
 
+  outs() << MatchedWithDemangledName  << ": functions matched by name 
similarity\n";
+
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 if (!YamlBF.Used && opts::Verbosity >= 1)
   errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name

>From 2d23bbd6b9ce4f0786ae8ceb39b1b008b4ca9c4d Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:45:27 -0700
Subject: [PATCH 2/8] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index c9f6d88f0b13a..cf4a5393df8f4 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -491,8 +491,6 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
  

[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung updated 
https://github.com/llvm/llvm-project/pull/96596

>From 05d59574d6260b98a469921eb2fccf5398bfafb6 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Mon, 24 Jun 2024 23:00:59 -0700
Subject: [PATCH 01/14] Added call to matchWithCallsAsAnchors

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index aafffac3d4b1c..1a0e5d239d252 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -479,6 +479,9 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
 if (!YamlBF.Used && BF && !ProfiledFunctions.count(BF))
   matchProfileToFunction(YamlBF, *BF);
 
+  uint64_t MatchedWithCallsAsAnchors = 0;
+  matchWithCallsAsAnchors(BC,  MatchedWithCallsAsAnchors);
+
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 if (!YamlBF.Used && opts::Verbosity >= 1)
   errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name

>From 77ef0008f4f5987719555e6cc3e32da812ae0f31 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Mon, 24 Jun 2024 23:11:43 -0700
Subject: [PATCH 02/14] Changed CallHashToBF representation

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 1a0e5d239d252..91b01a99c7485 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -29,6 +29,10 @@ static llvm::cl::opt
cl::desc("ignore hash while reading function profile"),
cl::Hidden, cl::cat(BoltOptCategory));
 
+llvm::cl::opt MatchWithCallsAsAnchors("match-with-calls-as-anchors",
+  cl::desc("Matches with calls as anchors"),
+  cl::Hidden, cl::cat(BoltOptCategory));
+
 llvm::cl::opt ProfileUseDFS("profile-use-dfs",
   cl::desc("use DFS order for YAML profile"),
   cl::Hidden, cl::cat(BoltOptCategory));
@@ -353,7 +357,7 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
 llvm_unreachable("Unhandled HashFunction");
   };
 
-  std::unordered_map CallHashToBF;
+  std::unordered_map CallHashToBF;
 
   for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
 if (ProfiledFunctions.count(BF))
@@ -375,12 +379,12 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
   for (const std::string  : FunctionNames)
 HashString.append(FunctionName);
 }
-CallHashToBF.emplace(ComputeCallHash(HashString), BF);
+CallHashToBF[ComputeCallHash(HashString)] = BF;
   }
 
   std::unordered_map ProfiledFunctionIdToName;
 
-  for (const yaml::bolt::BinaryFunctionProfile YamlBF : YamlBP.Functions)
+  for (const yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 ProfiledFunctionIdToName[YamlBF.Id] = YamlBF.Name;
 
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions) {
@@ -401,7 +405,7 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
 auto It = CallHashToBF.find(Hash);
 if (It == CallHashToBF.end())
   continue;
-matchProfileToFunction(YamlBF, It->second);
+matchProfileToFunction(YamlBF, *It->second);
 ++MatchedWithCallsAsAnchors;
   }
 }
@@ -480,7 +484,8 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
   matchProfileToFunction(YamlBF, *BF);
 
   uint64_t MatchedWithCallsAsAnchors = 0;
-  matchWithCallsAsAnchors(BC,  MatchedWithCallsAsAnchors);
+  if (opts::MatchWithCallsAsAnchors)
+matchWithCallsAsAnchors(BC,  MatchedWithCallsAsAnchors);
 
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 if (!YamlBF.Used && opts::Verbosity >= 1)

>From ea7cb68ab9e8e158412c2e752986968968a60d93 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Tue, 25 Jun 2024 09:28:39 -0700
Subject: [PATCH 03/14] Changed BF called FunctionNames to multiset

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 91b01a99c7485..3b3d73f7af023 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -365,7 +365,7 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
 
 std::string HashString;
 for (const auto  : BF->blocks()) {
-  std::set FunctionNames;
+  std::multiset FunctionNames;
   for (const MCInst  : BB) {
 // Skip non-call instructions.
 if (!BC.MIB->isCall(Instr))
@@ -397,9 +397,8 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
 std::string  = ProfiledFunctionIdToName[CallSite.DestId];
 FunctionNames.insert(FunctionName);
   }
-  for (const std::string  : FunctionNames) {
+  for 

[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -220,17 +245,27 @@ class StaleMatcher {
 return BestBlock;
   }
 
-  /// Returns true if the two basic blocks (in the binary and in the profile)
-  /// corresponding to the given hashes are matched to each other with a high
-  /// confidence.
-  static bool isHighConfidenceMatch(BlendedBlockHash Hash1,
-BlendedBlockHash Hash2) {
-return Hash1.InstrHash == Hash2.InstrHash;
+  // Uses CallHash to find the most similar block for a given hash.
+  const FlowBlock *matchWithCalls(BlendedBlockHash ,

aaupov wrote:

ditto

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -193,18 +193,43 @@ class StaleMatcher {
 public:
   /// Initialize stale matcher.
   void init(const std::vector ,
-const std::vector ) {
+const std::vector ,
+const std::vector ) {
 assert(Blocks.size() == Hashes.size() &&
+   Hashes.size() == CallHashes.size() &&
"incorrect matcher initialization");
 for (size_t I = 0; I < Blocks.size(); I++) {
   FlowBlock *Block = Blocks[I];
   uint16_t OpHash = Hashes[I].OpcodeHash;
   OpHashToBlocks[OpHash].push_back(std::make_pair(Hashes[I], Block));
+  if (CallHashes[I])
+CallHashToBlocks[CallHashes[I]].push_back(
+std::make_pair(Hashes[I], Block));
 }
   }
 
   /// Find the most similar block for a given hash.
-  const FlowBlock *matchBlock(BlendedBlockHash BlendedHash) const {
+  const FlowBlock *matchBlock(BlendedBlockHash ,
+  uint64_t ) const {
+const FlowBlock *BestBlock = matchWithOpcodes(BlendedHash);
+return BestBlock ? BestBlock : matchWithCalls(BlendedHash, CallHash);
+  }
+
+  /// Returns true if the two basic blocks (in the binary and in the profile)
+  /// corresponding to the given hashes are matched to each other with a high
+  /// confidence.
+  static bool isHighConfidenceMatch(BlendedBlockHash Hash1,
+BlendedBlockHash Hash2) {
+return Hash1.InstrHash == Hash2.InstrHash;
+  }
+
+private:
+  using HashBlockPairType = std::pair;
+  std::unordered_map> OpHashToBlocks;
+  std::unordered_map> 
CallHashToBlocks;
+
+  // Uses OpcodeHash to find the most similar block for a given hash.
+  const FlowBlock *matchWithOpcodes(BlendedBlockHash ) const {

aaupov wrote:

ditto

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -193,18 +193,43 @@ class StaleMatcher {
 public:
   /// Initialize stale matcher.
   void init(const std::vector ,
-const std::vector ) {
+const std::vector ,
+const std::vector ) {
 assert(Blocks.size() == Hashes.size() &&
+   Hashes.size() == CallHashes.size() &&
"incorrect matcher initialization");
 for (size_t I = 0; I < Blocks.size(); I++) {
   FlowBlock *Block = Blocks[I];
   uint16_t OpHash = Hashes[I].OpcodeHash;
   OpHashToBlocks[OpHash].push_back(std::make_pair(Hashes[I], Block));
+  if (CallHashes[I])
+CallHashToBlocks[CallHashes[I]].push_back(
+std::make_pair(Hashes[I], Block));
 }
   }
 
   /// Find the most similar block for a given hash.
-  const FlowBlock *matchBlock(BlendedBlockHash BlendedHash) const {
+  const FlowBlock *matchBlock(BlendedBlockHash ,
+  uint64_t ) const {

aaupov wrote:

```suggestion
  const FlowBlock *matchBlock(BlendedBlockHash BlendedHash,
  uint64_t CallHash) const {
```
BlendedBlockHash is aliased to uint64_t, and integral types should be passed by 
value.

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -412,33 +447,62 @@ createFlowFunction(const 
BinaryFunction::BasicBlockOrderType ) {
 /// of the basic blocks in the binary, the count is "matched" to the block.
 /// Similarly, if both the source and the target of a count in the profile are
 /// matched to a jump in the binary, the count is recorded in CFG.
-size_t matchWeightsByHashes(
-BinaryContext , const BinaryFunction::BasicBlockOrderType ,
-const yaml::bolt::BinaryFunctionProfile , FlowFunction ) {
+size_t
+matchWeightsByHashes(BinaryContext ,
+ const DenseMap ,
+ const BinaryFunction::BasicBlockOrderType ,
+ const yaml::bolt::BinaryFunctionProfile ,
+ FlowFunction , HashFunction HashFunction) {

aaupov wrote:

```suggestion
size_t
matchWeightsByHashes(BinaryContext ,
 const BinaryFunction::BasicBlockOrderType ,
 const yaml::bolt::BinaryFunctionProfile ,
 FlowFunction , HashFunction HashFunction,
 const DenseMap ) 
{
```
It's customary to add new parameters to the end

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov commented:

Sorry, couple of final comments

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objcopy] Support CREL (PR #97521)

2024-07-03 Thread Fangrui Song via llvm-branch-commits

MaskRay wrote:

> [jh7370](https://github.com/jh7370) I've skimmed briefly and the changes look 
> reasonable - will look more in depth on a separate occasion when I have more 
> time.

Thanks!

> Not for this PR, but I wonder if there would be some benefit in a 
> `--decode-crel` and/or `--encode-crel` option that would convert an object 
> file to/from using CREL. I feel like this might be useful for 
> experimentation, or for handling the case where an object was generated with 
> CREL but needs to be usable by an older tool that doesn't understand CREL. 
> Equally, it could be useful for retroactively encoding CREL when the feature 
> wasn't used during original creation of the object. Thoughts?

Agreed that the `CREL => RELA` conversion will be useful to make CREL better 
interchange format - allow old linkers to build new relocatable files and allow 
other tools for analysis tasks.
That is probably a long-term goal.

In the short-term I aim for providing a complete toolchain 
(assembler,linker,objcopy/strip,objdump) for the most important use case 
(compile + assemble + (strip)? + link).

> [smithp35](https://github.com/smithp35) Only a couple of small comments from 
> me. I'll be out of the office till Monday next week, I'm fine for others to 
> progress this wihout me.

Thanks! Take your time.


https://github.com/llvm/llvm-project/pull/97521
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -456,6 +435,39 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
   ++MatchedWithLTOCommonName;
 }
   }
+  return MatchedWithLTOCommonName;
+}
+
+Error YAMLProfileReader::readProfile(BinaryContext ) {
+  if (opts::Verbosity >= 1) {
+outs() << "BOLT-INFO: YAML profile with hash: ";
+switch (YamlBP.Header.HashFunction) {
+case HashFunction::StdHash:
+  outs() << "std::hash\n";

aaupov wrote:

@ayermolo, we didn't switch Profile component to BC logger class. That would be 
a separate effort.

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objcopy] Support CREL (PR #97521)

2024-07-03 Thread Fangrui Song via llvm-branch-commits


@@ -1861,7 +1886,15 @@ template  Error 
ELFBuilder::readSections(bool EnsureSymtab) {
 
   const typename ELFFile::Elf_Shdr *Shdr =
   Sections->begin() + RelSec->Index;
-  if (RelSec->Type == SHT_REL) {
+  if (RelSec->Type == SHT_CREL) {
+auto Rels = ElfFile.crels(*Shdr);

MaskRay wrote:

Agreed. Renamed to `RelsOrRelas`

https://github.com/llvm/llvm-project/pull/97521
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objcopy] Support CREL (PR #97521)

2024-07-03 Thread Fangrui Song via llvm-branch-commits


@@ -0,0 +1,60 @@
+//===- MCELFExtras.h - Extra functions for ELF --*- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_MC_MCELFEXTRAS_H
+#define LLVM_MC_MCELFEXTRAS_H
+
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/BinaryFormat/ELF.h"
+#include "llvm/Support/LEB128.h"
+#include "llvm/Support/raw_ostream.h"
+
+#include 
+#include 
+
+namespace llvm::ELF {

MaskRay wrote:

Thanks for the comment suggestion. Added.

https://github.com/llvm/llvm-project/pull/97521
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objdump] -r: support CREL (PR #97382)

2024-07-03 Thread Fangrui Song via llvm-branch-commits


@@ -207,6 +209,43 @@ bool isSectionInSegment(const typename ELFT::Phdr ,
  checkSectionVMA(Phdr, Sec);
 }
 
+template 
+Error decodeCrel(ArrayRef Content,
+ function_ref HdrHandler,

MaskRay wrote:

thx for the suggestion. adopted

https://github.com/llvm/llvm-project/pull/97382
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objdump] -r: support CREL (PR #97382)

2024-07-03 Thread Fangrui Song via llvm-branch-commits


@@ -207,6 +209,43 @@ bool isSectionInSegment(const typename ELFT::Phdr ,
  checkSectionVMA(Phdr, Sec);
 }
 
+template 

MaskRay wrote:

thx for the suggestion. adopted

https://github.com/llvm/llvm-project/pull/97382
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objdump] -r: support CREL (PR #97382)

2024-07-03 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay updated 
https://github.com/llvm/llvm-project/pull/97382


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objdump] -r: support CREL (PR #97382)

2024-07-03 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay updated 
https://github.com/llvm/llvm-project/pull/97382


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung updated 
https://github.com/llvm/llvm-project/pull/96596

>From 05d59574d6260b98a469921eb2fccf5398bfafb6 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Mon, 24 Jun 2024 23:00:59 -0700
Subject: [PATCH 01/13] Added call to matchWithCallsAsAnchors

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index aafffac3d4b1c..1a0e5d239d252 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -479,6 +479,9 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
 if (!YamlBF.Used && BF && !ProfiledFunctions.count(BF))
   matchProfileToFunction(YamlBF, *BF);
 
+  uint64_t MatchedWithCallsAsAnchors = 0;
+  matchWithCallsAsAnchors(BC,  MatchedWithCallsAsAnchors);
+
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 if (!YamlBF.Used && opts::Verbosity >= 1)
   errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name

>From 77ef0008f4f5987719555e6cc3e32da812ae0f31 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Mon, 24 Jun 2024 23:11:43 -0700
Subject: [PATCH 02/13] Changed CallHashToBF representation

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 1a0e5d239d252..91b01a99c7485 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -29,6 +29,10 @@ static llvm::cl::opt
cl::desc("ignore hash while reading function profile"),
cl::Hidden, cl::cat(BoltOptCategory));
 
+llvm::cl::opt MatchWithCallsAsAnchors("match-with-calls-as-anchors",
+  cl::desc("Matches with calls as anchors"),
+  cl::Hidden, cl::cat(BoltOptCategory));
+
 llvm::cl::opt ProfileUseDFS("profile-use-dfs",
   cl::desc("use DFS order for YAML profile"),
   cl::Hidden, cl::cat(BoltOptCategory));
@@ -353,7 +357,7 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
 llvm_unreachable("Unhandled HashFunction");
   };
 
-  std::unordered_map CallHashToBF;
+  std::unordered_map CallHashToBF;
 
   for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
 if (ProfiledFunctions.count(BF))
@@ -375,12 +379,12 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
   for (const std::string  : FunctionNames)
 HashString.append(FunctionName);
 }
-CallHashToBF.emplace(ComputeCallHash(HashString), BF);
+CallHashToBF[ComputeCallHash(HashString)] = BF;
   }
 
   std::unordered_map ProfiledFunctionIdToName;
 
-  for (const yaml::bolt::BinaryFunctionProfile YamlBF : YamlBP.Functions)
+  for (const yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 ProfiledFunctionIdToName[YamlBF.Id] = YamlBF.Name;
 
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions) {
@@ -401,7 +405,7 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
 auto It = CallHashToBF.find(Hash);
 if (It == CallHashToBF.end())
   continue;
-matchProfileToFunction(YamlBF, It->second);
+matchProfileToFunction(YamlBF, *It->second);
 ++MatchedWithCallsAsAnchors;
   }
 }
@@ -480,7 +484,8 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
   matchProfileToFunction(YamlBF, *BF);
 
   uint64_t MatchedWithCallsAsAnchors = 0;
-  matchWithCallsAsAnchors(BC,  MatchedWithCallsAsAnchors);
+  if (opts::MatchWithCallsAsAnchors)
+matchWithCallsAsAnchors(BC,  MatchedWithCallsAsAnchors);
 
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 if (!YamlBF.Used && opts::Verbosity >= 1)

>From ea7cb68ab9e8e158412c2e752986968968a60d93 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Tue, 25 Jun 2024 09:28:39 -0700
Subject: [PATCH 03/13] Changed BF called FunctionNames to multiset

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 91b01a99c7485..3b3d73f7af023 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -365,7 +365,7 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
 
 std::string HashString;
 for (const auto  : BF->blocks()) {
-  std::set FunctionNames;
+  std::multiset FunctionNames;
   for (const MCInst  : BB) {
 // Skip non-call instructions.
 if (!BC.MIB->isCall(Instr))
@@ -397,9 +397,8 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
 std::string  = ProfiledFunctionIdToName[CallSite.DestId];
 FunctionNames.insert(FunctionName);
   }
-  for (const std::string  : FunctionNames) {
+  for 

[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung updated 
https://github.com/llvm/llvm-project/pull/96596

>From 05d59574d6260b98a469921eb2fccf5398bfafb6 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Mon, 24 Jun 2024 23:00:59 -0700
Subject: [PATCH 01/13] Added call to matchWithCallsAsAnchors

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index aafffac3d4b1c..1a0e5d239d252 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -479,6 +479,9 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
 if (!YamlBF.Used && BF && !ProfiledFunctions.count(BF))
   matchProfileToFunction(YamlBF, *BF);
 
+  uint64_t MatchedWithCallsAsAnchors = 0;
+  matchWithCallsAsAnchors(BC,  MatchedWithCallsAsAnchors);
+
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 if (!YamlBF.Used && opts::Verbosity >= 1)
   errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name

>From 77ef0008f4f5987719555e6cc3e32da812ae0f31 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Mon, 24 Jun 2024 23:11:43 -0700
Subject: [PATCH 02/13] Changed CallHashToBF representation

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 1a0e5d239d252..91b01a99c7485 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -29,6 +29,10 @@ static llvm::cl::opt
cl::desc("ignore hash while reading function profile"),
cl::Hidden, cl::cat(BoltOptCategory));
 
+llvm::cl::opt MatchWithCallsAsAnchors("match-with-calls-as-anchors",
+  cl::desc("Matches with calls as anchors"),
+  cl::Hidden, cl::cat(BoltOptCategory));
+
 llvm::cl::opt ProfileUseDFS("profile-use-dfs",
   cl::desc("use DFS order for YAML profile"),
   cl::Hidden, cl::cat(BoltOptCategory));
@@ -353,7 +357,7 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
 llvm_unreachable("Unhandled HashFunction");
   };
 
-  std::unordered_map CallHashToBF;
+  std::unordered_map CallHashToBF;
 
   for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
 if (ProfiledFunctions.count(BF))
@@ -375,12 +379,12 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
   for (const std::string  : FunctionNames)
 HashString.append(FunctionName);
 }
-CallHashToBF.emplace(ComputeCallHash(HashString), BF);
+CallHashToBF[ComputeCallHash(HashString)] = BF;
   }
 
   std::unordered_map ProfiledFunctionIdToName;
 
-  for (const yaml::bolt::BinaryFunctionProfile YamlBF : YamlBP.Functions)
+  for (const yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 ProfiledFunctionIdToName[YamlBF.Id] = YamlBF.Name;
 
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions) {
@@ -401,7 +405,7 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
 auto It = CallHashToBF.find(Hash);
 if (It == CallHashToBF.end())
   continue;
-matchProfileToFunction(YamlBF, It->second);
+matchProfileToFunction(YamlBF, *It->second);
 ++MatchedWithCallsAsAnchors;
   }
 }
@@ -480,7 +484,8 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
   matchProfileToFunction(YamlBF, *BF);
 
   uint64_t MatchedWithCallsAsAnchors = 0;
-  matchWithCallsAsAnchors(BC,  MatchedWithCallsAsAnchors);
+  if (opts::MatchWithCallsAsAnchors)
+matchWithCallsAsAnchors(BC,  MatchedWithCallsAsAnchors);
 
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 if (!YamlBF.Used && opts::Verbosity >= 1)

>From ea7cb68ab9e8e158412c2e752986968968a60d93 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Tue, 25 Jun 2024 09:28:39 -0700
Subject: [PATCH 03/13] Changed BF called FunctionNames to multiset

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 91b01a99c7485..3b3d73f7af023 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -365,7 +365,7 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
 
 std::string HashString;
 for (const auto  : BF->blocks()) {
-  std::set FunctionNames;
+  std::multiset FunctionNames;
   for (const MCInst  : BB) {
 // Skip non-call instructions.
 if (!BC.MIB->isCall(Instr))
@@ -397,9 +397,8 @@ void YAMLProfileReader::matchWithCallsAsAnchors(
 std::string  = ProfiledFunctionIdToName[CallSite.DestId];
 FunctionNames.insert(FunctionName);
   }
-  for (const std::string  : FunctionNames) {
+  for 

[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Shaw Young via llvm-branch-commits


@@ -456,6 +435,39 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
   ++MatchedWithLTOCommonName;
 }
   }
+  return MatchedWithLTOCommonName;
+}
+
+Error YAMLProfileReader::readProfile(BinaryContext ) {
+  if (opts::Verbosity >= 1) {
+outs() << "BOLT-INFO: YAML profile with hash: ";
+switch (YamlBP.Header.HashFunction) {
+case HashFunction::StdHash:
+  outs() << "std::hash\n";

shawbyoung wrote:

I'm erring on the side of making minimal code change - although it's showing up 
on gh as code added, I haven't touched the prologue of readProfile. If you see 
the large "deleted" section above (starting on the original line 353 of 
YAMLProfileReader.cpp) it's the exact same. So, I'd like to keep this PR just 
about refactoring function matching.

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung edited 
https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Shaw Young via llvm-branch-commits


@@ -456,6 +435,39 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
   ++MatchedWithLTOCommonName;
 }
   }
+  return MatchedWithLTOCommonName;
+}
+
+Error YAMLProfileReader::readProfile(BinaryContext ) {
+  if (opts::Verbosity >= 1) {
+outs() << "BOLT-INFO: YAML profile with hash: ";
+switch (YamlBP.Header.HashFunction) {
+case HashFunction::StdHash:
+  outs() << "std::hash\n";
+  break;
+case HashFunction::XXH3:
+  outs() << "xxh3\n";
+  break;
+}
+  }
+  YamlProfileToFunction.resize(YamlBP.Functions.size() + 1);
+
+  // Computes hash for binary functions.
+  if (opts::MatchProfileWithFunctionHash) {
+for (auto &[_, BF] : BC.getBinaryFunctions()) {
+  BF.computeHash(YamlBP.Header.IsDFSOrder, YamlBP.Header.HashFunction);
+}
+  } else if (!opts::IgnoreHash) {
+for (BinaryFunction *BF : ProfileBFs) {
+  if (!BF)
+continue;
+  BF->computeHash(YamlBP.Header.IsDFSOrder, YamlBP.Header.HashFunction);
+}
+  }
+
+  size_t MatchedWithExactName = matchWithExactName();

shawbyoung wrote:

In lines 481 - 487 
> if (opts::Verbosity >= 1) {
>outs() << "BOLT-INFO: matched " << MatchedWithExactName
>   << " functions with identical names\n";
> ...
Match counts are directed to outs()

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Alexander Yermolovich via llvm-branch-commits


@@ -456,6 +435,39 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
   ++MatchedWithLTOCommonName;
 }
   }
+  return MatchedWithLTOCommonName;
+}
+
+Error YAMLProfileReader::readProfile(BinaryContext ) {
+  if (opts::Verbosity >= 1) {
+outs() << "BOLT-INFO: YAML profile with hash: ";
+switch (YamlBP.Header.HashFunction) {
+case HashFunction::StdHash:
+  outs() << "std::hash\n";

ayermolo wrote:

BC.outs()

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Alexander Yermolovich via llvm-branch-commits


@@ -456,6 +435,39 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
   ++MatchedWithLTOCommonName;
 }
   }
+  return MatchedWithLTOCommonName;
+}
+
+Error YAMLProfileReader::readProfile(BinaryContext ) {
+  if (opts::Verbosity >= 1) {
+outs() << "BOLT-INFO: YAML profile with hash: ";
+switch (YamlBP.Header.HashFunction) {
+case HashFunction::StdHash:
+  outs() << "std::hash\n";
+  break;
+case HashFunction::XXH3:
+  outs() << "xxh3\n";
+  break;
+}
+  }
+  YamlProfileToFunction.resize(YamlBP.Functions.size() + 1);
+
+  // Computes hash for binary functions.
+  if (opts::MatchProfileWithFunctionHash) {
+for (auto &[_, BF] : BC.getBinaryFunctions()) {
+  BF.computeHash(YamlBP.Header.IsDFSOrder, YamlBP.Header.HashFunction);
+}
+  } else if (!opts::IgnoreHash) {
+for (BinaryFunction *BF : ProfileBFs) {
+  if (!BF)
+continue;
+  BF->computeHash(YamlBP.Header.IsDFSOrder, YamlBP.Header.HashFunction);
+}
+  }
+
+  size_t MatchedWithExactName = matchWithExactName();

ayermolo wrote:

This doesn't look like it's used anywhere?

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung updated 
https://github.com/llvm/llvm-project/pull/97502

>From c6212e4b26b0f0d8abde323fa5fc04ecc6dd34fd Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Wed, 3 Jul 2024 09:45:46 -0700
Subject: [PATCH 1/2] Changed profileMatches comment

Created using spr 1.3.4
---
 bolt/include/bolt/Profile/YAMLProfileReader.h | 2 +-
 bolt/lib/Profile/YAMLProfileReader.cpp| 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/bolt/include/bolt/Profile/YAMLProfileReader.h 
b/bolt/include/bolt/Profile/YAMLProfileReader.h
index a5bd3544bd999..627cebf5d9453 100644
--- a/bolt/include/bolt/Profile/YAMLProfileReader.h
+++ b/bolt/include/bolt/Profile/YAMLProfileReader.h
@@ -73,7 +73,7 @@ class YAMLProfileReader : public ProfileReaderBase {
   bool parseFunctionProfile(BinaryFunction ,
 const yaml::bolt::BinaryFunctionProfile );
 
-  /// Returns block cnt equality if IgnoreHash is true, otherwise, hash 
equality
+  /// Checks if a function profile matches a binary function.
   bool profileMatches(const yaml::bolt::BinaryFunctionProfile ,
   BinaryFunction );
 
diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index e8ce187367899..91628d950e9f4 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -333,6 +333,7 @@ Error YAMLProfileReader::preprocessProfile(BinaryContext 
) {
 
   return Error::success();
 }
+
 bool YAMLProfileReader::profileMatches(
 const yaml::bolt::BinaryFunctionProfile , BinaryFunction ) {
   if (opts::IgnoreHash)

>From 1f48f09228b54e410910c2186cf0c3a73400bfd3 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Wed, 3 Jul 2024 10:26:27 -0700
Subject: [PATCH 2/2] Changing profileMatches BF param to const

Created using spr 1.3.4
---
 bolt/include/bolt/Profile/YAMLProfileReader.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bolt/include/bolt/Profile/YAMLProfileReader.h 
b/bolt/include/bolt/Profile/YAMLProfileReader.h
index 627cebf5d9453..fe9f349de278d 100644
--- a/bolt/include/bolt/Profile/YAMLProfileReader.h
+++ b/bolt/include/bolt/Profile/YAMLProfileReader.h
@@ -75,7 +75,7 @@ class YAMLProfileReader : public ProfileReaderBase {
 
   /// Checks if a function profile matches a binary function.
   bool profileMatches(const yaml::bolt::BinaryFunctionProfile ,
-  BinaryFunction );
+  const BinaryFunction );
 
   /// Infer function profile from stale data (collected on older binaries).
   bool inferStaleProfile(BinaryFunction ,

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov approved this pull request.

LG % nit

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -334,6 +334,13 @@ Error YAMLProfileReader::preprocessProfile(BinaryContext 
) {
   return Error::success();
 }
 
+bool YAMLProfileReader::profileMatches(
+const yaml::bolt::BinaryFunctionProfile , BinaryFunction ) {

aaupov wrote:

```suggestion
const yaml::bolt::BinaryFunctionProfile , const BinaryFunction ) 
{
```

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] AMDGPU: Add subtarget feature for memory atomic fadd f64 (PR #96444)

2024-07-03 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/96444

>From 5945915a9a9f0caf3ed890ce450a25cff58ef608 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Sun, 23 Jun 2024 17:07:53 +0200
Subject: [PATCH] AMDGPU: Add subtarget feature for memory atomic fadd f64

---
 llvm/lib/Target/AMDGPU/AMDGPU.td   | 21 ++---
 llvm/lib/Target/AMDGPU/BUFInstructions.td  | 10 ++
 llvm/lib/Target/AMDGPU/FLATInstructions.td |  6 +++---
 llvm/lib/Target/AMDGPU/GCNSubtarget.h  | 10 +++---
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp  |  2 +-
 5 files changed, 31 insertions(+), 18 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index bea233bfb27bd..94e8e77b3c052 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -788,6 +788,13 @@ def FeatureFlatAtomicFaddF32Inst
   "Has flat_atomic_add_f32 instruction"
 >;
 
+def FeatureFlatBufferGlobalAtomicFaddF64Inst
+  : SubtargetFeature<"flat-buffer-global-fadd-f64-inst",
+  "HasFlatBufferGlobalAtomicFaddF64Inst",
+  "true",
+  "Has flat, buffer, and global instructions for f64 atomic fadd"
+>;
+
 def FeatureMemoryAtomicFAddF32DenormalSupport
   : SubtargetFeature<"memory-atomic-fadd-f32-denormal-support",
   "HasMemoryAtomicFaddF32DenormalSupport",
@@ -1390,7 +1397,8 @@ def FeatureISAVersion9_0_A : FeatureSet<
  FeatureBackOffBarrier,
  FeatureKernargPreload,
  FeatureAtomicFMinFMaxF64GlobalInsts,
- FeatureAtomicFMinFMaxF64FlatInsts
+ FeatureAtomicFMinFMaxF64FlatInsts,
+ FeatureFlatBufferGlobalAtomicFaddF64Inst
  ])>;
 
 def FeatureISAVersion9_0_C : FeatureSet<
@@ -1435,7 +1443,8 @@ def FeatureISAVersion9_4_Common : FeatureSet<
FeatureAtomicFMinFMaxF64GlobalInsts,
FeatureAtomicFMinFMaxF64FlatInsts,
FeatureAgentScopeFineGrainedRemoteMemoryAtomics,
-   FeatureMemoryAtomicFAddF32DenormalSupport
+   FeatureMemoryAtomicFAddF32DenormalSupport,
+   FeatureFlatBufferGlobalAtomicFaddF64Inst
]>;
 
 def FeatureISAVersion9_4_0 : FeatureSet<
@@ -1932,11 +1941,9 @@ def isGFX12Plus :
 def HasFlatAddressSpace : Predicate<"Subtarget->hasFlatAddressSpace()">,
   AssemblerPredicate<(all_of FeatureFlatAddressSpace)>;
 
-
-def HasBufferFlatGlobalAtomicsF64 : // FIXME: Rename to show it's only for fadd
-  Predicate<"Subtarget->hasBufferFlatGlobalAtomicsF64()">,
-  // FIXME: This is too coarse, and working around using pseudo's predicates 
on real instruction.
-  AssemblerPredicate<(any_of FeatureGFX90AInsts, FeatureGFX10Insts, 
FeatureSouthernIslands, FeatureSeaIslands)>;
+def HasFlatBufferGlobalAtomicFaddF64Inst :
+  Predicate<"Subtarget->hasFlatBufferGlobalAtomicFaddF64Inst()">,
+  AssemblerPredicate<(any_of FeatureFlatBufferGlobalAtomicFaddF64Inst)>;
 
 def HasAtomicFMinFMaxF32GlobalInsts :
   Predicate<"Subtarget->hasAtomicFMinFMaxF32GlobalInsts()">,
diff --git a/llvm/lib/Target/AMDGPU/BUFInstructions.td 
b/llvm/lib/Target/AMDGPU/BUFInstructions.td
index 3b8d94b744000..a904c8483dbf5 100644
--- a/llvm/lib/Target/AMDGPU/BUFInstructions.td
+++ b/llvm/lib/Target/AMDGPU/BUFInstructions.td
@@ -1312,14 +1312,16 @@ let SubtargetPredicate = isGFX90APlus in {
   }
 } // End SubtargetPredicate = isGFX90APlus
 
-let SubtargetPredicate = HasBufferFlatGlobalAtomicsF64 in {
+let SubtargetPredicate = HasFlatBufferGlobalAtomicFaddF64Inst in {
   defm BUFFER_ATOMIC_ADD_F64 : MUBUF_Pseudo_Atomics<"buffer_atomic_add_f64", 
VReg_64, f64>;
+} // End SubtargetPredicate = HasFlatBufferGlobalAtomicFaddF64Inst
 
+let SubtargetPredicate = HasAtomicFMinFMaxF64GlobalInsts in {
   // Note the names can be buffer_atomic_fmin_x2/buffer_atomic_fmax_x2
   // depending on some subtargets.
   defm BUFFER_ATOMIC_MIN_F64 : MUBUF_Pseudo_Atomics<"buffer_atomic_min_f64", 
VReg_64, f64>;
   defm BUFFER_ATOMIC_MAX_F64 : MUBUF_Pseudo_Atomics<"buffer_atomic_max_f64", 
VReg_64, f64>;
-} // End SubtargetPredicate = HasBufferFlatGlobalAtomicsF64
+}
 
 def BUFFER_INV : MUBUF_Invalidate<"buffer_inv"> {
   let SubtargetPredicate = isGFX940Plus;
@@ -1836,9 +1838,9 @@ let SubtargetPredicate = 
HasAtomicBufferGlobalPkAddF16Insts in {
   defm : SIBufferAtomicPat<"SIbuffer_atomic_fadd", v2f16, 
"BUFFER_ATOMIC_PK_ADD_F16", ["ret"]>;
 } // End SubtargetPredicate = HasAtomicBufferGlobalPkAddF16Insts
 
-let SubtargetPredicate = HasBufferFlatGlobalAtomicsF64 in {
+let SubtargetPredicate = HasFlatBufferGlobalAtomicFaddF64Inst in {
   defm : SIBufferAtomicPat<"SIbuffer_atomic_fadd", f64, 
"BUFFER_ATOMIC_ADD_F64">;
-} // End SubtargetPredicate = HasBufferFlatGlobalAtomicsF64
+} // End SubtargetPredicate = HasFlatBufferGlobalAtomicFaddF64Inst
 
 let SubtargetPredicate = HasAtomicFMinFMaxF64GlobalInsts in {
   defm : SIBufferAtomicPat<"SIbuffer_atomic_fmin", f64, 
"BUFFER_ATOMIC_MIN_F64">;
diff --git a/llvm/lib/Target/AMDGPU/FLATInstructions.td 
b/llvm/lib/Target/AMDGPU/FLATInstructions.td
index 4bf8f20269a15..16dc019ede810 100644
--- 

[llvm-branch-commits] [llvm] AMDGPU: Add subtarget feature for global atomic fadd denormal support (PR #96443)

2024-07-03 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/96443

>From dfefb503c35bb1744bffed759221d12f654c99d8 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Sun, 23 Jun 2024 16:44:08 +0200
Subject: [PATCH 1/3] AMDGPU: Add subtarget feature for global atomic fadd
 denormal support

Not sure what the behavior for gfx90a is. The SPG says it always flushes.
The instruction documentation says it does not.
---
 llvm/lib/Target/AMDGPU/AMDGPU.td  | 14 --
 llvm/lib/Target/AMDGPU/GCNSubtarget.h |  7 +++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 3f35db8883716..51c077598df74 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -788,6 +788,13 @@ def FeatureFlatAtomicFaddF32Inst
   "Has flat_atomic_add_f32 instruction"
 >;
 
+def FeatureMemoryAtomicFaddF32DenormalSupport
+  : SubtargetFeature<"memory-atomic-fadd-f32-denormal-support",
+  "HasAtomicMemoryAtomicFaddF32DenormalSupport",
+  "true",
+  "global/flat/buffer atomic fadd for float supports denormal handling"
+>;
+
 def FeatureAgentScopeFineGrainedRemoteMemoryAtomics
   : SubtargetFeature<"agent-scope-fine-grained-remote-memory-atomics",
   "HasAgentScopeFineGrainedRemoteMemoryAtomics",
@@ -1427,7 +1434,8 @@ def FeatureISAVersion9_4_Common : FeatureSet<
FeatureKernargPreload,
FeatureAtomicFMinFMaxF64GlobalInsts,
FeatureAtomicFMinFMaxF64FlatInsts,
-   FeatureAgentScopeFineGrainedRemoteMemoryAtomics
+   FeatureAgentScopeFineGrainedRemoteMemoryAtomics,
+   FeatureMemoryAtomicFaddF32DenormalSupport
]>;
 
 def FeatureISAVersion9_4_0 : FeatureSet<
@@ -1631,7 +1639,9 @@ def FeatureISAVersion12 : FeatureSet<
FeatureScalarDwordx3Loads,
FeatureDPPSrc1SGPR,
FeatureMaxHardClauseLength32,
-   Feature1_5xVGPRs]>;
+   Feature1_5xVGPRs,
+   FeatureMemoryAtomicFaddF32DenormalSupport]>;
+   ]>;
 
 def FeatureISAVersion12_Generic: FeatureSet<
   !listconcat(FeatureISAVersion12.Features,
diff --git a/llvm/lib/Target/AMDGPU/GCNSubtarget.h 
b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
index 9e2a316a9ed28..db0b2b67a0388 100644
--- a/llvm/lib/Target/AMDGPU/GCNSubtarget.h
+++ b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
@@ -167,6 +167,7 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
   bool HasAtomicFlatPkAdd16Insts = false;
   bool HasAtomicFaddRtnInsts = false;
   bool HasAtomicFaddNoRtnInsts = false;
+  bool HasAtomicMemoryAtomicFaddF32DenormalSupport = false;
   bool HasAtomicBufferGlobalPkAddF16NoRtnInsts = false;
   bool HasAtomicBufferGlobalPkAddF16Insts = false;
   bool HasAtomicCSubNoRtnInsts = false;
@@ -872,6 +873,12 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
 
   bool hasFlatAtomicFaddF32Inst() const { return HasFlatAtomicFaddF32Inst; }
 
+  /// \return true if the target's flat, global, and buffer atomic fadd for
+  /// float supports denormal handling.
+  bool hasMemoryAtomicFaddF32DenormalSupport() const {
+return HasAtomicMemoryAtomicFaddF32DenormalSupport;
+  }
+
   /// \return true if atomic operations targeting fine-grained memory work
   /// correctly at device scope, in allocations in host or peer PCIe device
   /// memory.

>From 09c73116a884c6de72f98fa859c7c56295f5b8eb Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Mon, 24 Jun 2024 12:10:37 +0200
Subject: [PATCH 2/3] Add to gfx11.

RDNA 3 manual says "Floating-point addition handles NAN/INF/denorm"
thought I'm not sure I trust it.
---
 llvm/lib/Target/AMDGPU/AMDGPU.td | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 51c077598df74..370992eb81ff3 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -1547,7 +1547,8 @@ def FeatureISAVersion11_Common : FeatureSet<
FeatureFlatAtomicFaddF32Inst,
FeatureImageInsts,
FeaturePackedTID,
-   FeatureVcmpxPermlaneHazard]>;
+   FeatureVcmpxPermlaneHazard,
+   FeatureMemoryAtomicFaddF32DenormalSupport]>;
 
 // There are few workarounds that need to be
 // added to all targets. This pessimizes codegen
@@ -1640,7 +1641,7 @@ def FeatureISAVersion12 : FeatureSet<
FeatureDPPSrc1SGPR,
FeatureMaxHardClauseLength32,
Feature1_5xVGPRs,
-   FeatureMemoryAtomicFaddF32DenormalSupport]>;
+   FeatureMemoryAtomicFaddF32DenormalSupport
]>;
 
 def FeatureISAVersion12_Generic: FeatureSet<

>From 9cf93c6ce502adf460d2432f29cc2aa3c0ccdd68 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Wed, 26 Jun 2024 11:30:51 +0200
Subject: [PATCH 3/3] Rename

---
 llvm/lib/Target/AMDGPU/AMDGPU.td  | 10 +-
 llvm/lib/Target/AMDGPU/GCNSubtarget.h |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 370992eb81ff3..bea233bfb27bd 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ 

[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung updated 
https://github.com/llvm/llvm-project/pull/95884

>From 34652b2eebc62218c50a23509ce99937385c30e6 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:42:00 -0700
Subject: [PATCH 1/8] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 73 --
 1 file changed, 56 insertions(+), 17 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 66cabc236f4b2..c9f6d88f0b13a 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -424,36 +424,75 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
 
   // Uses name similarity to match functions that were not matched by name.
   uint64_t MatchedWithDemangledName = 0;
-  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
-
-std::unordered_map NameToBinaryFunction;
-NameToBinaryFunction.reserve(BC.getBinaryFunctions().size());
 
-for (auto &[_, BF] : BC.getBinaryFunctions()) {
+  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
+auto DemangleName = [&](const char* String) {
   int Status = 0;
-  char *DemangledName = abi::__cxa_demangle(BF.getOneName().str().c_str(),
+  char *DemangledName = abi::__cxa_demangle(String,
 nullptr, nullptr, );
-  if (Status == 0)
-NameToBinaryFunction[std::string(DemangledName)] = 
+  return Status == 0 ? new std::string(DemangledName) : nullptr;
+};
+
+auto DeriveNameSpace = [&](std::string DemangledName) {
+  size_t LParen = std::string(DemangledName).find("(");
+  std::string FunctionName = std::string(DemangledName).substr(0, LParen);
+  size_t ScopeResolutionOperator = std::string(FunctionName).rfind("::");
+  return ScopeResolutionOperator == std::string::npos ? std::string("") : 
std::string(DemangledName).substr(0, ScopeResolutionOperator);
+};
+
+std::unordered_map> 
NamespaceToBFs;
+NamespaceToBFs.reserve(BC.getBinaryFunctions().size());
+
+for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+  std::string* DemangledName = 
DemangleName(BF->getOneName().str().c_str());
+  if (!DemangledName)
+continue;
+  std::string Namespace = DeriveNameSpace(*DemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
+NamespaceToBFs[Namespace] = {BF};
+  else
+It->second.push_back(BF);
 }
 
 for (auto YamlBF : YamlBP.Functions) {
   if (YamlBF.Used)
 continue;
-  int Status = 0;
-  char *DemangledName =
-  abi::__cxa_demangle(YamlBF.Name.c_str(), nullptr, nullptr, );
-  if (Status != 0)
+  std::string* YamlBFDemangledName = DemangleName(YamlBF.Name.c_str());
+  if (!YamlBFDemangledName)
 continue;
-  auto It = NameToBinaryFunction.find(DemangledName);
-  if (It == NameToBinaryFunction.end())
+  std::string Namespace = DeriveNameSpace(*YamlBFDemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
 continue;
-  BinaryFunction *BF = It->second;
-  matchProfileToFunction(YamlBF, *BF);
-  ++MatchedWithDemangledName;
+  std::vector BFs = It->second;
+
+  unsigned MinEditDistance = UINT_MAX;
+  BinaryFunction *ClosestNameBF = nullptr;
+
+  for (BinaryFunction *BF : BFs) {
+if (ProfiledFunctions.count(BF))
+  continue;
+std::string *BFDemangledName = 
DemangleName(BF->getOneName().str().c_str());
+if (!BFDemangledName)
+  continue;
+unsigned BFEditDistance = 
StringRef(*BFDemangledName).edit_distance(*YamlBFDemangledName);
+if (BFEditDistance < MinEditDistance) {
+  MinEditDistance = BFEditDistance;
+  ClosestNameBF = BF;
+}
+  }
+
+  if (ClosestNameBF &&
+MinEditDistance < opts::NameSimilarityFunctionMatchingThreshold) {
+matchProfileToFunction(YamlBF, *ClosestNameBF);
+++MatchedWithDemangledName;
+  }
 }
   }
 
+  outs() << MatchedWithDemangledName  << ": functions matched by name 
similarity\n";
+
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 if (!YamlBF.Used && opts::Verbosity >= 1)
   errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name

>From 2d23bbd6b9ce4f0786ae8ceb39b1b008b4ca9c4d Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:45:27 -0700
Subject: [PATCH 2/8] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index c9f6d88f0b13a..cf4a5393df8f4 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -491,8 +491,6 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
  

[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung updated 
https://github.com/llvm/llvm-project/pull/95884

>From 34652b2eebc62218c50a23509ce99937385c30e6 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:42:00 -0700
Subject: [PATCH 1/7] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 73 --
 1 file changed, 56 insertions(+), 17 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 66cabc236f4b2..c9f6d88f0b13a 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -424,36 +424,75 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
 
   // Uses name similarity to match functions that were not matched by name.
   uint64_t MatchedWithDemangledName = 0;
-  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
-
-std::unordered_map NameToBinaryFunction;
-NameToBinaryFunction.reserve(BC.getBinaryFunctions().size());
 
-for (auto &[_, BF] : BC.getBinaryFunctions()) {
+  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
+auto DemangleName = [&](const char* String) {
   int Status = 0;
-  char *DemangledName = abi::__cxa_demangle(BF.getOneName().str().c_str(),
+  char *DemangledName = abi::__cxa_demangle(String,
 nullptr, nullptr, );
-  if (Status == 0)
-NameToBinaryFunction[std::string(DemangledName)] = 
+  return Status == 0 ? new std::string(DemangledName) : nullptr;
+};
+
+auto DeriveNameSpace = [&](std::string DemangledName) {
+  size_t LParen = std::string(DemangledName).find("(");
+  std::string FunctionName = std::string(DemangledName).substr(0, LParen);
+  size_t ScopeResolutionOperator = std::string(FunctionName).rfind("::");
+  return ScopeResolutionOperator == std::string::npos ? std::string("") : 
std::string(DemangledName).substr(0, ScopeResolutionOperator);
+};
+
+std::unordered_map> 
NamespaceToBFs;
+NamespaceToBFs.reserve(BC.getBinaryFunctions().size());
+
+for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+  std::string* DemangledName = 
DemangleName(BF->getOneName().str().c_str());
+  if (!DemangledName)
+continue;
+  std::string Namespace = DeriveNameSpace(*DemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
+NamespaceToBFs[Namespace] = {BF};
+  else
+It->second.push_back(BF);
 }
 
 for (auto YamlBF : YamlBP.Functions) {
   if (YamlBF.Used)
 continue;
-  int Status = 0;
-  char *DemangledName =
-  abi::__cxa_demangle(YamlBF.Name.c_str(), nullptr, nullptr, );
-  if (Status != 0)
+  std::string* YamlBFDemangledName = DemangleName(YamlBF.Name.c_str());
+  if (!YamlBFDemangledName)
 continue;
-  auto It = NameToBinaryFunction.find(DemangledName);
-  if (It == NameToBinaryFunction.end())
+  std::string Namespace = DeriveNameSpace(*YamlBFDemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
 continue;
-  BinaryFunction *BF = It->second;
-  matchProfileToFunction(YamlBF, *BF);
-  ++MatchedWithDemangledName;
+  std::vector BFs = It->second;
+
+  unsigned MinEditDistance = UINT_MAX;
+  BinaryFunction *ClosestNameBF = nullptr;
+
+  for (BinaryFunction *BF : BFs) {
+if (ProfiledFunctions.count(BF))
+  continue;
+std::string *BFDemangledName = 
DemangleName(BF->getOneName().str().c_str());
+if (!BFDemangledName)
+  continue;
+unsigned BFEditDistance = 
StringRef(*BFDemangledName).edit_distance(*YamlBFDemangledName);
+if (BFEditDistance < MinEditDistance) {
+  MinEditDistance = BFEditDistance;
+  ClosestNameBF = BF;
+}
+  }
+
+  if (ClosestNameBF &&
+MinEditDistance < opts::NameSimilarityFunctionMatchingThreshold) {
+matchProfileToFunction(YamlBF, *ClosestNameBF);
+++MatchedWithDemangledName;
+  }
 }
   }
 
+  outs() << MatchedWithDemangledName  << ": functions matched by name 
similarity\n";
+
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 if (!YamlBF.Used && opts::Verbosity >= 1)
   errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name

>From 2d23bbd6b9ce4f0786ae8ceb39b1b008b4ca9c4d Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:45:27 -0700
Subject: [PATCH 2/7] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index c9f6d88f0b13a..cf4a5393df8f4 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -491,8 +491,6 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
  

[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung updated 
https://github.com/llvm/llvm-project/pull/95884

>From 34652b2eebc62218c50a23509ce99937385c30e6 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:42:00 -0700
Subject: [PATCH 1/7] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 73 --
 1 file changed, 56 insertions(+), 17 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 66cabc236f4b2..c9f6d88f0b13a 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -424,36 +424,75 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
 
   // Uses name similarity to match functions that were not matched by name.
   uint64_t MatchedWithDemangledName = 0;
-  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
-
-std::unordered_map NameToBinaryFunction;
-NameToBinaryFunction.reserve(BC.getBinaryFunctions().size());
 
-for (auto &[_, BF] : BC.getBinaryFunctions()) {
+  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
+auto DemangleName = [&](const char* String) {
   int Status = 0;
-  char *DemangledName = abi::__cxa_demangle(BF.getOneName().str().c_str(),
+  char *DemangledName = abi::__cxa_demangle(String,
 nullptr, nullptr, );
-  if (Status == 0)
-NameToBinaryFunction[std::string(DemangledName)] = 
+  return Status == 0 ? new std::string(DemangledName) : nullptr;
+};
+
+auto DeriveNameSpace = [&](std::string DemangledName) {
+  size_t LParen = std::string(DemangledName).find("(");
+  std::string FunctionName = std::string(DemangledName).substr(0, LParen);
+  size_t ScopeResolutionOperator = std::string(FunctionName).rfind("::");
+  return ScopeResolutionOperator == std::string::npos ? std::string("") : 
std::string(DemangledName).substr(0, ScopeResolutionOperator);
+};
+
+std::unordered_map> 
NamespaceToBFs;
+NamespaceToBFs.reserve(BC.getBinaryFunctions().size());
+
+for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+  std::string* DemangledName = 
DemangleName(BF->getOneName().str().c_str());
+  if (!DemangledName)
+continue;
+  std::string Namespace = DeriveNameSpace(*DemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
+NamespaceToBFs[Namespace] = {BF};
+  else
+It->second.push_back(BF);
 }
 
 for (auto YamlBF : YamlBP.Functions) {
   if (YamlBF.Used)
 continue;
-  int Status = 0;
-  char *DemangledName =
-  abi::__cxa_demangle(YamlBF.Name.c_str(), nullptr, nullptr, );
-  if (Status != 0)
+  std::string* YamlBFDemangledName = DemangleName(YamlBF.Name.c_str());
+  if (!YamlBFDemangledName)
 continue;
-  auto It = NameToBinaryFunction.find(DemangledName);
-  if (It == NameToBinaryFunction.end())
+  std::string Namespace = DeriveNameSpace(*YamlBFDemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
 continue;
-  BinaryFunction *BF = It->second;
-  matchProfileToFunction(YamlBF, *BF);
-  ++MatchedWithDemangledName;
+  std::vector BFs = It->second;
+
+  unsigned MinEditDistance = UINT_MAX;
+  BinaryFunction *ClosestNameBF = nullptr;
+
+  for (BinaryFunction *BF : BFs) {
+if (ProfiledFunctions.count(BF))
+  continue;
+std::string *BFDemangledName = 
DemangleName(BF->getOneName().str().c_str());
+if (!BFDemangledName)
+  continue;
+unsigned BFEditDistance = 
StringRef(*BFDemangledName).edit_distance(*YamlBFDemangledName);
+if (BFEditDistance < MinEditDistance) {
+  MinEditDistance = BFEditDistance;
+  ClosestNameBF = BF;
+}
+  }
+
+  if (ClosestNameBF &&
+MinEditDistance < opts::NameSimilarityFunctionMatchingThreshold) {
+matchProfileToFunction(YamlBF, *ClosestNameBF);
+++MatchedWithDemangledName;
+  }
 }
   }
 
+  outs() << MatchedWithDemangledName  << ": functions matched by name 
similarity\n";
+
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 if (!YamlBF.Used && opts::Verbosity >= 1)
   errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name

>From 2d23bbd6b9ce4f0786ae8ceb39b1b008b4ca9c4d Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:45:27 -0700
Subject: [PATCH 2/7] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index c9f6d88f0b13a..cf4a5393df8f4 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -491,8 +491,6 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
  

[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung updated 
https://github.com/llvm/llvm-project/pull/95884

>From 34652b2eebc62218c50a23509ce99937385c30e6 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:42:00 -0700
Subject: [PATCH 1/7] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 73 --
 1 file changed, 56 insertions(+), 17 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 66cabc236f4b2..c9f6d88f0b13a 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -424,36 +424,75 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
 
   // Uses name similarity to match functions that were not matched by name.
   uint64_t MatchedWithDemangledName = 0;
-  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
-
-std::unordered_map NameToBinaryFunction;
-NameToBinaryFunction.reserve(BC.getBinaryFunctions().size());
 
-for (auto &[_, BF] : BC.getBinaryFunctions()) {
+  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
+auto DemangleName = [&](const char* String) {
   int Status = 0;
-  char *DemangledName = abi::__cxa_demangle(BF.getOneName().str().c_str(),
+  char *DemangledName = abi::__cxa_demangle(String,
 nullptr, nullptr, );
-  if (Status == 0)
-NameToBinaryFunction[std::string(DemangledName)] = 
+  return Status == 0 ? new std::string(DemangledName) : nullptr;
+};
+
+auto DeriveNameSpace = [&](std::string DemangledName) {
+  size_t LParen = std::string(DemangledName).find("(");
+  std::string FunctionName = std::string(DemangledName).substr(0, LParen);
+  size_t ScopeResolutionOperator = std::string(FunctionName).rfind("::");
+  return ScopeResolutionOperator == std::string::npos ? std::string("") : 
std::string(DemangledName).substr(0, ScopeResolutionOperator);
+};
+
+std::unordered_map> 
NamespaceToBFs;
+NamespaceToBFs.reserve(BC.getBinaryFunctions().size());
+
+for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+  std::string* DemangledName = 
DemangleName(BF->getOneName().str().c_str());
+  if (!DemangledName)
+continue;
+  std::string Namespace = DeriveNameSpace(*DemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
+NamespaceToBFs[Namespace] = {BF};
+  else
+It->second.push_back(BF);
 }
 
 for (auto YamlBF : YamlBP.Functions) {
   if (YamlBF.Used)
 continue;
-  int Status = 0;
-  char *DemangledName =
-  abi::__cxa_demangle(YamlBF.Name.c_str(), nullptr, nullptr, );
-  if (Status != 0)
+  std::string* YamlBFDemangledName = DemangleName(YamlBF.Name.c_str());
+  if (!YamlBFDemangledName)
 continue;
-  auto It = NameToBinaryFunction.find(DemangledName);
-  if (It == NameToBinaryFunction.end())
+  std::string Namespace = DeriveNameSpace(*YamlBFDemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
 continue;
-  BinaryFunction *BF = It->second;
-  matchProfileToFunction(YamlBF, *BF);
-  ++MatchedWithDemangledName;
+  std::vector BFs = It->second;
+
+  unsigned MinEditDistance = UINT_MAX;
+  BinaryFunction *ClosestNameBF = nullptr;
+
+  for (BinaryFunction *BF : BFs) {
+if (ProfiledFunctions.count(BF))
+  continue;
+std::string *BFDemangledName = 
DemangleName(BF->getOneName().str().c_str());
+if (!BFDemangledName)
+  continue;
+unsigned BFEditDistance = 
StringRef(*BFDemangledName).edit_distance(*YamlBFDemangledName);
+if (BFEditDistance < MinEditDistance) {
+  MinEditDistance = BFEditDistance;
+  ClosestNameBF = BF;
+}
+  }
+
+  if (ClosestNameBF &&
+MinEditDistance < opts::NameSimilarityFunctionMatchingThreshold) {
+matchProfileToFunction(YamlBF, *ClosestNameBF);
+++MatchedWithDemangledName;
+  }
 }
   }
 
+  outs() << MatchedWithDemangledName  << ": functions matched by name 
similarity\n";
+
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 if (!YamlBF.Used && opts::Verbosity >= 1)
   errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name

>From 2d23bbd6b9ce4f0786ae8ceb39b1b008b4ca9c4d Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:45:27 -0700
Subject: [PATCH 2/7] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index c9f6d88f0b13a..cf4a5393df8f4 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -491,8 +491,6 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
  

[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung updated 
https://github.com/llvm/llvm-project/pull/95884

>From 34652b2eebc62218c50a23509ce99937385c30e6 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:42:00 -0700
Subject: [PATCH 1/7] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 73 --
 1 file changed, 56 insertions(+), 17 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 66cabc236f4b2..c9f6d88f0b13a 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -424,36 +424,75 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
 
   // Uses name similarity to match functions that were not matched by name.
   uint64_t MatchedWithDemangledName = 0;
-  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
-
-std::unordered_map NameToBinaryFunction;
-NameToBinaryFunction.reserve(BC.getBinaryFunctions().size());
 
-for (auto &[_, BF] : BC.getBinaryFunctions()) {
+  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
+auto DemangleName = [&](const char* String) {
   int Status = 0;
-  char *DemangledName = abi::__cxa_demangle(BF.getOneName().str().c_str(),
+  char *DemangledName = abi::__cxa_demangle(String,
 nullptr, nullptr, );
-  if (Status == 0)
-NameToBinaryFunction[std::string(DemangledName)] = 
+  return Status == 0 ? new std::string(DemangledName) : nullptr;
+};
+
+auto DeriveNameSpace = [&](std::string DemangledName) {
+  size_t LParen = std::string(DemangledName).find("(");
+  std::string FunctionName = std::string(DemangledName).substr(0, LParen);
+  size_t ScopeResolutionOperator = std::string(FunctionName).rfind("::");
+  return ScopeResolutionOperator == std::string::npos ? std::string("") : 
std::string(DemangledName).substr(0, ScopeResolutionOperator);
+};
+
+std::unordered_map> 
NamespaceToBFs;
+NamespaceToBFs.reserve(BC.getBinaryFunctions().size());
+
+for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+  std::string* DemangledName = 
DemangleName(BF->getOneName().str().c_str());
+  if (!DemangledName)
+continue;
+  std::string Namespace = DeriveNameSpace(*DemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
+NamespaceToBFs[Namespace] = {BF};
+  else
+It->second.push_back(BF);
 }
 
 for (auto YamlBF : YamlBP.Functions) {
   if (YamlBF.Used)
 continue;
-  int Status = 0;
-  char *DemangledName =
-  abi::__cxa_demangle(YamlBF.Name.c_str(), nullptr, nullptr, );
-  if (Status != 0)
+  std::string* YamlBFDemangledName = DemangleName(YamlBF.Name.c_str());
+  if (!YamlBFDemangledName)
 continue;
-  auto It = NameToBinaryFunction.find(DemangledName);
-  if (It == NameToBinaryFunction.end())
+  std::string Namespace = DeriveNameSpace(*YamlBFDemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
 continue;
-  BinaryFunction *BF = It->second;
-  matchProfileToFunction(YamlBF, *BF);
-  ++MatchedWithDemangledName;
+  std::vector BFs = It->second;
+
+  unsigned MinEditDistance = UINT_MAX;
+  BinaryFunction *ClosestNameBF = nullptr;
+
+  for (BinaryFunction *BF : BFs) {
+if (ProfiledFunctions.count(BF))
+  continue;
+std::string *BFDemangledName = 
DemangleName(BF->getOneName().str().c_str());
+if (!BFDemangledName)
+  continue;
+unsigned BFEditDistance = 
StringRef(*BFDemangledName).edit_distance(*YamlBFDemangledName);
+if (BFEditDistance < MinEditDistance) {
+  MinEditDistance = BFEditDistance;
+  ClosestNameBF = BF;
+}
+  }
+
+  if (ClosestNameBF &&
+MinEditDistance < opts::NameSimilarityFunctionMatchingThreshold) {
+matchProfileToFunction(YamlBF, *ClosestNameBF);
+++MatchedWithDemangledName;
+  }
 }
   }
 
+  outs() << MatchedWithDemangledName  << ": functions matched by name 
similarity\n";
+
   for (yaml::bolt::BinaryFunctionProfile  : YamlBP.Functions)
 if (!YamlBF.Used && opts::Verbosity >= 1)
   errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name

>From 2d23bbd6b9ce4f0786ae8ceb39b1b008b4ca9c4d Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Thu, 20 Jun 2024 23:45:27 -0700
Subject: [PATCH 2/7] spr amend

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index c9f6d88f0b13a..cf4a5393df8f4 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -491,8 +491,6 @@ Error YAMLProfileReader::readProfile(BinaryContext ) {
  

[llvm-branch-commits] [llvm] [AArch64] Only create called thunks when hardening against SLS (PR #97472)

2024-07-03 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -36,38 +32,43 @@ using namespace llvm;
 
 #define AARCH64_SLS_HARDENING_NAME "AArch64 sls hardening pass"
 
+static const char SLSBLRNamePrefix[] = "__llvm_slsblr_thunk_";
+
 namespace {
 
-class AArch64SLSHardening : public MachineFunctionPass {
-public:
-  const TargetInstrInfo *TII;
-  const TargetRegisterInfo *TRI;
-  const AArch64Subtarget *ST;
+// Set of inserted thunks: bitmask with bits corresponding to
+// indexes in SLSBLRThunks array.
+typedef uint32_t ThunksSet;

atrosinenko wrote:

Here is the PR: #97605

https://github.com/llvm/llvm-project/pull/97472
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung updated 
https://github.com/llvm/llvm-project/pull/97502

>From c6212e4b26b0f0d8abde323fa5fc04ecc6dd34fd Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Wed, 3 Jul 2024 09:45:46 -0700
Subject: [PATCH] Changed profileMatches comment

Created using spr 1.3.4
---
 bolt/include/bolt/Profile/YAMLProfileReader.h | 2 +-
 bolt/lib/Profile/YAMLProfileReader.cpp| 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/bolt/include/bolt/Profile/YAMLProfileReader.h 
b/bolt/include/bolt/Profile/YAMLProfileReader.h
index a5bd3544bd999..627cebf5d9453 100644
--- a/bolt/include/bolt/Profile/YAMLProfileReader.h
+++ b/bolt/include/bolt/Profile/YAMLProfileReader.h
@@ -73,7 +73,7 @@ class YAMLProfileReader : public ProfileReaderBase {
   bool parseFunctionProfile(BinaryFunction ,
 const yaml::bolt::BinaryFunctionProfile );
 
-  /// Returns block cnt equality if IgnoreHash is true, otherwise, hash 
equality
+  /// Checks if a function profile matches a binary function.
   bool profileMatches(const yaml::bolt::BinaryFunctionProfile ,
   BinaryFunction );
 
diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index e8ce187367899..91628d950e9f4 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -333,6 +333,7 @@ Error YAMLProfileReader::preprocessProfile(BinaryContext 
) {
 
   return Error::success();
 }
+
 bool YAMLProfileReader::profileMatches(
 const yaml::bolt::BinaryFunctionProfile , BinaryFunction ) {
   if (opts::IgnoreHash)

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-03 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-aarch64

Author: Anatoly Trosinenko (atrosinenko)


Changes

Make SLS Hardening pass handle BLRA* instructions the same way it handles BLR. 
The thunk names have the form

__llvm_slsblr_thunk_xNfor BLR thunks
__llvm_slsblr_thunk_(aaz|abz)_xN  for BLRAAZ and BLRABZ thunks
__llvm_slsblr_thunk_(aa|ab)_xN_xM for BLRAA and BLRAB thunks

Now there are about 1800 possible thunk names, so do not rely on linear thunk 
function's name lookup and parse the name instead.

---

Patch is 23.27 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/97605.diff


2 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64SLSHardening.cpp (+222-104) 
- (added) llvm/test/CodeGen/AArch64/speculation-hardening-sls-blra.mir (+210) 


``diff
diff --git a/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp 
b/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
index feb166f30127a..d93fe2a875845 100644
--- a/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
+++ b/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
@@ -13,6 +13,7 @@
 
 #include "AArch64InstrInfo.h"
 #include "AArch64Subtarget.h"
+#include "llvm/ADT/StringSwitch.h"
 #include "llvm/CodeGen/IndirectThunks.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/CodeGen/MachineFunction.h"
@@ -23,6 +24,7 @@
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Target/TargetMachine.h"
 #include 
 
@@ -32,17 +34,103 @@ using namespace llvm;
 
 #define AARCH64_SLS_HARDENING_NAME "AArch64 sls hardening pass"
 
-static const char SLSBLRNamePrefix[] = "__llvm_slsblr_thunk_";
+// Common name prefix of all thunks generated by this pass.
+//
+// The generic form is
+// __llvm_slsblr_thunk_xNfor BLR thunks
+// __llvm_slsblr_thunk_(aaz|abz)_xN  for BLRAAZ and BLRABZ thunks
+// __llvm_slsblr_thunk_(aa|ab)_xN_xM for BLRAA and BLRAB thunks
+static constexpr StringRef CommonNamePrefix = "__llvm_slsblr_thunk_";
 
 namespace {
 
-// Set of inserted thunks: bitmask with bits corresponding to
-// indexes in SLSBLRThunks array.
-typedef uint32_t ThunksSet;
+struct ThunkKind {
+  enum ThunkKindId {
+ThunkBR,
+ThunkBRAA,
+ThunkBRAB,
+ThunkBRAAZ,
+ThunkBRABZ,
+  };
+
+  ThunkKindId Id;
+  StringRef NameInfix;
+  bool HasXmOperand;
+  bool NeedsPAuth;
+
+  // Opcode to perform indirect jump from inside the thunk.
+  unsigned BROpcode;
+
+  static const ThunkKind BR;
+  static const ThunkKind BRAA;
+  static const ThunkKind BRAB;
+  static const ThunkKind BRAAZ;
+  static const ThunkKind BRABZ;
+};
+
+// Set of inserted thunks.
+class ThunksSet {
+public:
+  static constexpr unsigned NumXRegisters = 32;
+
+  // Given Xn register, returns n.
+  static unsigned indexOfXReg(Register Xn);
+  // Given n, returns Xn register.
+  static Register xRegByIndex(unsigned N);
+
+  ThunksSet |=(const ThunksSet ) {
+BLRThunks |= Other.BLRThunks;
+BLRAAZThunks |= Other.BLRAAZThunks;
+BLRABZThunks |= Other.BLRABZThunks;
+for (unsigned I = 0; I < NumXRegisters; ++I)
+  BLRAAThunks[I] |= Other.BLRAAThunks[I];
+for (unsigned I = 0; I < NumXRegisters; ++I)
+  BLRABThunks[I] |= Other.BLRABThunks[I];
+
+return *this;
+  }
+
+  bool get(ThunkKind::ThunkKindId Kind, Register Xn, Register Xm) {
+uint32_t XnBit = 1u << indexOfXReg(Xn);
+return getBitmask(Kind, Xm) & XnBit;
+  }
+
+  void set(ThunkKind::ThunkKindId Kind, Register Xn, Register Xm) {
+uint32_t XnBit = 1u << indexOfXReg(Xn);
+getBitmask(Kind, Xm) |= XnBit;
+  }
+
+private:
+  // Bitmasks representing operands used, with n-th bit corresponding to Xn
+  // register operand. If the instruction has a second operand (Xm), an array
+  // of bitmasks is used, indexed by m.
+  // Indexes corresponding to the forbidden x16, x17 and x30 registers are
+  // always unset, for simplicity there are no holes.
+  uint32_t BLRThunks = 0;
+  uint32_t BLRAAZThunks = 0;
+  uint32_t BLRABZThunks = 0;
+  uint32_t BLRAAThunks[NumXRegisters] = {};
+  uint32_t BLRABThunks[NumXRegisters] = {};
+
+  uint32_t (ThunkKind::ThunkKindId Kind, Register Xm) {
+switch (Kind) {
+case ThunkKind::ThunkBR:
+  return BLRThunks;
+case ThunkKind::ThunkBRAAZ:
+  return BLRAAZThunks;
+case ThunkKind::ThunkBRABZ:
+  return BLRABZThunks;
+case ThunkKind::ThunkBRAA:
+  return BLRAAThunks[indexOfXReg(Xm)];
+case ThunkKind::ThunkBRAB:
+  return BLRABThunks[indexOfXReg(Xm)];
+}
+  }
+};
 
 struct SLSHardeningInserter : ThunkInserter {
 public:
-  const char *getThunkPrefix() { return SLSBLRNamePrefix; }
+  const char *getThunkPrefix() { return CommonNamePrefix.data(); }
   bool mayUseThunk(const MachineFunction ) {
 // FIXME: ComdatThunks is only accumulated until the first thunk is 
created.
 ComdatThunks &= !MF.getSubtarget().hardenSlsNoComdat();
@@ -69,6 

[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-03 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko created 
https://github.com/llvm/llvm-project/pull/97605

Make SLS Hardening pass handle BLRA* instructions the same way it handles BLR. 
The thunk names have the form

__llvm_slsblr_thunk_xNfor BLR thunks
__llvm_slsblr_thunk_(aaz|abz)_xN  for BLRAAZ and BLRABZ thunks
__llvm_slsblr_thunk_(aa|ab)_xN_xM for BLRAA and BLRAB thunks

Now there are about 1800 possible thunk names, so do not rely on linear thunk 
function's name lookup and parse the name instead.

>From b389284b8e92f5bf09cea38f3f9a53974a84dc29 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Mon, 1 Jul 2024 20:13:54 +0300
Subject: [PATCH] [AArch64][PAC] Support BLRA* instructions in SLS Hardening
 pass

Make SLS Hardening pass handle BLRA* instructions the same way it
handles BLR. The thunk names have the form

__llvm_slsblr_thunk_xNfor BLR thunks
__llvm_slsblr_thunk_(aaz|abz)_xN  for BLRAAZ and BLRABZ thunks
__llvm_slsblr_thunk_(aa|ab)_xN_xM for BLRAA and BLRAB thunks

Now there are about 1800 possible thunk names, so do not rely on linear
thunk function's name lookup and parse the name instead.
---
 .../Target/AArch64/AArch64SLSHardening.cpp| 326 --
 .../speculation-hardening-sls-blra.mir| 210 +++
 2 files changed, 432 insertions(+), 104 deletions(-)
 create mode 100644 llvm/test/CodeGen/AArch64/speculation-hardening-sls-blra.mir

diff --git a/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp 
b/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
index feb166f30127a..d93fe2a875845 100644
--- a/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
+++ b/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
@@ -13,6 +13,7 @@
 
 #include "AArch64InstrInfo.h"
 #include "AArch64Subtarget.h"
+#include "llvm/ADT/StringSwitch.h"
 #include "llvm/CodeGen/IndirectThunks.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/CodeGen/MachineFunction.h"
@@ -23,6 +24,7 @@
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/Target/TargetMachine.h"
 #include 
 
@@ -32,17 +34,103 @@ using namespace llvm;
 
 #define AARCH64_SLS_HARDENING_NAME "AArch64 sls hardening pass"
 
-static const char SLSBLRNamePrefix[] = "__llvm_slsblr_thunk_";
+// Common name prefix of all thunks generated by this pass.
+//
+// The generic form is
+// __llvm_slsblr_thunk_xNfor BLR thunks
+// __llvm_slsblr_thunk_(aaz|abz)_xN  for BLRAAZ and BLRABZ thunks
+// __llvm_slsblr_thunk_(aa|ab)_xN_xM for BLRAA and BLRAB thunks
+static constexpr StringRef CommonNamePrefix = "__llvm_slsblr_thunk_";
 
 namespace {
 
-// Set of inserted thunks: bitmask with bits corresponding to
-// indexes in SLSBLRThunks array.
-typedef uint32_t ThunksSet;
+struct ThunkKind {
+  enum ThunkKindId {
+ThunkBR,
+ThunkBRAA,
+ThunkBRAB,
+ThunkBRAAZ,
+ThunkBRABZ,
+  };
+
+  ThunkKindId Id;
+  StringRef NameInfix;
+  bool HasXmOperand;
+  bool NeedsPAuth;
+
+  // Opcode to perform indirect jump from inside the thunk.
+  unsigned BROpcode;
+
+  static const ThunkKind BR;
+  static const ThunkKind BRAA;
+  static const ThunkKind BRAB;
+  static const ThunkKind BRAAZ;
+  static const ThunkKind BRABZ;
+};
+
+// Set of inserted thunks.
+class ThunksSet {
+public:
+  static constexpr unsigned NumXRegisters = 32;
+
+  // Given Xn register, returns n.
+  static unsigned indexOfXReg(Register Xn);
+  // Given n, returns Xn register.
+  static Register xRegByIndex(unsigned N);
+
+  ThunksSet |=(const ThunksSet ) {
+BLRThunks |= Other.BLRThunks;
+BLRAAZThunks |= Other.BLRAAZThunks;
+BLRABZThunks |= Other.BLRABZThunks;
+for (unsigned I = 0; I < NumXRegisters; ++I)
+  BLRAAThunks[I] |= Other.BLRAAThunks[I];
+for (unsigned I = 0; I < NumXRegisters; ++I)
+  BLRABThunks[I] |= Other.BLRABThunks[I];
+
+return *this;
+  }
+
+  bool get(ThunkKind::ThunkKindId Kind, Register Xn, Register Xm) {
+uint32_t XnBit = 1u << indexOfXReg(Xn);
+return getBitmask(Kind, Xm) & XnBit;
+  }
+
+  void set(ThunkKind::ThunkKindId Kind, Register Xn, Register Xm) {
+uint32_t XnBit = 1u << indexOfXReg(Xn);
+getBitmask(Kind, Xm) |= XnBit;
+  }
+
+private:
+  // Bitmasks representing operands used, with n-th bit corresponding to Xn
+  // register operand. If the instruction has a second operand (Xm), an array
+  // of bitmasks is used, indexed by m.
+  // Indexes corresponding to the forbidden x16, x17 and x30 registers are
+  // always unset, for simplicity there are no holes.
+  uint32_t BLRThunks = 0;
+  uint32_t BLRAAZThunks = 0;
+  uint32_t BLRABZThunks = 0;
+  uint32_t BLRAAThunks[NumXRegisters] = {};
+  uint32_t BLRABThunks[NumXRegisters] = {};
+
+  uint32_t (ThunkKind::ThunkKindId Kind, Register Xm) {
+switch (Kind) {
+case ThunkKind::ThunkBR:
+  return BLRThunks;
+case ThunkKind::ThunkBRAAZ:
+  return BLRAAZThunks;
+case ThunkKind::ThunkBRABZ:

[llvm-branch-commits] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung edited 
https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Shaw Young via llvm-branch-commits


@@ -73,13 +73,26 @@ class YAMLProfileReader : public ProfileReaderBase {
   bool parseFunctionProfile(BinaryFunction ,
 const yaml::bolt::BinaryFunctionProfile );
 
+  /// Returns block cnt equality if IgnoreHash is true, otherwise, hash 
equality
+  bool profileMatches(const yaml::bolt::BinaryFunctionProfile ,
+  BinaryFunction );
+
   /// Infer function profile from stale data (collected on older binaries).
   bool inferStaleProfile(BinaryFunction ,
  const yaml::bolt::BinaryFunctionProfile );
 
   /// Initialize maps for profile matching.
   void buildNameMaps(BinaryContext );
 
+  /// Matches functions using exact name.
+  size_t matchWithExactName();

shawbyoung wrote:

I'm moving the different matching techniques into separate functions because 
it'll be easier to understand and prevent the YAMLProfileReader::readProfile 
function from getting abhorrently large as I'll be adding call graph function 
matching to it in a subsequent PR. I'll add this explanation to the 
description. 

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov approved this pull request.

LG with a couple of nits.

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction ) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext ) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler Demangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string ) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string ) {
+if (Demangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();

aaupov wrote:

```suggestion
size_t BufferSize;
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction ) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext ) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler Demangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string ) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string ) {
+if (Demangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();
+char *NameSpace =
+Demangler.getFunctionDeclContextName([0], );
+return std::string(NameSpace, BufferSize);
+  };
+
+  // Maps namespaces to associated function block counts and gets profile
+  // function names and namespaces to minimize the number of BFs to process and
+  // avoid repeated name demangling/namespace derivation.
+  StringMap> NamespaceToProfiledBFSizes;
+  std::vector ProfileBFDemangledNames;
+  ProfileBFDemangledNames.reserve(YamlBP.Functions.size());
+  std::vector ProfiledBFNamespaces;
+  ProfiledBFNamespaces.reserve(YamlBP.Functions.size());
+
+  for (auto  : YamlBP.Functions) {
+std::string YamlBFDemangledName = DemangleName(YamlBF.Name);
+ProfileBFDemangledNames.push_back(YamlBFDemangledName);
+std::string YamlBFNamespace = DeriveNameSpace(YamlBFDemangledName);
+ProfiledBFNamespaces.push_back(YamlBFNamespace);
+NamespaceToProfiledBFSizes[YamlBFNamespace].insert(YamlBF.NumBasicBlocks);
+  }
+
+  StringMap> NamespaceToBFs;
+
+  // Maps namespaces to BFs excluding binary functions with no equal sized
+  // profiled functions belonging to the same namespace.
+  for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+std::string DemangledName = BF->getDemangledName();
+std::string Namespace = DeriveNameSpace(DemangledName);
+
+auto NamespaceToProfiledBFSizesIt =
+NamespaceToProfiledBFSizes.find(Namespace);
+if (NamespaceToProfiledBFSizesIt == NamespaceToProfiledBFSizes.end())

aaupov wrote:

```suggestion
// Skip if there are no ProfileBFs with a given \p Namespace.
if (NamespaceToProfiledBFSizesIt == NamespaceToProfiledBFSizes.end())
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction ) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext ) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler Demangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string ) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string ) {
+if (Demangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();
+char *NameSpace =
+Demangler.getFunctionDeclContextName([0], );
+return std::string(NameSpace, BufferSize);
+  };
+
+  // Maps namespaces to associated function block counts and gets profile
+  // function names and namespaces to minimize the number of BFs to process and
+  // avoid repeated name demangling/namespace derivation.
+  StringMap> NamespaceToProfiledBFSizes;
+  std::vector ProfileBFDemangledNames;
+  ProfileBFDemangledNames.reserve(YamlBP.Functions.size());
+  std::vector ProfiledBFNamespaces;
+  ProfiledBFNamespaces.reserve(YamlBP.Functions.size());
+
+  for (auto  : YamlBP.Functions) {
+std::string YamlBFDemangledName = DemangleName(YamlBF.Name);
+ProfileBFDemangledNames.push_back(YamlBFDemangledName);
+std::string YamlBFNamespace = DeriveNameSpace(YamlBFDemangledName);
+ProfiledBFNamespaces.push_back(YamlBFNamespace);
+NamespaceToProfiledBFSizes[YamlBFNamespace].insert(YamlBF.NumBasicBlocks);
+  }
+
+  StringMap> NamespaceToBFs;
+
+  // Maps namespaces to BFs excluding binary functions with no equal sized
+  // profiled functions belonging to the same namespace.
+  for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+std::string DemangledName = BF->getDemangledName();
+std::string Namespace = DeriveNameSpace(DemangledName);
+
+auto NamespaceToProfiledBFSizesIt =
+NamespaceToProfiledBFSizes.find(Namespace);
+if (NamespaceToProfiledBFSizesIt == NamespaceToProfiledBFSizes.end())
+  continue;
+if (NamespaceToProfiledBFSizesIt->second.count(BF->size()) == 0)

aaupov wrote:

```suggestion
// Skip if there are no ProfileBFs in a given \p Namespace with
// equal number of blocks.
if (NamespaceToProfiledBFSizesIt->second.count(BF->size()) == 0)
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction ) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext ) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler Demangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string ) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string ) {
+if (Demangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();
+char *NameSpace =
+Demangler.getFunctionDeclContextName([0], );
+return std::string(NameSpace, BufferSize);
+  };
+
+  // Maps namespaces to associated function block counts and gets profile
+  // function names and namespaces to minimize the number of BFs to process and
+  // avoid repeated name demangling/namespace derivation.
+  StringMap> NamespaceToProfiledBFSizes;
+  std::vector ProfileBFDemangledNames;
+  ProfileBFDemangledNames.reserve(YamlBP.Functions.size());
+  std::vector ProfiledBFNamespaces;
+  ProfiledBFNamespaces.reserve(YamlBP.Functions.size());
+
+  for (auto  : YamlBP.Functions) {
+std::string YamlBFDemangledName = DemangleName(YamlBF.Name);
+ProfileBFDemangledNames.push_back(YamlBFDemangledName);
+std::string YamlBFNamespace = DeriveNameSpace(YamlBFDemangledName);
+ProfiledBFNamespaces.push_back(YamlBFNamespace);
+NamespaceToProfiledBFSizes[YamlBFNamespace].insert(YamlBF.NumBasicBlocks);
+  }
+
+  StringMap> NamespaceToBFs;
+
+  // Maps namespaces to BFs excluding binary functions with no equal sized
+  // profiled functions belonging to the same namespace.
+  for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+std::string DemangledName = BF->getDemangledName();
+std::string Namespace = DeriveNameSpace(DemangledName);
+
+auto NamespaceToProfiledBFSizesIt =
+NamespaceToProfiledBFSizes.find(Namespace);
+if (NamespaceToProfiledBFSizesIt == NamespaceToProfiledBFSizes.end())
+  continue;
+if (NamespaceToProfiledBFSizesIt->second.count(BF->size()) == 0)
+  continue;
+auto NamespaceToBFsIt = NamespaceToBFs.find(Namespace);
+if (NamespaceToBFsIt == NamespaceToBFs.end())
+  NamespaceToBFs[Namespace] = {BF};
+else
+  NamespaceToBFsIt->second.push_back(BF);
+  }
+
+  // Iterates through all profiled functions and binary functions belonging to
+  // the same namespace and matches based on edit distance thresehold.

aaupov wrote:

```suggestion
  // the same namespace and matches based on edit distance threshold.
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction ) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext ) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler Demangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string ) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string ) {
+if (Demangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();
+char *NameSpace =
+Demangler.getFunctionDeclContextName([0], );
+return std::string(NameSpace, BufferSize);
+  };
+
+  // Maps namespaces to associated function block counts and gets profile
+  // function names and namespaces to minimize the number of BFs to process and
+  // avoid repeated name demangling/namespace derivation.
+  StringMap> NamespaceToProfiledBFSizes;
+  std::vector ProfileBFDemangledNames;
+  ProfileBFDemangledNames.reserve(YamlBP.Functions.size());
+  std::vector ProfiledBFNamespaces;
+  ProfiledBFNamespaces.reserve(YamlBP.Functions.size());
+
+  for (auto  : YamlBP.Functions) {
+std::string YamlBFDemangledName = DemangleName(YamlBF.Name);
+ProfileBFDemangledNames.push_back(YamlBFDemangledName);
+std::string YamlBFNamespace = DeriveNameSpace(YamlBFDemangledName);
+ProfiledBFNamespaces.push_back(YamlBFNamespace);
+NamespaceToProfiledBFSizes[YamlBFNamespace].insert(YamlBF.NumBasicBlocks);
+  }
+
+  StringMap> NamespaceToBFs;
+
+  // Maps namespaces to BFs excluding binary functions with no equal sized
+  // profiled functions belonging to the same namespace.
+  for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+std::string DemangledName = BF->getDemangledName();
+std::string Namespace = DeriveNameSpace(DemangledName);
+
+auto NamespaceToProfiledBFSizesIt =
+NamespaceToProfiledBFSizes.find(Namespace);
+if (NamespaceToProfiledBFSizesIt == NamespaceToProfiledBFSizes.end())
+  continue;
+if (NamespaceToProfiledBFSizesIt->second.count(BF->size()) == 0)
+  continue;
+auto NamespaceToBFsIt = NamespaceToBFs.find(Namespace);
+if (NamespaceToBFsIt == NamespaceToBFs.end())
+  NamespaceToBFs[Namespace] = {BF};
+else
+  NamespaceToBFsIt->second.push_back(BF);
+  }
+
+  // Iterates through all profiled functions and binary functions belonging to
+  // the same namespace and matches based on edit distance thresehold.
+  assert(YamlBP.Functions.size() == ProfiledBFNamespaces.size() &&
+ ProfiledBFNamespaces.size() == ProfileBFDemangledNames.size());
+  for (size_t I = 0; I < YamlBP.Functions.size(); ++I) {
+yaml::bolt::BinaryFunctionProfile  = YamlBP.Functions[I];
+std::string  = ProfiledBFNamespaces[I];
+if (YamlBF.Used)
+  continue;
+auto It = NamespaceToBFs.find(YamlBFNamespace);

aaupov wrote:

```suggestion
// Skip if there are no BFs in a given \p Namespace.
auto It = NamespaceToBFs.find(YamlBFNamespace);
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Davide Italiano via llvm-branch-commits

dcci wrote:

> I have a couple of general comments about this. Can you also please add a 
> description explaining what this patch does?

i.e. why we're refactoring these functions.

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Davide Italiano via llvm-branch-commits

https://github.com/dcci edited https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Davide Italiano via llvm-branch-commits

https://github.com/dcci commented:

I have a couple of general comments about this.
Can you also please add a description explaining what this patch does?

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Davide Italiano via llvm-branch-commits


@@ -73,13 +73,26 @@ class YAMLProfileReader : public ProfileReaderBase {
   bool parseFunctionProfile(BinaryFunction ,
 const yaml::bolt::BinaryFunctionProfile );
 
+  /// Returns block cnt equality if IgnoreHash is true, otherwise, hash 
equality
+  bool profileMatches(const yaml::bolt::BinaryFunctionProfile ,

dcci wrote:

I think this comment talks about the implementation more than the definition. 
Can you rephrase it?

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Davide Italiano via llvm-branch-commits


@@ -73,13 +73,26 @@ class YAMLProfileReader : public ProfileReaderBase {
   bool parseFunctionProfile(BinaryFunction ,
 const yaml::bolt::BinaryFunctionProfile );
 
+  /// Returns block cnt equality if IgnoreHash is true, otherwise, hash 
equality
+  bool profileMatches(const yaml::bolt::BinaryFunctionProfile ,
+  BinaryFunction );
+
   /// Infer function profile from stale data (collected on older binaries).
   bool inferStaleProfile(BinaryFunction ,
  const yaml::bolt::BinaryFunctionProfile );
 
   /// Initialize maps for profile matching.
   void buildNameMaps(BinaryContext );
 
+  /// Matches functions using exact name.
+  size_t matchWithExactName();

dcci wrote:

why you need these 3 different functions?

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objcopy] Support CREL (PR #97521)

2024-07-03 Thread Peter Smith via llvm-branch-commits


@@ -1861,7 +1886,15 @@ template  Error 
ELFBuilder::readSections(bool EnsureSymtab) {
 
   const typename ELFFile::Elf_Shdr *Shdr =
   Sections->begin() + RelSec->Index;
-  if (RelSec->Type == SHT_REL) {
+  if (RelSec->Type == SHT_CREL) {
+auto Rels = ElfFile.crels(*Shdr);

smithp35 wrote:

Would `RelsOrRelas` be a better name as it will make the meaning of first and 
second more obvious at the point of use?

https://github.com/llvm/llvm-project/pull/97521
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objcopy] Support CREL (PR #97521)

2024-07-03 Thread Peter Smith via llvm-branch-commits

https://github.com/smithp35 edited 
https://github.com/llvm/llvm-project/pull/97521
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm-objcopy] Support CREL (PR #97521)

2024-07-03 Thread Peter Smith via llvm-branch-commits

https://github.com/smithp35 commented:

Only a couple of small comments from me. I'll be out of the office till Monday 
next week, I'm fine for others to progress this wihout me.

https://github.com/llvm/llvm-project/pull/97521
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


<    1   2   3   4   5   6   7   8   9   10   >