[llvm-branch-commits] [llvm] [BPF] expand cttz, ctlz for i32, i64 (PR #73668)

2024-01-26 Thread Yingchi Long via llvm-branch-commits

inclyc wrote:

@eddyz87 Could you please take a look? This has been stalled for a while :)

https://github.com/llvm/llvm-project/pull/73668
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] PR for llvm/llvm-project#79507 (PR #79509)

2024-01-26 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: e2521eaa1aac43b3216378d529096f08ac98cf14 
3d02473ac538f542fb76c4aff0fb6504398c3f15 
e9d99e51834e2bf0b39c23a60f2dba5539edd17b

https://github.com/llvm/llvm-project/pull/79509
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] PR for llvm/llvm-project#79507 (PR #79509)

2024-01-26 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79509
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] e2521ea - [ELF] Implement R_RISCV_TLSDESC for RISC-V

2024-01-26 Thread Tom Stellard via llvm-branch-commits

Author: Fangrui Song
Date: 2024-01-26T21:34:49-08:00
New Revision: e2521eaa1aac43b3216378d529096f08ac98cf14

URL: 
https://github.com/llvm/llvm-project/commit/e2521eaa1aac43b3216378d529096f08ac98cf14
DIFF: 
https://github.com/llvm/llvm-project/commit/e2521eaa1aac43b3216378d529096f08ac98cf14.diff

LOG: [ELF] Implement R_RISCV_TLSDESC for RISC-V

Support
R_RISCV_TLSDESC_HI20/R_RISCV_TLSDESC_LOAD_LO12/R_RISCV_TLSDESC_ADD_LO12/R_RISCV_TLSDESC_CALL.
LOAD_LO12/ADD_LO12/CALL relocations reference a label at the HI20
location, which requires special handling. We save the value of HI20 to
be reused. Two interleaved TLSDESC code sequences, which compilers do
not generate, are unsupported.

For -no-pie/-pie links, TLSDESC to initial-exec or local-exec
optimizations are eligible. Implement the relevant hooks
(R_RELAX_TLS_GD_TO_LE, R_RELAX_TLS_GD_TO_IE): the first two instructions
are converted to NOP while the latter two are converted to a GOT load or
a lui+addi.

The first two instructions, which would be converted to NOP, are removed
instead in the presence of relaxation. Relaxation is eligible as long as
the R_RISCV_TLSDESC_HI20 relocation has a pairing R_RISCV_RELAX,
regardless of whether the following instructions have a R_RISCV_RELAX.
In addition, for the TLSDESC to LE optimization (`lui a0,; addi 
a0,a0,`),
`lui` can be removed (i.e. use the short form) if hi20 is 0.

```
// TLSDESC to LE/IE optimization
.Ltlsdesc_hi2:
  auipc a4, %tlsdesc_hi(c)  # if relax: remove; otherwise, 
NOP
  load  a5, %tlsdesc_load_lo(.Ltlsdesc_hi2)(a4) # if relax: remove; otherwise, 
NOP
  addi  a0, a4, %tlsdesc_add_lo(.Ltlsdesc_hi2)  # if LE && !hi20 {if relax: 
remove; otherwise, NOP}
  jalr  t0, 0(a5), %tlsdesc_call(.Ltlsdesc_hi2)
  add   a0, a0, tp
```

The implementation carefully ensures that an instruction unrelated to
the current TLSDESC code sequence, if immediately follows a removable
instruction (HI20 or LOAD_LO12 OR (LE-specific) ADD_LO12), is not
converted to NOP.

* `riscv64-tlsdesc.s` is inspired by `i386-tlsdesc-gd.s` 
(https://reviews.llvm.org/D112582).
* `riscv64-tlsdesc-relax.s` tests linker relaxation.
* `riscv-tlsdesc-gd-mixed.s` is inspired by `x86-64-tlsdesc-gd-mixed.s` 
(https://reviews.llvm.org/D116900).

Link: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373

Reviewed By: ilovepi

Pull Request: https://github.com/llvm/llvm-project/pull/79239

(cherry picked from commit 1117fdd7c16873eb389e988c6a39ad922bae0fd0)

Added: 
lld/test/ELF/riscv-tlsdesc-gd-mixed.s
lld/test/ELF/riscv-tlsdesc-relax.s
lld/test/ELF/riscv-tlsdesc.s

Modified: 
lld/ELF/Arch/RISCV.cpp
lld/ELF/Relocations.cpp

Removed: 




diff  --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp
index a92f7bf590c4b4..8ce92b4badfbd7 100644
--- a/lld/ELF/Arch/RISCV.cpp
+++ b/lld/ELF/Arch/RISCV.cpp
@@ -61,6 +61,7 @@ enum Op {
   AUIPC = 0x17,
   JALR = 0x67,
   LD = 0x3003,
+  LUI = 0x37,
   LW = 0x2003,
   SRLI = 0x5013,
   SUB = 0x4033,
@@ -73,6 +74,7 @@ enum Reg {
   X_T0 = 5,
   X_T1 = 6,
   X_T2 = 7,
+  X_A0 = 10,
   X_T3 = 28,
 };
 
@@ -139,6 +141,7 @@ RISCV::RISCV() {
 tlsGotRel = R_RISCV_TLS_TPREL32;
   }
   gotRel = symbolicRel;
+  tlsDescRel = R_RISCV_TLSDESC;
 
   // .got[0] = _DYNAMIC
   gotHeaderEntriesNum = 1;
@@ -207,6 +210,8 @@ int64_t RISCV::getImplicitAddend(const uint8_t *buf, 
RelType type) const {
   case R_RISCV_JUMP_SLOT:
 // These relocations are defined as not having an implicit addend.
 return 0;
+  case R_RISCV_TLSDESC:
+return config->is64 ? read64le(buf + 8) : read32le(buf + 4);
   }
 }
 
@@ -315,6 +320,12 @@ RelExpr RISCV::getRelExpr(const RelType type, const Symbol 
&s,
   case R_RISCV_PCREL_LO12_I:
   case R_RISCV_PCREL_LO12_S:
 return R_RISCV_PC_INDIRECT;
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+  case R_RISCV_TLSDESC_ADD_LO12:
+return R_TLSDESC_PC;
+  case R_RISCV_TLSDESC_CALL:
+return R_TLSDESC_CALL;
   case R_RISCV_TLS_GD_HI20:
 return R_TLSGD_PC;
   case R_RISCV_TLS_GOT_HI20:
@@ -439,6 +450,7 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, 
uint64_t val) const {
 
   case R_RISCV_GOT_HI20:
   case R_RISCV_PCREL_HI20:
+  case R_RISCV_TLSDESC_HI20:
   case R_RISCV_TLS_GD_HI20:
   case R_RISCV_TLS_GOT_HI20:
   case R_RISCV_TPREL_HI20:
@@ -450,6 +462,8 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, 
uint64_t val) const {
   }
 
   case R_RISCV_PCREL_LO12_I:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+  case R_RISCV_TLSDESC_ADD_LO12:
   case R_RISCV_TPREL_LO12_I:
   case R_RISCV_LO12_I: {
 uint64_t hi = (val + 0x800) >> 12;
@@ -533,8 +547,14 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, 
uint64_t val) const {
 break;
 
   case R_RISCV_RELAX:
-return; // Ignored (for now)
-
+return;
+  case R_RISCV_TLSDESC:
+// The addend is stored in the second word.
+if (config->is64)
+ 

[llvm-branch-commits] [lld] 3d02473 - [ELF] Fix terminology: TLS optimizations instead of TLS relaxation. NFC

2024-01-26 Thread Tom Stellard via llvm-branch-commits

Author: Fangrui Song
Date: 2024-01-26T21:34:49-08:00
New Revision: 3d02473ac538f542fb76c4aff0fb6504398c3f15

URL: 
https://github.com/llvm/llvm-project/commit/3d02473ac538f542fb76c4aff0fb6504398c3f15
DIFF: 
https://github.com/llvm/llvm-project/commit/3d02473ac538f542fb76c4aff0fb6504398c3f15.diff

LOG: [ELF] Fix terminology: TLS optimizations instead of TLS relaxation. NFC

(cherry picked from commit 849951f8759171cb6c74d3ccbcf154506fc1f0ae)

Added: 


Modified: 
lld/ELF/Relocations.cpp

Removed: 




diff  --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index b6a317bc3b6d697..3490a701d7189f5 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -1286,17 +1286,16 @@ static unsigned handleTlsRelocation(RelType type, 
Symbol &sym,
   }
 
   // ARM, Hexagon, LoongArch and RISC-V do not support GD/LD to IE/LE
-  // relaxation.
+  // optimizations.
   // For PPC64, if the file has missing R_PPC64_TLSGD/R_PPC64_TLSLD, disable
-  // relaxation as well.
-  bool toExecRelax = !config->shared && config->emachine != EM_ARM &&
- config->emachine != EM_HEXAGON &&
- config->emachine != EM_LOONGARCH &&
- config->emachine != EM_RISCV &&
- !c.file->ppc64DisableTLSRelax;
+  // optimization as well.
+  bool execOptimize =
+  !config->shared && config->emachine != EM_ARM &&
+  config->emachine != EM_HEXAGON && config->emachine != EM_LOONGARCH &&
+  config->emachine != EM_RISCV && !c.file->ppc64DisableTLSRelax;
 
   // If we are producing an executable and the symbol is non-preemptable, it
-  // must be defined and the code sequence can be relaxed to use Local-Exec.
+  // must be defined and the code sequence can be optimized to use Local-Exec.
   //
   // ARM and RISC-V do not support any relaxations for TLS relocations, 
however,
   // we can omit the DTPMOD dynamic relocations and resolve them at link time
@@ -1309,8 +1308,8 @@ static unsigned handleTlsRelocation(RelType type, Symbol 
&sym,
   // module index, with a special value of 0 for the current module. GOT[e1] is
   // unused. There only needs to be one module index entry.
   if (oneof(expr)) {
-// Local-Dynamic relocs can be relaxed to Local-Exec.
-if (toExecRelax) {
+// Local-Dynamic relocs can be optimized to Local-Exec.
+if (execOptimize) {
   c.addReloc({target->adjustTlsExpr(type, R_RELAX_TLS_LD_TO_LE), type,
   offset, addend, &sym});
   return target->getTlsGdRelaxSkip(type);
@@ -1322,16 +1321,17 @@ static unsigned handleTlsRelocation(RelType type, 
Symbol &sym,
 return 1;
   }
 
-  // Local-Dynamic relocs can be relaxed to Local-Exec.
+  // Local-Dynamic relocs can be optimized to Local-Exec.
   if (expr == R_DTPREL) {
-if (toExecRelax)
+if (execOptimize)
   expr = target->adjustTlsExpr(type, R_RELAX_TLS_LD_TO_LE);
 c.addReloc({expr, type, offset, addend, &sym});
 return 1;
   }
 
   // Local-Dynamic sequence where offset of tls variable relative to dynamic
-  // thread pointer is stored in the got. This cannot be relaxed to Local-Exec.
+  // thread pointer is stored in the got. This cannot be optimized to
+  // Local-Exec.
   if (expr == R_TLSLD_GOT_OFF) {
 sym.setFlags(NEEDS_GOT_DTPREL);
 c.addReloc({expr, type, offset, addend, &sym});
@@ -1341,13 +1341,13 @@ static unsigned handleTlsRelocation(RelType type, 
Symbol &sym,
   if (oneof(expr)) {
-if (!toExecRelax) {
+if (!execOptimize) {
   sym.setFlags(NEEDS_TLSGD);
   c.addReloc({expr, type, offset, addend, &sym});
   return 1;
 }
 
-// Global-Dynamic relocs can be relaxed to Initial-Exec or Local-Exec
+// Global-Dynamic/TLSDESC can be optimized to Initial-Exec or Local-Exec
 // depending on the symbol being locally defined or not.
 if (sym.isPreemptible) {
   sym.setFlags(NEEDS_TLSGD_TO_IE);
@@ -1363,9 +1363,9 @@ static unsigned handleTlsRelocation(RelType type, Symbol 
&sym,
   if (oneof(expr)) {
 ctx.hasTlsIe.store(true, std::memory_order_relaxed);
-// Initial-Exec relocs can be relaxed to Local-Exec if the symbol is 
locally
-// defined.
-if (toExecRelax && isLocalInExecutable) {
+// Initial-Exec relocs can be optimized to Local-Exec if the symbol is
+// locally defined.
+if (execOptimize && isLocalInExecutable) {
   c.addReloc({R_RELAX_TLS_IE_TO_LE, type, offset, addend, &sym});
 } else if (expr != R_TLSIE_HINT) {
   sym.setFlags(NEEDS_TLSIE);
@@ -1463,7 +1463,7 @@ template  void 
RelocationScanner::scanOne(RelTy *&i) {
 in.got->hasGotOffRel.store(true, std::memory_order_relaxed);
   }
 
-  // Process TLS relocations, including relaxing TLS relocations. Note that
+  // Process TLS relocations, including TLS optimizations. Note that
   // R_TPREL and R_TPREL_NEG relocations are resolved in processAux.
   if (sym.isTls()) {
 if (unsigne

[llvm-branch-commits] [lld] e9d99e5 - [ELF] Clean up R_RISCV_RELAX code. NFC

2024-01-26 Thread Tom Stellard via llvm-branch-commits

Author: Fangrui Song
Date: 2024-01-26T21:34:48-08:00
New Revision: e9d99e51834e2bf0b39c23a60f2dba5539edd17b

URL: 
https://github.com/llvm/llvm-project/commit/e9d99e51834e2bf0b39c23a60f2dba5539edd17b
DIFF: 
https://github.com/llvm/llvm-project/commit/e9d99e51834e2bf0b39c23a60f2dba5539edd17b.diff

LOG: [ELF] Clean up R_RISCV_RELAX code. NFC

(cherry picked from commit ccb99f221422b8de5e1ae04d3427f15878f7cd93)

Added: 


Modified: 
lld/ELF/Arch/RISCV.cpp

Removed: 




diff  --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp
index d7d3d3e4781497..a92f7bf590c4b4 100644
--- a/lld/ELF/Arch/RISCV.cpp
+++ b/lld/ELF/Arch/RISCV.cpp
@@ -102,6 +102,26 @@ static uint32_t setLO12_S(uint32_t insn, uint32_t imm) {
  (extractBits(imm, 4, 0) << 7);
 }
 
+namespace {
+struct SymbolAnchor {
+  uint64_t offset;
+  Defined *d;
+  bool end; // true for the anchor of st_value+st_size
+};
+} // namespace
+
+struct elf::RISCVRelaxAux {
+  // This records symbol start and end offsets which will be adjusted according
+  // to the nearest relocDeltas element.
+  SmallVector anchors;
+  // For relocations[i], the actual offset is
+  //   r_offset - (i ? relocDeltas[i-1] : 0).
+  std::unique_ptr relocDeltas;
+  // For relocations[i], the actual type is relocTypes[i].
+  std::unique_ptr relocTypes;
+  SmallVector writes;
+};
+
 RISCV::RISCV() {
   copyRel = R_RISCV_COPY;
   pltRel = R_RISCV_JUMP_SLOT;
@@ -520,14 +540,19 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, 
uint64_t val) const {
   }
 }
 
+static bool relaxable(ArrayRef relocs, size_t i) {
+  return i + 1 != relocs.size() && relocs[i + 1].type == R_RISCV_RELAX;
+}
+
 void RISCV::relocateAlloc(InputSectionBase &sec, uint8_t *buf) const {
   uint64_t secAddr = sec.getOutputSection()->addr;
   if (auto *s = dyn_cast(&sec))
 secAddr += s->outSecOff;
   else if (auto *ehIn = dyn_cast(&sec))
 secAddr += ehIn->getParent()->outSecOff;
-  for (size_t i = 0, size = sec.relocs().size(); i != size; ++i) {
-const Relocation &rel = sec.relocs()[i];
+  const ArrayRef relocs = sec.relocs();
+  for (size_t i = 0, size = relocs.size(); i != size; ++i) {
+const Relocation &rel = relocs[i];
 uint8_t *loc = buf + rel.offset;
 const uint64_t val =
 sec.getRelocTargetVA(sec.file, rel.type, rel.addend,
@@ -538,7 +563,7 @@ void RISCV::relocateAlloc(InputSectionBase &sec, uint8_t 
*buf) const {
   break;
 case R_RISCV_LEB128:
   if (i + 1 < size) {
-const Relocation &rel1 = sec.relocs()[i + 1];
+const Relocation &rel1 = relocs[i + 1];
 if (rel.type == R_RISCV_SET_ULEB128 &&
 rel1.type == R_RISCV_SUB_ULEB128 && rel.offset == rel1.offset) {
   auto val = rel.sym->getVA(rel.addend) - rel1.sym->getVA(rel1.addend);
@@ -560,26 +585,6 @@ void RISCV::relocateAlloc(InputSectionBase &sec, uint8_t 
*buf) const {
   }
 }
 
-namespace {
-struct SymbolAnchor {
-  uint64_t offset;
-  Defined *d;
-  bool end; // true for the anchor of st_value+st_size
-};
-} // namespace
-
-struct elf::RISCVRelaxAux {
-  // This records symbol start and end offsets which will be adjusted according
-  // to the nearest relocDeltas element.
-  SmallVector anchors;
-  // For relocations[i], the actual offset is r_offset - (i ? relocDeltas[i-1] 
:
-  // 0).
-  std::unique_ptr relocDeltas;
-  // For relocations[i], the actual type is relocTypes[i].
-  std::unique_ptr relocTypes;
-  SmallVector writes;
-};
-
 static void initSymbolAnchors() {
   SmallVector storage;
   for (OutputSection *osec : outputSections) {
@@ -715,14 +720,15 @@ static void relaxHi20Lo12(const InputSection &sec, size_t 
i, uint64_t loc,
 
 static bool relax(InputSection &sec) {
   const uint64_t secAddr = sec.getVA();
+  const MutableArrayRef relocs = sec.relocs();
   auto &aux = *sec.relaxAux;
   bool changed = false;
   ArrayRef sa = ArrayRef(aux.anchors);
   uint64_t delta = 0;
 
-  std::fill_n(aux.relocTypes.get(), sec.relocs().size(), R_RISCV_NONE);
+  std::fill_n(aux.relocTypes.get(), relocs.size(), R_RISCV_NONE);
   aux.writes.clear();
-  for (auto [i, r] : llvm::enumerate(sec.relocs())) {
+  for (auto [i, r] : llvm::enumerate(relocs)) {
 const uint64_t loc = secAddr + r.offset - delta;
 uint32_t &cur = aux.relocDeltas[i], remove = 0;
 switch (r.type) {
@@ -743,23 +749,20 @@ static bool relax(InputSection &sec) {
 }
 case R_RISCV_CALL:
 case R_RISCV_CALL_PLT:
-  if (i + 1 != sec.relocs().size() &&
-  sec.relocs()[i + 1].type == R_RISCV_RELAX)
+  if (relaxable(relocs, i))
 relaxCall(sec, i, loc, r, remove);
   break;
 case R_RISCV_TPREL_HI20:
 case R_RISCV_TPREL_ADD:
 case R_RISCV_TPREL_LO12_I:
 case R_RISCV_TPREL_LO12_S:
-  if (i + 1 != sec.relocs().size() &&
-  sec.relocs()[i + 1].type == R_RISCV_RELAX)
+  if (relaxable(relocs, i))
 relaxTlsLe(sec, i, loc, r, remove

[llvm-branch-commits] [clang] PR for llvm/llvm-project#79479 (PR #79596)

2024-01-26 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@topperc This backport has some failing tests that will have to be fixed up 
manually and submitted in another PR.

https://github.com/llvm/llvm-project/pull/79596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79451 (PR #79457)

2024-01-26 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/79457
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79451 (PR #79457)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: None (github-actions[bot])


Changes

resolves llvm/llvm-project#79451

---

Patch is 44.33 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/79457.diff


8 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+37-2) 
- (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+49-19) 
- (modified) llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h (+9-23) 
- (modified) llvm/test/CodeGen/AMDGPU/indirect-call-known-callees.ll (-1) 
- (added) llvm/test/CodeGen/AMDGPU/lower-work-group-id-intrinsics-hsa.ll (+295) 
- (added) llvm/test/CodeGen/AMDGPU/lower-work-group-id-intrinsics-pal.ll (+187) 
- (removed) llvm/test/CodeGen/AMDGPU/lower-work-group-id-intrinsics.ll (-128) 
- (modified) llvm/test/CodeGen/AMDGPU/workgroup-id-in-arch-sgprs.ll (+50-79) 


``diff
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 32921bb248caf07..615685822f91eeb 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -4178,10 +4178,45 @@ bool AMDGPULegalizerInfo::loadInputValue(
 Register DstReg, MachineIRBuilder &B,
 AMDGPUFunctionArgInfo::PreloadedValue ArgType) const {
   const SIMachineFunctionInfo *MFI = 
B.getMF().getInfo();
-  const ArgDescriptor *Arg;
+  const ArgDescriptor *Arg = nullptr;
   const TargetRegisterClass *ArgRC;
   LLT ArgTy;
-  std::tie(Arg, ArgRC, ArgTy) = MFI->getPreloadedValue(ArgType);
+
+  CallingConv::ID CC = B.getMF().getFunction().getCallingConv();
+  const ArgDescriptor WorkGroupIDX =
+  ArgDescriptor::createRegister(AMDGPU::TTMP9);
+  // If GridZ is not programmed in an entry function then the hardware will set
+  // it to all zeros, so there is no need to mask the GridY value in the low
+  // order bits.
+  const ArgDescriptor WorkGroupIDY = ArgDescriptor::createRegister(
+  AMDGPU::TTMP7,
+  AMDGPU::isEntryFunctionCC(CC) && !MFI->hasWorkGroupIDZ() ? ~0u : 
0xu);
+  const ArgDescriptor WorkGroupIDZ =
+  ArgDescriptor::createRegister(AMDGPU::TTMP7, 0xu);
+  if (ST.hasArchitectedSGPRs() && AMDGPU::isCompute(CC)) {
+switch (ArgType) {
+case AMDGPUFunctionArgInfo::WORKGROUP_ID_X:
+  Arg = &WorkGroupIDX;
+  ArgRC = &AMDGPU::SReg_32RegClass;
+  ArgTy = LLT::scalar(32);
+  break;
+case AMDGPUFunctionArgInfo::WORKGROUP_ID_Y:
+  Arg = &WorkGroupIDY;
+  ArgRC = &AMDGPU::SReg_32RegClass;
+  ArgTy = LLT::scalar(32);
+  break;
+case AMDGPUFunctionArgInfo::WORKGROUP_ID_Z:
+  Arg = &WorkGroupIDZ;
+  ArgRC = &AMDGPU::SReg_32RegClass;
+  ArgTy = LLT::scalar(32);
+  break;
+default:
+  break;
+}
+  }
+
+  if (!Arg)
+std::tie(Arg, ArgRC, ArgTy) = MFI->getPreloadedValue(ArgType);
 
   if (!Arg) {
 if (ArgType == AMDGPUFunctionArgInfo::KERNARG_SEGMENT_PTR) {
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp 
b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index d35b76c8ad54ebc..d60f511302613e1 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -2072,11 +2072,45 @@ SDValue 
SITargetLowering::getPreloadedValue(SelectionDAG &DAG,
   const SIMachineFunctionInfo &MFI,
   EVT VT,
   AMDGPUFunctionArgInfo::PreloadedValue PVID) const {
-  const ArgDescriptor *Reg;
+  const ArgDescriptor *Reg = nullptr;
   const TargetRegisterClass *RC;
   LLT Ty;
 
-  std::tie(Reg, RC, Ty) = MFI.getPreloadedValue(PVID);
+  CallingConv::ID CC = DAG.getMachineFunction().getFunction().getCallingConv();
+  const ArgDescriptor WorkGroupIDX =
+  ArgDescriptor::createRegister(AMDGPU::TTMP9);
+  // If GridZ is not programmed in an entry function then the hardware will set
+  // it to all zeros, so there is no need to mask the GridY value in the low
+  // order bits.
+  const ArgDescriptor WorkGroupIDY = ArgDescriptor::createRegister(
+  AMDGPU::TTMP7,
+  AMDGPU::isEntryFunctionCC(CC) && !MFI.hasWorkGroupIDZ() ? ~0u : 0xu);
+  const ArgDescriptor WorkGroupIDZ =
+  ArgDescriptor::createRegister(AMDGPU::TTMP7, 0xu);
+  if (Subtarget->hasArchitectedSGPRs() && AMDGPU::isCompute(CC)) {
+switch (PVID) {
+case AMDGPUFunctionArgInfo::WORKGROUP_ID_X:
+  Reg = &WorkGroupIDX;
+  RC = &AMDGPU::SReg_32RegClass;
+  Ty = LLT::scalar(32);
+  break;
+case AMDGPUFunctionArgInfo::WORKGROUP_ID_Y:
+  Reg = &WorkGroupIDY;
+  RC = &AMDGPU::SReg_32RegClass;
+  Ty = LLT::scalar(32);
+  break;
+case AMDGPUFunctionArgInfo::WORKGROUP_ID_Z:
+  Reg = &WorkGroupIDZ;
+  RC = &AMDGPU::SReg_32RegClass;
+  Ty = LLT::scalar(32);
+  break;
+default:
+  break;
+}
+  }
+
+  if (!Reg)
+std::tie(Reg, RC, Ty) = MFI.getPreloadedValue(PVID);
   if (!Reg) {
 if (PVID == AMDGPUFunctionArgInfo::PreloadedValue::KERNARG_SEGMENT_PTR) {
   // It's possible for a ker

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79451 (PR #79457)

2024-01-26 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/79457

>From 585d833f346e34aa5bb9d732467d71e4177a8a50 Mon Sep 17 00:00:00 2001
From: Jay Foad 
Date: Wed, 24 Jan 2024 15:06:20 +
Subject: [PATCH] [AMDGPU] Move architected SGPR implementation into isel
 (#79120)

(cherry picked from commit 70fc9703788e8965813c5b677a85cb84b66671b6)
---
 .../lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp |  39 ++-
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp |  68 ++--
 .../lib/Target/AMDGPU/SIMachineFunctionInfo.h |  32 +-
 .../AMDGPU/indirect-call-known-callees.ll |   1 -
 .../lower-work-group-id-intrinsics-hsa.ll | 295 ++
 .../lower-work-group-id-intrinsics-pal.ll | 187 +++
 .../AMDGPU/lower-work-group-id-intrinsics.ll  | 128 
 .../AMDGPU/workgroup-id-in-arch-sgprs.ll  | 129 +++-
 8 files changed, 627 insertions(+), 252 deletions(-)
 create mode 100644 
llvm/test/CodeGen/AMDGPU/lower-work-group-id-intrinsics-hsa.ll
 create mode 100644 
llvm/test/CodeGen/AMDGPU/lower-work-group-id-intrinsics-pal.ll
 delete mode 100644 llvm/test/CodeGen/AMDGPU/lower-work-group-id-intrinsics.ll

diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 32921bb248caf07..615685822f91eeb 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -4178,10 +4178,45 @@ bool AMDGPULegalizerInfo::loadInputValue(
 Register DstReg, MachineIRBuilder &B,
 AMDGPUFunctionArgInfo::PreloadedValue ArgType) const {
   const SIMachineFunctionInfo *MFI = 
B.getMF().getInfo();
-  const ArgDescriptor *Arg;
+  const ArgDescriptor *Arg = nullptr;
   const TargetRegisterClass *ArgRC;
   LLT ArgTy;
-  std::tie(Arg, ArgRC, ArgTy) = MFI->getPreloadedValue(ArgType);
+
+  CallingConv::ID CC = B.getMF().getFunction().getCallingConv();
+  const ArgDescriptor WorkGroupIDX =
+  ArgDescriptor::createRegister(AMDGPU::TTMP9);
+  // If GridZ is not programmed in an entry function then the hardware will set
+  // it to all zeros, so there is no need to mask the GridY value in the low
+  // order bits.
+  const ArgDescriptor WorkGroupIDY = ArgDescriptor::createRegister(
+  AMDGPU::TTMP7,
+  AMDGPU::isEntryFunctionCC(CC) && !MFI->hasWorkGroupIDZ() ? ~0u : 
0xu);
+  const ArgDescriptor WorkGroupIDZ =
+  ArgDescriptor::createRegister(AMDGPU::TTMP7, 0xu);
+  if (ST.hasArchitectedSGPRs() && AMDGPU::isCompute(CC)) {
+switch (ArgType) {
+case AMDGPUFunctionArgInfo::WORKGROUP_ID_X:
+  Arg = &WorkGroupIDX;
+  ArgRC = &AMDGPU::SReg_32RegClass;
+  ArgTy = LLT::scalar(32);
+  break;
+case AMDGPUFunctionArgInfo::WORKGROUP_ID_Y:
+  Arg = &WorkGroupIDY;
+  ArgRC = &AMDGPU::SReg_32RegClass;
+  ArgTy = LLT::scalar(32);
+  break;
+case AMDGPUFunctionArgInfo::WORKGROUP_ID_Z:
+  Arg = &WorkGroupIDZ;
+  ArgRC = &AMDGPU::SReg_32RegClass;
+  ArgTy = LLT::scalar(32);
+  break;
+default:
+  break;
+}
+  }
+
+  if (!Arg)
+std::tie(Arg, ArgRC, ArgTy) = MFI->getPreloadedValue(ArgType);
 
   if (!Arg) {
 if (ArgType == AMDGPUFunctionArgInfo::KERNARG_SEGMENT_PTR) {
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp 
b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index d35b76c8ad54ebc..d60f511302613e1 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -2072,11 +2072,45 @@ SDValue 
SITargetLowering::getPreloadedValue(SelectionDAG &DAG,
   const SIMachineFunctionInfo &MFI,
   EVT VT,
   AMDGPUFunctionArgInfo::PreloadedValue PVID) const {
-  const ArgDescriptor *Reg;
+  const ArgDescriptor *Reg = nullptr;
   const TargetRegisterClass *RC;
   LLT Ty;
 
-  std::tie(Reg, RC, Ty) = MFI.getPreloadedValue(PVID);
+  CallingConv::ID CC = DAG.getMachineFunction().getFunction().getCallingConv();
+  const ArgDescriptor WorkGroupIDX =
+  ArgDescriptor::createRegister(AMDGPU::TTMP9);
+  // If GridZ is not programmed in an entry function then the hardware will set
+  // it to all zeros, so there is no need to mask the GridY value in the low
+  // order bits.
+  const ArgDescriptor WorkGroupIDY = ArgDescriptor::createRegister(
+  AMDGPU::TTMP7,
+  AMDGPU::isEntryFunctionCC(CC) && !MFI.hasWorkGroupIDZ() ? ~0u : 0xu);
+  const ArgDescriptor WorkGroupIDZ =
+  ArgDescriptor::createRegister(AMDGPU::TTMP7, 0xu);
+  if (Subtarget->hasArchitectedSGPRs() && AMDGPU::isCompute(CC)) {
+switch (PVID) {
+case AMDGPUFunctionArgInfo::WORKGROUP_ID_X:
+  Reg = &WorkGroupIDX;
+  RC = &AMDGPU::SReg_32RegClass;
+  Ty = LLT::scalar(32);
+  break;
+case AMDGPUFunctionArgInfo::WORKGROUP_ID_Y:
+  Reg = &WorkGroupIDY;
+  RC = &AMDGPU::SReg_32RegClass;
+  Ty = LLT::scalar(32);
+  break;
+case AMDGPUFunctionArgInfo::WORKGROUP_ID_Z:
+  Reg = &WorkGroupIDZ;
+  RC = &AMDGPU::SReg_32RegClass;
+  Ty = LLT::sca

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79425 (PR #79560)

2024-01-26 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79560
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79425 (PR #79560)

2024-01-26 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: d9e26c223bdb124e1e83b73d87d7a545925fda90 
7cfa0c1b7c82135ee5de1555c72d9629e5556c69

https://github.com/llvm/llvm-project/pull/79560
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 7cfa0c1 - [TableGen] Add predicates for immediates comparison (#76004)

2024-01-26 Thread Tom Stellard via llvm-branch-commits

Author: Wang Pengcheng
Date: 2024-01-26T21:28:00-08:00
New Revision: 7cfa0c1b7c82135ee5de1555c72d9629e5556c69

URL: 
https://github.com/llvm/llvm-project/commit/7cfa0c1b7c82135ee5de1555c72d9629e5556c69
DIFF: 
https://github.com/llvm/llvm-project/commit/7cfa0c1b7c82135ee5de1555c72d9629e5556c69.diff

LOG: [TableGen] Add predicates for immediates comparison (#76004)

These predicates can be used to represent `<`, `<=`, `>`, `>=`.

And a predicate for `in range` is added.

(cherry picked from commit 664a0faac464708fc061d12e5cd492fcbfea979a)

Added: 


Modified: 
llvm/include/llvm/Target/TargetInstrPredicate.td
llvm/utils/TableGen/PredicateExpander.cpp
llvm/utils/TableGen/PredicateExpander.h

Removed: 




diff  --git a/llvm/include/llvm/Target/TargetInstrPredicate.td 
b/llvm/include/llvm/Target/TargetInstrPredicate.td
index 82c4c7b23a49b6a..b5419cb9f3867f0 100644
--- a/llvm/include/llvm/Target/TargetInstrPredicate.td
+++ b/llvm/include/llvm/Target/TargetInstrPredicate.td
@@ -152,6 +152,34 @@ class CheckImmOperand_s : 
CheckOperandBase {
   string ImmVal = Value;
 }
 
+// Check that the operand at position `Index` is less than `Imm`.
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against `Imm`.
+class CheckImmOperandLT : CheckOperandBase {
+  int ImmVal = Imm;
+}
+
+// Check that the operand at position `Index` is greater than `Imm`.
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against `Imm`.
+class CheckImmOperandGT : CheckOperandBase {
+  int ImmVal = Imm;
+}
+
+// Check that the operand at position `Index` is less than or equal to `Imm`.
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against `Imm`.
+class CheckImmOperandLE : 
CheckNot>;
+
+// Check that the operand at position `Index` is greater than or equal to 
`Imm`.
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against `Imm`.
+class CheckImmOperandGE : 
CheckNot>;
+
 // Expands to a call to `FunctionMapper` if field `FunctionMapper` is set.
 // Otherwise, it expands to a CheckNot>.
 class CheckRegOperandSimple : CheckOperandBase;
@@ -203,6 +231,12 @@ class CheckAll Sequence>
 class CheckAny Sequence>
 : CheckPredicateSequence;
 
+// Check that the operand at position `Index` is in range [Start, End].
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against range [Start, End].
+class CheckImmOperandRange
+  : CheckAll<[CheckImmOperandGE, CheckImmOperandLE]>;
 
 // Used to expand the body of a function predicate. See the definition of
 // TIIPredicate below.

diff  --git a/llvm/utils/TableGen/PredicateExpander.cpp 
b/llvm/utils/TableGen/PredicateExpander.cpp
index d3a73e02cd916f8..0b9b6389fe38171 100644
--- a/llvm/utils/TableGen/PredicateExpander.cpp
+++ b/llvm/utils/TableGen/PredicateExpander.cpp
@@ -59,6 +59,30 @@ void 
PredicateExpander::expandCheckImmOperandSimple(raw_ostream &OS,
 OS << ")";
 }
 
+void PredicateExpander::expandCheckImmOperandLT(raw_ostream &OS, int OpIndex,
+int ImmVal,
+StringRef FunctionMapper) {
+  if (!FunctionMapper.empty())
+OS << FunctionMapper << "(";
+  OS << "MI" << (isByRef() ? "." : "->") << "getOperand(" << OpIndex
+ << ").getImm()";
+  if (!FunctionMapper.empty())
+OS << ")";
+  OS << (shouldNegate() ? " >= " : " < ") << ImmVal;
+}
+
+void PredicateExpander::expandCheckImmOperandGT(raw_ostream &OS, int OpIndex,
+int ImmVal,
+StringRef FunctionMapper) {
+  if (!FunctionMapper.empty())
+OS << FunctionMapper << "(";
+  OS << "MI" << (isByRef() ? "." : "->") << "getOperand(" << OpIndex
+ << ").getImm()";
+  if (!FunctionMapper.empty())
+OS << ")";
+  OS << (shouldNegate() ? " <= " : " > ") << ImmVal;
+}
+
 void PredicateExpander::expandCheckRegOperand(raw_ostream &OS, int OpIndex,
   const Record *Reg,
   StringRef FunctionMapper) {
@@ -352,6 +376,16 @@ void PredicateExpander::expandPredicate(raw_ostream &OS, 
const Record *Rec) {
  Rec->getValueAsString("ImmVal"),
  Rec->getValueAsString("FunctionMapper"));
 
+  if (Rec->isSubClassOf("CheckImmOperandLT"))
+return expandCheckImm

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79629 (PR #79673)

2024-01-26 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay approved this pull request.


https://github.com/llvm/llvm-project/pull/79673
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79629 (PR #79673)

2024-01-26 Thread Fangrui Song via llvm-branch-commits

MaskRay wrote:

LGTM

https://github.com/llvm/llvm-project/pull/79673
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79355 (PR #79361)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (github-actions[bot])


Changes

resolves llvm/llvm-project#79355

---
Full diff: https://github.com/llvm/llvm-project/pull/79361.diff


2 Files Affected:

- (modified) clang/lib/AST/TemplateBase.cpp (+2-1) 
- (modified) clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp (+18) 


``diff
diff --git a/clang/lib/AST/TemplateBase.cpp b/clang/lib/AST/TemplateBase.cpp
index 2bdbeb08ef20465..3310d7dc24c59d2 100644
--- a/clang/lib/AST/TemplateBase.cpp
+++ b/clang/lib/AST/TemplateBase.cpp
@@ -450,7 +450,8 @@ bool TemplateArgument::structurallyEquals(const 
TemplateArgument &Other) const {
getAsIntegral() == Other.getAsIntegral();
 
   case StructuralValue: {
-if (getStructuralValueType() != Other.getStructuralValueType())
+if (getStructuralValueType().getCanonicalType() !=
+Other.getStructuralValueType().getCanonicalType())
   return false;
 
 llvm::FoldingSetNodeID A, B;
diff --git a/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp 
b/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp
index b5b8cadc909ce00..834174cdf6a32dc 100644
--- a/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp
+++ b/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp
@@ -336,3 +336,21 @@ template void bar(B b) {
   (b.operator Tbar(), ...);
 }
 }
+
+namespace ReportedRegression1 {
+  const char kt[] = "dummy";
+
+  template 
+class SomeTempl { };
+
+  template 
+class SomeTempl {
+  public:
+int exit_code() const { return 0; }
+};
+
+  int use() {
+SomeTempl dummy;
+return dummy.exit_code();
+  }
+}

``




https://github.com/llvm/llvm-project/pull/79361
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79355 (PR #79361)

2024-01-26 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/79361

>From ffe93d6f5ca33044d4118dc589571fadf7e2fc81 Mon Sep 17 00:00:00 2001
From: erichkeane 
Date: Wed, 24 Jan 2024 12:07:22 -0800
Subject: [PATCH] Fix comparison of Structural Values

Fixes a regression from #78041 as reported in the review.  The original
patch failed to compare the canonical type, which this adds.  A slightly
modified test of the original report is added.

(cherry picked from commit e3ee3762304aa81e4a240500844bfdd003401b36)
---
 clang/lib/AST/TemplateBase.cpp |  3 ++-
 .../SemaTemplate/temp_arg_nontype_cxx20.cpp| 18 ++
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/clang/lib/AST/TemplateBase.cpp b/clang/lib/AST/TemplateBase.cpp
index 2bdbeb08ef2046..3310d7dc24c59d 100644
--- a/clang/lib/AST/TemplateBase.cpp
+++ b/clang/lib/AST/TemplateBase.cpp
@@ -450,7 +450,8 @@ bool TemplateArgument::structurallyEquals(const 
TemplateArgument &Other) const {
getAsIntegral() == Other.getAsIntegral();
 
   case StructuralValue: {
-if (getStructuralValueType() != Other.getStructuralValueType())
+if (getStructuralValueType().getCanonicalType() !=
+Other.getStructuralValueType().getCanonicalType())
   return false;
 
 llvm::FoldingSetNodeID A, B;
diff --git a/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp 
b/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp
index b5b8cadc909ce0..834174cdf6a32d 100644
--- a/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp
+++ b/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp
@@ -336,3 +336,21 @@ template void bar(B b) {
   (b.operator Tbar(), ...);
 }
 }
+
+namespace ReportedRegression1 {
+  const char kt[] = "dummy";
+
+  template 
+class SomeTempl { };
+
+  template 
+class SomeTempl {
+  public:
+int exit_code() const { return 0; }
+};
+
+  int use() {
+SomeTempl dummy;
+return dummy.exit_code();
+  }
+}

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [llvm] [clang] PR for llvm/llvm-project#79293 (PR #79461)

2024-01-26 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79461
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [clang] [llvm] PR for llvm/llvm-project#79293 (PR #79461)

2024-01-26 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: ed48280f8e9dd0fc2a21c3ee24c8db3fe12702f8

https://github.com/llvm/llvm-project/pull/79461
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79277 (PR #79340)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-mc

@llvm/pr-subscribers-clang

Author: None (github-actions[bot])


Changes

resolves llvm/llvm-project#79277

---

Patch is 116.08 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/79340.diff


34 Files Affected:

- (modified) clang/test/CodeGenOpenCL/amdgpu-features.cl (+2-2) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-fp8.cl (+18-17) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPU.td (+1) 
- (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+29-2) 
- (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+26) 
- (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+10) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+5-2) 
- (modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp (+11) 
- (modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h (+3) 
- (modified) llvm/lib/Target/AMDGPU/VOP1Instructions.td (+88-5) 
- (modified) llvm/lib/Target/AMDGPU/VOP3Instructions.td (+49-4) 
- (modified) llvm/lib/Target/AMDGPU/VOPInstructions.td (+20-9) 
- (modified) llvm/lib/TargetParser/TargetParser.cpp (+1) 
- (added) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cvt.fp8.dpp.ll (+142) 
- (added) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cvt.fp8.dpp.mir (+197) 
- (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cvt.fp8.ll (+405-61) 
- (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop1.s (+45) 
- (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop1_dpp16.s (+12) 
- (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop1_dpp8.s (+12) 
- (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3.s (+36) 
- (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3_dpp16.s (+108) 
- (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3_dpp8.s (+48) 
- (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3_from_vop1.s (+138) 
- (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3_from_vop1_dpp16.s (+12) 
- (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3_from_vop1_dpp8.s (+12) 
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop1.txt (+36) 
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop1_dpp16.txt (+12) 
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop1_dpp8.txt (+12) 
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3.txt (+36) 
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_dpp16.txt (+108) 
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_dpp8.txt (+48) 
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_from_vop1.txt 
(+36) 
- (modified) 
llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_from_vop1_dpp16.txt (+12) 
- (modified) 
llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_from_vop1_dpp8.txt (+12) 


``diff
diff --git a/clang/test/CodeGenOpenCL/amdgpu-features.cl 
b/clang/test/CodeGenOpenCL/amdgpu-features.cl
index 1ba2b129f6895ae..9c8ca0bb96f6125 100644
--- a/clang/test/CodeGenOpenCL/amdgpu-features.cl
+++ b/clang/test/CodeGenOpenCL/amdgpu-features.cl
@@ -100,8 +100,8 @@
 // GFX1103: 
"target-features"="+16-bit-insts,+atomic-fadd-rtn-insts,+ci-insts,+dl-insts,+dot10-insts,+dot5-insts,+dot7-insts,+dot8-insts,+dot9-insts,+dpp,+gfx10-3-insts,+gfx10-insts,+gfx11-insts,+gfx8-insts,+gfx9-insts,+wavefrontsize32"
 // GFX1150: 
"target-features"="+16-bit-insts,+atomic-fadd-rtn-insts,+ci-insts,+dl-insts,+dot10-insts,+dot5-insts,+dot7-insts,+dot8-insts,+dot9-insts,+dpp,+gfx10-3-insts,+gfx10-insts,+gfx11-insts,+gfx8-insts,+gfx9-insts,+wavefrontsize32"
 // GFX1151: 
"target-features"="+16-bit-insts,+atomic-fadd-rtn-insts,+ci-insts,+dl-insts,+dot10-insts,+dot5-insts,+dot7-insts,+dot8-insts,+dot9-insts,+dpp,+gfx10-3-insts,+gfx10-insts,+gfx11-insts,+gfx8-insts,+gfx9-insts,+wavefrontsize32"
-// GFX1200: 
"target-features"="+16-bit-insts,+atomic-buffer-global-pk-add-f16-insts,+atomic-ds-pk-add-16-insts,+atomic-fadd-rtn-insts,+atomic-flat-pk-add-16-insts,+atomic-global-pk-add-bf16-inst,+ci-insts,+dl-insts,+dot10-insts,+dot7-insts,+dot8-insts,+dot9-insts,+dpp,+gfx10-3-insts,+gfx10-insts,+gfx11-insts,+gfx12-insts,+gfx8-insts,+gfx9-insts,+wavefrontsize32"
-// GFX1201: 
"target-features"="+16-bit-insts,+atomic-buffer-global-pk-add-f16-insts,+atomic-ds-pk-add-16-insts,+atomic-fadd-rtn-insts,+atomic-flat-pk-add-16-insts,+atomic-global-pk-add-bf16-inst,+ci-insts,+dl-insts,+dot10-insts,+dot7-insts,+dot8-insts,+dot9-insts,+dpp,+gfx10-3-insts,+gfx10-insts,+gfx11-insts,+gfx12-insts,+gfx8-insts,+gfx9-insts,+wavefrontsize32"
+// GFX1200: 
"target-features"="+16-bit-insts,+atomic-buffer-global-pk-add-f16-insts,+atomic-ds-pk-add-16-insts,+atomic-fadd-rtn-insts,+atomic-flat-pk-add-16-insts,+atomic-global-pk-add-bf16-inst,+ci-insts,+dl-insts,+dot10-insts,+dot7-insts,+dot8-insts,+dot9-insts,+dpp,+fp8-conversion-insts,+gfx10-3-insts,+gfx10-insts,+gfx11-insts,+gfx12-insts,+gfx8-insts,+gfx9-insts,+wavefrontsize32"
+// GFX1201: 
"target-features"="+16-bit-insts,+atomic-buffer-global-pk-add-f16-insts,+atomic-ds-pk-add-16-insts,+atomic-fadd-rtn-insts,+atomic-flat-pk-add-16-insts,+atomic-global-pk-add-bf16-ins

[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79632 (PR #79633)

2024-01-26 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: aa4cb0e313a621bb9c2e5849de82684ed3c73c29

https://github.com/llvm/llvm-project/pull/79633
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79632 (PR #79633)

2024-01-26 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79633
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] aa4cb0e - [Driver, CodeGen] Support -mtls-dialect= (#79256)

2024-01-26 Thread Tom Stellard via llvm-branch-commits

Author: Fangrui Song
Date: 2024-01-26T19:51:03-08:00
New Revision: aa4cb0e313a621bb9c2e5849de82684ed3c73c29

URL: 
https://github.com/llvm/llvm-project/commit/aa4cb0e313a621bb9c2e5849de82684ed3c73c29
DIFF: 
https://github.com/llvm/llvm-project/commit/aa4cb0e313a621bb9c2e5849de82684ed3c73c29.diff

LOG: [Driver,CodeGen] Support -mtls-dialect= (#79256)

GCC supports -mtls-dialect= for several architectures to select TLSDESC.
This patch supports the following values

* x86: "gnu". "gnu2" (TLSDESC) is not supported yet.
* RISC-V: "trad" (general dynamic), "desc" (TLSDESC, see #66915)

AArch64 toolchains seem to support TLSDESC from the beginning, and the
general dynamic model has poor support. Nobody seems to use the option
-mtls-dialect= at all, so we don't bother with it.
There also seems very little interest in AArch32's TLSDESC support.

TLSDESC does not change IR, but affects object file generation. Without
a backend option the option is a no-op for in-process ThinLTO.

There seems no motivation to have fine-grained control mixing trad/desc
for TLS, so we just pass -mllvm, and don't bother with a modules flag
metadata or function attribute.

Co-authored-by: Paul Kirth 
(cherry picked from commit 36b4a9ccd9f7e04010476e6b2a311f2052a4ac20)

Added: 
clang/test/CodeGen/RISCV/tls-dialect.c
clang/test/Driver/tls-dialect.c

Modified: 
clang/include/clang/Basic/CodeGenOptions.def
clang/include/clang/Driver/Options.td
clang/lib/CodeGen/BackendUtil.cpp
clang/lib/Driver/ToolChains/Clang.cpp
clang/lib/Driver/ToolChains/CommonArgs.cpp
clang/lib/Driver/ToolChains/CommonArgs.h
llvm/include/llvm/TargetParser/Triple.h

Removed: 




diff  --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
index 2f2e45d5cf63df..7c0bfe32849614 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -369,6 +369,9 @@ ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, 
llvm::driver::VectorLibr
 /// The default TLS model to use.
 ENUM_CODEGENOPT(DefaultTLSModel, TLSModel, 2, GeneralDynamicTLSModel)
 
+/// Whether to enable TLSDESC. AArch64 enables TLSDESC regardless of this 
value.
+CODEGENOPT(EnableTLSDESC, 1, 0)
+
 /// Bit size of immediate TLS offsets (0 == use the default).
 VALUE_CODEGENOPT(TLSSize, 8, 0)
 

diff  --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 7f4fa33748faca..773bc1dcda01d5 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4419,6 +4419,8 @@ def mtls_size_EQ : Joined<["-"], "mtls-size=">, 
Group,
   HelpText<"Specify bit size of immediate TLS offsets (AArch64 ELF only): "
"12 (for 4KB) | 24 (for 16MB, default) | 32 (for 4GB) | 48 (for 
256TB, needs -mcmodel=large)">,
   MarshallingInfoInt>;
+def mtls_dialect_EQ : Joined<["-"], "mtls-dialect=">, Group,
+  Flags<[TargetSpecific]>, HelpText<"Which thread-local storage dialect to use 
for dynamic accesses of TLS variables">;
 def mimplicit_it_EQ : Joined<["-"], "mimplicit-it=">, Group;
 def mdefault_build_attributes : Joined<["-"], "mdefault-build-attributes">, 
Group;
 def mno_default_build_attributes : Joined<["-"], 
"mno-default-build-attributes">, Group;
@@ -7066,6 +7068,9 @@ def fexperimental_assignment_tracking_EQ : Joined<["-"], 
"fexperimental-assignme
   Values<"disabled,enabled,forced">, 
NormalizedValues<["Disabled","Enabled","Forced"]>,
   MarshallingInfoEnum, "Enabled">;
 
+def enable_tlsdesc : Flag<["-"], "enable-tlsdesc">,
+  MarshallingInfoFlag>;
+
 } // let Visibility = [CC1Option]
 
 
//===--===//

diff  --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index ec203f6f28bc17..7877e20d77f772 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -401,6 +401,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
   Options.UniqueBasicBlockSectionNames =
   CodeGenOpts.UniqueBasicBlockSectionNames;
   Options.TLSSize = CodeGenOpts.TLSSize;
+  Options.EnableTLSDESC = CodeGenOpts.EnableTLSDESC;
   Options.EmulatedTLS = CodeGenOpts.EmulatedTLS;
   Options.DebuggerTuning = CodeGenOpts.getDebuggerTuning();
   Options.EmitStackSizeSection = CodeGenOpts.StackSizeSection;

diff  --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 5dc614e11aab59..8092fc050b0ee6 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5822,6 +5822,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
 Args.AddLastArg(CmdArgs, options::OPT_mtls_size_EQ);
   }
 
+  if (isTLSDESCEnabled(TC, Args))
+CmdArgs.push_back("-enable-tlsdesc");
+
   // Add the target cpu
   std::string CPU = getCPUName(D, Args, Triple, /*FromAs*/

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79629 (PR #79673)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-x86

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#79629

---
Full diff: https://github.com/llvm/llvm-project/pull/79673.diff


2 Files Affected:

- (modified) llvm/lib/Target/X86/X86AsmPrinter.cpp (-1) 
- (added) llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll (+30) 


``diff
diff --git a/llvm/lib/Target/X86/X86AsmPrinter.cpp 
b/llvm/lib/Target/X86/X86AsmPrinter.cpp
index 9f0fd4d0938e97f..87ec8aa23080e00 100644
--- a/llvm/lib/Target/X86/X86AsmPrinter.cpp
+++ b/llvm/lib/Target/X86/X86AsmPrinter.cpp
@@ -877,7 +877,6 @@ void X86AsmPrinter::emitStartOfAsmFile(Module &M) {
   OutStreamer->emitInt32(FeatureFlagsAnd);// data
   emitAlignment(WordSize == 4 ? Align(4) : Align(8)); // padding
 
-  OutStreamer->endSection(Nt);
   OutStreamer->switchSection(Cur);
 }
   }
diff --git a/llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll 
b/llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll
new file mode 100644
index 000..a0e5b4add1b386e
--- /dev/null
+++ b/llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll
@@ -0,0 +1,30 @@
+; RUN: llc -mtriple x86_64-unknown-linux-gnu %s -o %t.o -filetype=obj
+; RUN: llvm-readobj -n %t.o | FileCheck %s
+
+module asm ".pushsection \22.note.gnu.property\22,\22a\22,@note"
+module asm " .p2align 3"
+module asm " .long 1f - 0f"
+module asm " .long 4f - 1f"
+module asm " .long 5"
+module asm "0:   .asciz \22GNU\22"
+module asm "1:   .p2align 3"
+module asm " .long 0xc0008002"
+module asm " .long 3f - 2f"
+module asm "2:   .long ((1U << 0) | 0 | 0 | 0)"
+module asm "3:   .p2align 3"
+module asm "4:"
+module asm " .popsection"
+
+!llvm.module.flags = !{!0, !1}
+
+!0 = !{i32 4, !"cf-protection-return", i32 1}
+!1 = !{i32 4, !"cf-protection-branch", i32 1}
+
+; CHECK:  Type: NT_GNU_PROPERTY_TYPE_0
+; CHECK-NEXT: Property [
+; CHECK-NEXT:   x86 feature: IBT, SHSTK
+; CHECK-NEXT: ]
+; CHECK:  Type: NT_GNU_PROPERTY_TYPE_0
+; CHECK-NEXT: Property [
+; CHECK-NEXT:   x86 ISA needed: x86-64-baseline
+; CHECK-NEXT: ]

``




https://github.com/llvm/llvm-project/pull/79673
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79629 (PR #79673)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:

@MaskRay What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/79673
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79629 (PR #79673)

2024-01-26 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/79673
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79629 (PR #79673)

2024-01-26 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/79673

resolves llvm/llvm-project#79629

>From ee4face03101cd785a668bd1cb78e2332cc5b248 Mon Sep 17 00:00:00 2001
From: Adhemerval Zanella 
Date: Fri, 26 Jan 2024 10:33:47 -0800
Subject: [PATCH] [X86] Do not end 'note.gnu.property' section with
 -fcf-protection (#79360)

The glibc now adds the required minimum ISA level for libc-nonshared.a
(linked on all programs) and this is done with an inline asm along with
.note.gnu.property and .pushsection/.popsection. However, the x86
backend always ends the 'note.gnu.property' section when building with
-fcf-protection, leading to assert failure:

llvm/llvm-project-git/llvm/lib/MC/MCStreamer.cpp:1251: virtual void
llvm::MCStreamer::switchSection(llvm::MCSection*, const llvm::MCExpr*):
Assertion `!Section->hasEnded() && "Section already ended"' failed.

[1]
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86/isa-level.c;h=3f1b269848a52f994275bab6f60dded3ded6b144;hb=HEAD

(cherry picked from commit a58c62fa824fd24d20fa2366e0ec8f241cb321fe)
---
 llvm/lib/Target/X86/X86AsmPrinter.cpp |  1 -
 .../X86/note-cet-property-inlineasm.ll| 30 +++
 2 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll

diff --git a/llvm/lib/Target/X86/X86AsmPrinter.cpp 
b/llvm/lib/Target/X86/X86AsmPrinter.cpp
index 9f0fd4d0938e97f..87ec8aa23080e00 100644
--- a/llvm/lib/Target/X86/X86AsmPrinter.cpp
+++ b/llvm/lib/Target/X86/X86AsmPrinter.cpp
@@ -877,7 +877,6 @@ void X86AsmPrinter::emitStartOfAsmFile(Module &M) {
   OutStreamer->emitInt32(FeatureFlagsAnd);// data
   emitAlignment(WordSize == 4 ? Align(4) : Align(8)); // padding
 
-  OutStreamer->endSection(Nt);
   OutStreamer->switchSection(Cur);
 }
   }
diff --git a/llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll 
b/llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll
new file mode 100644
index 000..a0e5b4add1b386e
--- /dev/null
+++ b/llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll
@@ -0,0 +1,30 @@
+; RUN: llc -mtriple x86_64-unknown-linux-gnu %s -o %t.o -filetype=obj
+; RUN: llvm-readobj -n %t.o | FileCheck %s
+
+module asm ".pushsection \22.note.gnu.property\22,\22a\22,@note"
+module asm " .p2align 3"
+module asm " .long 1f - 0f"
+module asm " .long 4f - 1f"
+module asm " .long 5"
+module asm "0:   .asciz \22GNU\22"
+module asm "1:   .p2align 3"
+module asm " .long 0xc0008002"
+module asm " .long 3f - 2f"
+module asm "2:   .long ((1U << 0) | 0 | 0 | 0)"
+module asm "3:   .p2align 3"
+module asm "4:"
+module asm " .popsection"
+
+!llvm.module.flags = !{!0, !1}
+
+!0 = !{i32 4, !"cf-protection-return", i32 1}
+!1 = !{i32 4, !"cf-protection-branch", i32 1}
+
+; CHECK:  Type: NT_GNU_PROPERTY_TYPE_0
+; CHECK-NEXT: Property [
+; CHECK-NEXT:   x86 feature: IBT, SHSTK
+; CHECK-NEXT: ]
+; CHECK:  Type: NT_GNU_PROPERTY_TYPE_0
+; CHECK-NEXT: Property [
+; CHECK-NEXT:   x86 ISA needed: x86-64-baseline
+; CHECK-NEXT: ]

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79511 (PR #79513)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (github-actions[bot])


Changes

resolves llvm/llvm-project#79511

---
Full diff: https://github.com/llvm/llvm-project/pull/79513.diff


2 Files Affected:

- (modified) clang/lib/Driver/Driver.cpp (+3-3) 
- (modified) clang/test/Driver/fat-lto-objects.c (+9-3) 


``diff
diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp
index 7109faa1072de5f..93cddf742d521d2 100644
--- a/clang/lib/Driver/Driver.cpp
+++ b/clang/lib/Driver/Driver.cpp
@@ -4764,9 +4764,9 @@ Action *Driver::ConstructPhaseAction(
   case phases::Backend: {
 if (isUsingLTO() && TargetDeviceOffloadKind == Action::OFK_None) {
   types::ID Output;
-  if (Args.hasArg(options::OPT_ffat_lto_objects))
-Output = Args.hasArg(options::OPT_emit_llvm) ? types::TY_LTO_IR
- : types::TY_PP_Asm;
+  if (Args.hasArg(options::OPT_ffat_lto_objects) &&
+  !Args.hasArg(options::OPT_emit_llvm))
+Output = types::TY_PP_Asm;
   else if (Args.hasArg(options::OPT_S))
 Output = types::TY_LTO_IR;
   else
diff --git a/clang/test/Driver/fat-lto-objects.c 
b/clang/test/Driver/fat-lto-objects.c
index 97002db6edc51e5..d9a5ba88ea6d6f5 100644
--- a/clang/test/Driver/fat-lto-objects.c
+++ b/clang/test/Driver/fat-lto-objects.c
@@ -23,11 +23,17 @@
 // CHECK-CC-S-EL-LTO-SAME: -emit-llvm
 // CHECK-CC-S-EL-LTO-SAME: -ffat-lto-objects
 
-/// When fat LTO is enabled wihtout -S we expect native object output and 
-ffat-lto-object to be passed to cc1.
+/// When fat LTO is enabled without -S we expect native object output and 
-ffat-lto-object to be passed to cc1.
 // RUN: %clang --target=x86_64-unknown-linux-gnu -flto -ffat-lto-objects -### 
%s -c 2>&1 | FileCheck %s -check-prefix=CHECK-CC-C-LTO
 // CHECK-CC-C-LTO: -cc1
-// CHECK-CC-C-LTO: -emit-obj
-// CHECK-CC-C-LTO: -ffat-lto-objects
+// CHECK-CC-C-LTO-SAME: -emit-obj
+// CHECK-CC-C-LTO-SAME: -ffat-lto-objects
+
+/// When fat LTO is enabled with -c and -emit-llvm we expect bitcode output 
and -ffat-lto-object to be passed to cc1.
+// RUN: %clang --target=x86_64-unknown-linux-gnu -flto -ffat-lto-objects -### 
%s -c -emit-llvm 2>&1 | FileCheck %s -check-prefix=CHECK-CC-C-EL-LTO
+// CHECK-CC-C-EL-LTO: -cc1
+// CHECK-CC-C-EL-LTO-SAME: -emit-llvm-bc
+// CHECK-CC-C-EL-LTO-SAME: -ffat-lto-objects
 
 /// Make sure we don't have a warning for -ffat-lto-objects being unused
 // RUN: %clang --target=x86_64-unknown-linux-gnu -ffat-lto-objects 
-fdriver-only -Werror -v %s -c 2>&1 | FileCheck %s -check-prefix=CHECK-CC-NOLTO

``




https://github.com/llvm/llvm-project/pull/79513
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79511 (PR #79513)

2024-01-26 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/79513

>From 3c527f29181e349d054bdf39807a97df2e85ef02 Mon Sep 17 00:00:00 2001
From: Sean Fertile <35576261+mandle...@users.noreply.github.com>
Date: Thu, 25 Jan 2024 10:50:59 -0500
Subject: [PATCH] [LTO] Fix fat-lto output for -c -emit-llvm. (#79404)

Fix and add a test case for combining '-ffat-lto-objects -c -emit-llvm'
options and fix a spelling mistake in same test.

(cherry picked from commit f1b1611148fa533fe198fec3fa4ef8139224dc80)
---
 clang/lib/Driver/Driver.cpp |  6 +++---
 clang/test/Driver/fat-lto-objects.c | 12 +---
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp
index 7109faa1072de5f..93cddf742d521d2 100644
--- a/clang/lib/Driver/Driver.cpp
+++ b/clang/lib/Driver/Driver.cpp
@@ -4764,9 +4764,9 @@ Action *Driver::ConstructPhaseAction(
   case phases::Backend: {
 if (isUsingLTO() && TargetDeviceOffloadKind == Action::OFK_None) {
   types::ID Output;
-  if (Args.hasArg(options::OPT_ffat_lto_objects))
-Output = Args.hasArg(options::OPT_emit_llvm) ? types::TY_LTO_IR
- : types::TY_PP_Asm;
+  if (Args.hasArg(options::OPT_ffat_lto_objects) &&
+  !Args.hasArg(options::OPT_emit_llvm))
+Output = types::TY_PP_Asm;
   else if (Args.hasArg(options::OPT_S))
 Output = types::TY_LTO_IR;
   else
diff --git a/clang/test/Driver/fat-lto-objects.c 
b/clang/test/Driver/fat-lto-objects.c
index 97002db6edc51e5..d9a5ba88ea6d6f5 100644
--- a/clang/test/Driver/fat-lto-objects.c
+++ b/clang/test/Driver/fat-lto-objects.c
@@ -23,11 +23,17 @@
 // CHECK-CC-S-EL-LTO-SAME: -emit-llvm
 // CHECK-CC-S-EL-LTO-SAME: -ffat-lto-objects
 
-/// When fat LTO is enabled wihtout -S we expect native object output and 
-ffat-lto-object to be passed to cc1.
+/// When fat LTO is enabled without -S we expect native object output and 
-ffat-lto-object to be passed to cc1.
 // RUN: %clang --target=x86_64-unknown-linux-gnu -flto -ffat-lto-objects -### 
%s -c 2>&1 | FileCheck %s -check-prefix=CHECK-CC-C-LTO
 // CHECK-CC-C-LTO: -cc1
-// CHECK-CC-C-LTO: -emit-obj
-// CHECK-CC-C-LTO: -ffat-lto-objects
+// CHECK-CC-C-LTO-SAME: -emit-obj
+// CHECK-CC-C-LTO-SAME: -ffat-lto-objects
+
+/// When fat LTO is enabled with -c and -emit-llvm we expect bitcode output 
and -ffat-lto-object to be passed to cc1.
+// RUN: %clang --target=x86_64-unknown-linux-gnu -flto -ffat-lto-objects -### 
%s -c -emit-llvm 2>&1 | FileCheck %s -check-prefix=CHECK-CC-C-EL-LTO
+// CHECK-CC-C-EL-LTO: -cc1
+// CHECK-CC-C-EL-LTO-SAME: -emit-llvm-bc
+// CHECK-CC-C-EL-LTO-SAME: -ffat-lto-objects
 
 /// Make sure we don't have a warning for -ffat-lto-objects being unused
 // RUN: %clang --target=x86_64-unknown-linux-gnu -ffat-lto-objects 
-fdriver-only -Werror -v %s -c 2>&1 | FileCheck %s -check-prefix=CHECK-CC-NOLTO

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Change check for embedded llvm version number to a regex to make test more flexible. (#79528) (PR #79642)

2024-01-26 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79642
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 147c623 - Change check for embedded llvm version number to a regex to make test more flexible. (#79528) (#79642)

2024-01-26 Thread via llvm-branch-commits

Author: dyung
Date: 2024-01-26T19:27:28-08:00
New Revision: 147c623a86b39d6bc9993293487b5773de943dad

URL: 
https://github.com/llvm/llvm-project/commit/147c623a86b39d6bc9993293487b5773de943dad
DIFF: 
https://github.com/llvm/llvm-project/commit/147c623a86b39d6bc9993293487b5773de943dad.diff

LOG: Change check for embedded llvm version number to a regex to make test more 
flexible. (#79528) (#79642)

This test started to fail when LLVM created the release/18.x branch and
the main branch subsequently had the version number increased from 18 to
19.

I investigated this failure (it was blocking our internal automation)
and discovered that the CHECK statement on line 27 seemed to have the
compiler version number (1800) encoded in octal that it was checking
for. I don't know if this is something that explicitly needs to be
checked, so I am leaving it in, but it should be more flexible so the
test doesn't fail anytime the version number is changed. To accomplish
that, I changed the check for the 4-digit version number to be a regex.

I originally updated this test for the 18->19 transition in
a01195ff5cc3d7fd084743b1f47007645bb385f4. This change makes the CHECK
line more flexible so it doesn't need to be continually updated.

(cherry picked from commit 45f883ed06f39fba7557dfbbff4d10595b45f874)

Added: 


Modified: 
llvm/test/CodeGen/SystemZ/zos-ppa2.ll

Removed: 




diff  --git a/llvm/test/CodeGen/SystemZ/zos-ppa2.ll 
b/llvm/test/CodeGen/SystemZ/zos-ppa2.ll
index f54f654b804a239..60580aeb6d83cc7 100644
--- a/llvm/test/CodeGen/SystemZ/zos-ppa2.ll
+++ b/llvm/test/CodeGen/SystemZ/zos-ppa2.ll
@@ -24,7 +24,7 @@
 ; CHECK:.byte   0
 ; CHECK:.byte   3
 ; CHECK:.short  30
-; CHECK:.ascii  
"\323\323\345\324@@\361\370\360\360\361\371\367\360\360\361\360\361\360\360\360\360\360\360\360\360"
+; CHECK:.ascii  
"\323\323\345\324@@{{((\\3[0-7]{2}){4})}}\361\371\367\360\360\361\360\361\360\360\360\360\360\360\360\360"
 define void @void_test() {
 entry:
   ret void



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [llvm] [ThinLTO][TypeProf] Implement vtable def import (PR #79381)

2024-01-26 Thread Mingming Liu via llvm-branch-commits


@@ -1285,46 +1285,44 @@ void annotateValueSite(Module &M, Instruction &Inst,
   Inst.setMetadata(LLVMContext::MD_prof, MDNode::get(Ctx, Vals));
 }
 
-bool getValueProfDataFromInst(const Instruction &Inst,
-  InstrProfValueKind ValueKind,
-  uint32_t MaxNumValueData,
-  InstrProfValueData ValueData[],
-  uint32_t &ActualNumValueData, uint64_t &TotalC,
-  bool GetNoICPValue) {
+MDNode *mayHaveValueProfileOfKind(const Instruction &Inst,
+  InstrProfValueKind ValueKind) {
   MDNode *MD = Inst.getMetadata(LLVMContext::MD_prof);
   if (!MD)
-return false;
+return nullptr;
 
-  unsigned NOps = MD->getNumOperands();
+  if (MD->getNumOperands() < 5)
+return nullptr;
 
-  if (NOps < 5)
-return false;
-
-  // Operand 0 is a string tag "VP":
   MDString *Tag = cast(MD->getOperand(0));
-  if (!Tag)
-return false;
-
-  if (!Tag->getString().equals("VP"))
-return false;
+  if (!Tag || !Tag->getString().equals("VP"))
+return nullptr;
 
   // Now check kind:
   ConstantInt *KindInt = mdconst::dyn_extract(MD->getOperand(1));
   if (!KindInt)
-return false;
+return nullptr;
   if (KindInt->getZExtValue() != ValueKind)
-return false;
+return nullptr;
+
+  return MD;
+}
 
+static bool getValueProfDataFromInst(const MDNode *const MD,

minglotus-6 wrote:

sounds good! Done.

https://github.com/llvm/llvm-project/pull/79381
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] [llvm] [ThinLTO][TypeProf] Implement vtable def import (PR #79381)

2024-01-26 Thread Mingming Liu via llvm-branch-commits

https://github.com/minglotus-6 updated 
https://github.com/llvm/llvm-project/pull/79381

>From d4caa0997799b712edb11d90c5be79d0aab3c312 Mon Sep 17 00:00:00 2001
From: mingmingl 
Date: Thu, 25 Jan 2024 13:59:03 -0800
Subject: [PATCH 1/4] Introduce an opton to control the number of vtables to
 annotate and use it when generating function summaries.

Created using spr 1.3.4
---
 .../IndirectCallPromotionAnalysis.cpp |  3 +++
 llvm/lib/Analysis/ModuleSummaryAnalysis.cpp   | 12 -
 .../thinlto-func-summary-vtableref-pgo.ll | 25 ---
 3 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/llvm/lib/Analysis/IndirectCallPromotionAnalysis.cpp 
b/llvm/lib/Analysis/IndirectCallPromotionAnalysis.cpp
index ebfa1c8fc08e1c..18cb6a220e3bd0 100644
--- a/llvm/lib/Analysis/IndirectCallPromotionAnalysis.cpp
+++ b/llvm/lib/Analysis/IndirectCallPromotionAnalysis.cpp
@@ -45,6 +45,9 @@ static cl::opt
  cl::desc("Max number of promotions for a single indirect "
   "call callsite"));
 
+cl::opt MaxNumVTableAnnotations("icp-max-num-vtables", cl::init(6), 
cl::Hidden,
+   cl::desc("Max number of vtables annotated for a vtable load instruction."));
+
 ICallPromotionAnalysis::ICallPromotionAnalysis() {
   ValueDataArray = std::make_unique(MaxNumPromotions);
 }
diff --git a/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp 
b/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
index fc8c31de0f4501..0f0085025cc56b 100644
--- a/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
+++ b/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
@@ -82,6 +82,8 @@ static cl::opt ModuleSummaryDotFile(
 
 extern cl::opt ScalePartialSampleProfileWorkingSetSize;
 
+extern cl::opt MaxNumVTableAnnotations;
+
 // Walk through the operands of a given User via worklist iteration and 
populate
 // the set of GlobalValue references encountered. Invoked either on an
 // Instruction or a GlobalVariable (which walks its initializer).
@@ -129,14 +131,10 @@ static bool findRefEdges(ModuleSummaryIndex &Index, const 
User *CurUser,
   if (I) {
 uint32_t ActualNumValueData = 0;
 uint64_t TotalCount = 0;
-// 24 is the maximum number of values preserved for one instrumented site,
-// defined by INSTR_PROF_DEFAULT_NUM_VAL_PER_SITE in
-// compiler-rt/lib/profile/InstrProfilingValue.c; passing 24 as
-// `MaxNumValueData` controls the max number of elements in the returned
-// array. The actual number of values is gated by the number of ops in 
!prof
-// metadata.
+// MaxNumVTableAnnotations is the maximum number of vtables annotated on
+// the instruction.
 auto ValueDataArray = getValueProfDataFromInst(
-*I, IPVK_VTableTarget, 24 /* MaxNumValueData */, ActualNumValueData,
+*I, IPVK_VTableTarget, MaxNumVTableAnnotations /* MaxNumValueData */, 
ActualNumValueData,
 TotalCount);
 
 if (ValueDataArray.get()) {
diff --git a/llvm/test/Bitcode/thinlto-func-summary-vtableref-pgo.ll 
b/llvm/test/Bitcode/thinlto-func-summary-vtableref-pgo.ll
index 28e4b1d19aef72..ba3ce9a75ee832 100644
--- a/llvm/test/Bitcode/thinlto-func-summary-vtableref-pgo.ll
+++ b/llvm/test/Bitcode/thinlto-func-summary-vtableref-pgo.ll
@@ -1,4 +1,8 @@
-; RUN: opt -module-summary %s -o %t.o
+; Promote at most one function and annotate at most one vtable.
+; As a result, only one value (of each relevant kind) shows up in the function
+; summary.
+
+; RUN: opt -module-summary -icp-max-num-vtables=1 -icp-max-prom=1 %s -o %t.o
 
 ; RUN: llvm-bcanalyzer -dump %t.o | FileCheck %s
 
@@ -11,15 +15,17 @@
 ; CHECK-NEXT:   
 ; The `VALUE_GUID` below represents the "_ZTV4Base" referenced by the 
instruction
 ; that loads vtable pointers.
-; CHECK-NEXT:   
+; CHECK-NEXT: 
 ; The `VALUE_GUID` below represents the "_ZN4Base4funcEv" referenced by the
 ; indirect call instruction.
-; CHECK-NEXT:   
+; CHECK-NEXT:  
+; NOTE vtables and functions from Derived class is dropped because
+; `-icp-max-num-vtables` and `-icp-max-prom` are both set to one.
 ;  has the format [valueid, flags, instcount, funcflags,
 ; numrefs, rorefcnt, worefcnt,
 ; m x valueid,
 ; n x (valueid, hotness+tailcall)]
-; CHECK-NEXT:   
+; CHECK-NEXT:   
 ; CHECK-NEXT:  
 
 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
@@ -36,7 +42,6 @@ define i32 @_Z4testP4Base(ptr %0) !prof !15 {
 
 !llvm.module.flags = !{!1}
 
-
 !1 = !{i32 1, !"ProfileSummary", !2}
 !2 = !{!3, !4, !5, !6, !7, !8, !9, !10}
 !3 = !{!"ProfileFormat", !"InstrProf"}
@@ -53,10 +58,12 @@ define i32 @_Z4testP4Base(ptr %0) !prof !15 {
 !14 = !{i32 99, i64 1, i32 2}
 
 !15 = !{!"function_entry_count", i32 150}
-; 1960855528937986108 is the MD5 hash of _ZTV4Base
-!16 = !{!"VP", i32 2, i64 1600, i64 1960855528937986108, i64 1600}
-; 5459407273543877811 is the MD5 hash of _ZN4Base4funcEv
-!17 = !{!"VP", i32 0, i64 1600, i64 5459407273543877811

[llvm-branch-commits] [llvm] Refactor recomputeLiveIns to operate on whole CFG (#79498) (PR #79641)

2024-01-26 Thread via llvm-branch-commits

https://github.com/yozhu approved this pull request.


https://github.com/llvm/llvm-project/pull/79641
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Change check for embedded llvm version number to a regex to make test more flexible. (#79528) (PR #79642)

2024-01-26 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar milestoned 
https://github.com/llvm/llvm-project/pull/79642
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Change check for embedded llvm version number to a regex to make test more flexible. (#79528) (PR #79642)

2024-01-26 Thread via llvm-branch-commits

https://github.com/dyung created https://github.com/llvm/llvm-project/pull/79642

This test started to fail when LLVM created the release/18.x branch and the 
main branch subsequently had the version number increased from 18 to 19.

I investigated this failure (it was blocking our internal automation) and 
discovered that the CHECK statement on line 27 seemed to have the compiler 
version number (1800) encoded in octal that it was checking for. I don't know 
if this is something that explicitly needs to be checked, so I am leaving it 
in, but it should be more flexible so the test doesn't fail anytime the version 
number is changed. To accomplish that, I changed the check for the 4-digit 
version number to be a regex.

I originally updated this test for the 18->19 transition in 
a01195ff5cc3d7fd084743b1f47007645bb385f4. This change makes the CHECK line more 
flexible so it doesn't need to be continually updated.

(cherry picked from commit 45f883ed06f39fba7557dfbbff4d10595b45f874)

>From d787fef6a16dfc7e3b06bec5fc4c9e2d22180d5a Mon Sep 17 00:00:00 2001
From: dyung 
Date: Fri, 26 Jan 2024 09:36:20 -0800
Subject: [PATCH] Change check for embedded llvm version number to a regex to
 make test more flexible. (#79528)

This test started to fail when LLVM created the release/18.x branch and
the main branch subsequently had the version number increased from 18 to
19.

I investigated this failure (it was blocking our internal automation)
and discovered that the CHECK statement on line 27 seemed to have the
compiler version number (1800) encoded in octal that it was checking
for. I don't know if this is something that explicitly needs to be
checked, so I am leaving it in, but it should be more flexible so the
test doesn't fail anytime the version number is changed. To accomplish
that, I changed the check for the 4-digit version number to be a regex.

I originally updated this test for the 18->19 transition in
a01195ff5cc3d7fd084743b1f47007645bb385f4. This change makes the CHECK
line more flexible so it doesn't need to be continually updated.

(cherry picked from commit 45f883ed06f39fba7557dfbbff4d10595b45f874)
---
 llvm/test/CodeGen/SystemZ/zos-ppa2.ll | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/test/CodeGen/SystemZ/zos-ppa2.ll 
b/llvm/test/CodeGen/SystemZ/zos-ppa2.ll
index f54f654b804a239..60580aeb6d83cc7 100644
--- a/llvm/test/CodeGen/SystemZ/zos-ppa2.ll
+++ b/llvm/test/CodeGen/SystemZ/zos-ppa2.ll
@@ -24,7 +24,7 @@
 ; CHECK:.byte   0
 ; CHECK:.byte   3
 ; CHECK:.short  30
-; CHECK:.ascii  
"\323\323\345\324@@\361\370\360\360\361\371\367\360\360\361\360\361\360\360\360\360\360\360\360\360"
+; CHECK:.ascii  
"\323\323\345\324@@{{((\\3[0-7]{2}){4})}}\361\371\367\360\360\361\360\361\360\360\360\360\360\360\360\360"
 define void @void_test() {
 entry:
   ret void

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Refactor recomputeLiveIns to operate on whole CFG (#79498) (PR #79641)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-arm

Author: Oskar Wirga (oskarwirga)


Changes

I would like to backport these changes to the 18.x branch because that is where 
stack-clash-protection was first introduced and this patch fixes a subtle 
register allocation bug that occurs due to incorrect ordering of 
recomputeLiveIns. 

Currently, the way that recomputeLiveIns works is that it will recompute the 
livein registers for that MachineBasicBlock but it matters what order you call 
recomputeLiveIn which can result in incorrect register allocations down the 
line.

This PR fixes that by simply recomputing the liveins for the entire CFG until 
convergence is achieved. This makes it harder to introduce subtle bugs which 
alter liveness.

(cherry picked from commit 59bf60519fc30d9d36c86abd83093b068f6b1e4b)

---

Patch is 297.53 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/79641.diff


30 Files Affected:

- (modified) llvm/include/llvm/CodeGen/LivePhysRegs.h (+23-2) 
- (modified) llvm/include/llvm/CodeGen/MachineBasicBlock.h (+6) 
- (modified) llvm/lib/CodeGen/BranchFolding.cpp (+1-2) 
- (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.cpp (+1-2) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+1-3) 
- (modified) llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp (+1-6) 
- (modified) llvm/lib/Target/PowerPC/PPCExpandAtomicPseudoInsts.cpp (+2-5) 
- (modified) llvm/lib/Target/PowerPC/PPCFrameLowering.cpp (+2-4) 
- (modified) llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp (+2-4) 
- (modified) llvm/lib/Target/X86/X86FrameLowering.cpp (+2-6) 
- (modified) llvm/test/CodeGen/AArch64/stack-probing-last-in-block.mir (+1-1) 
- (modified) llvm/test/CodeGen/SystemZ/branch-folder-hoist-livein.mir (+25-11) 
- (modified) 
llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-default.mir (+102-90) 
- (modified) 
llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-optsize-strd-lr.mir 
(+89-78) 
- (modified) 
llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-optsize.mir (+96-87) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/it-block-mov.mir 
(+87-74) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-copy-chain.mir 
(+138-120) 
- (modified) 
llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-copy-prev-iteration.mir 
(+155-130) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-liveout.mir 
(+154-129) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/matrix-debug.mir 
(+80-70) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/matrix.mir (+173-144) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/mov-after-dlstp.mir 
(+58-56) 
- (modified) 
llvm/test/CodeGen/Thumb2/LowOverheadLoops/multi-block-cond-iter-count.mir 
(+132-112) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/multiple-do-loops.mir 
(+233-193) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/predicated-liveout.mir 
(+38-29) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/remove-elem-moves.mir 
(+97-79) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/skip-debug.mir (+65-56) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/skip-vpt-debug.mir 
(+71-68) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/spillingmove.mir (+7-7) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/unrolled-and-vector.mir 
(+154-130) 


``diff
diff --git a/llvm/include/llvm/CodeGen/LivePhysRegs.h 
b/llvm/include/llvm/CodeGen/LivePhysRegs.h
index 76bb34d270a26dc..1e0ee9eb9eb3203 100644
--- a/llvm/include/llvm/CodeGen/LivePhysRegs.h
+++ b/llvm/include/llvm/CodeGen/LivePhysRegs.h
@@ -31,6 +31,7 @@
 
 #include "llvm/ADT/SparseSet.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
+#include "llvm/CodeGen/MachineFunction.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/MC/MCRegister.h"
 #include "llvm/MC/MCRegisterInfo.h"
@@ -193,11 +194,31 @@ void addLiveIns(MachineBasicBlock &MBB, const 
LivePhysRegs &LiveRegs);
 void computeAndAddLiveIns(LivePhysRegs &LiveRegs,
   MachineBasicBlock &MBB);
 
-/// Convenience function for recomputing live-in's for \p MBB.
-static inline void recomputeLiveIns(MachineBasicBlock &MBB) {
+/// Function to update the live-in's for a basic block and return whether any
+/// changes were made.
+static inline bool updateBlockLiveInfo(MachineBasicBlock &MBB) {
   LivePhysRegs LPR;
+  auto oldLiveIns = MBB.getLiveIns();
+
   MBB.clearLiveIns();
   computeAndAddLiveIns(LPR, MBB);
+  MBB.sortUniqueLiveIns();
+
+  auto newLiveIns = MBB.getLiveIns();
+  return oldLiveIns != newLiveIns;
+}
+
+/// Convenience function for recomputing live-in's for the entire CFG until
+/// convergence is reached.
+static inline void recomputeLiveIns(MachineFunction &MF) {
+  bool anyChanged;
+  do {
+anyChanged = false;
+for (auto MFI = MF.rbegin(), MFE = MF.rend(); MFI != MFE; ++MFI) {
+  MachineBasicBlock &MBB = *MFI;
+  anyChanged |= updateBlockLive

[llvm-branch-commits] [llvm] Refactor recomputeLiveIns to operate on whole CFG (#79498) (PR #79641)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-aarch64

Author: Oskar Wirga (oskarwirga)


Changes

I would like to backport these changes to the 18.x branch because that is where 
stack-clash-protection was first introduced and this patch fixes a subtle 
register allocation bug that occurs due to incorrect ordering of 
recomputeLiveIns. 

Currently, the way that recomputeLiveIns works is that it will recompute the 
livein registers for that MachineBasicBlock but it matters what order you call 
recomputeLiveIn which can result in incorrect register allocations down the 
line.

This PR fixes that by simply recomputing the liveins for the entire CFG until 
convergence is achieved. This makes it harder to introduce subtle bugs which 
alter liveness.

(cherry picked from commit 59bf60519fc30d9d36c86abd83093b068f6b1e4b)

---

Patch is 297.53 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/79641.diff


30 Files Affected:

- (modified) llvm/include/llvm/CodeGen/LivePhysRegs.h (+23-2) 
- (modified) llvm/include/llvm/CodeGen/MachineBasicBlock.h (+6) 
- (modified) llvm/lib/CodeGen/BranchFolding.cpp (+1-2) 
- (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.cpp (+1-2) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+1-3) 
- (modified) llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp (+1-6) 
- (modified) llvm/lib/Target/PowerPC/PPCExpandAtomicPseudoInsts.cpp (+2-5) 
- (modified) llvm/lib/Target/PowerPC/PPCFrameLowering.cpp (+2-4) 
- (modified) llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp (+2-4) 
- (modified) llvm/lib/Target/X86/X86FrameLowering.cpp (+2-6) 
- (modified) llvm/test/CodeGen/AArch64/stack-probing-last-in-block.mir (+1-1) 
- (modified) llvm/test/CodeGen/SystemZ/branch-folder-hoist-livein.mir (+25-11) 
- (modified) 
llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-default.mir (+102-90) 
- (modified) 
llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-optsize-strd-lr.mir 
(+89-78) 
- (modified) 
llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-optsize.mir (+96-87) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/it-block-mov.mir 
(+87-74) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-copy-chain.mir 
(+138-120) 
- (modified) 
llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-copy-prev-iteration.mir 
(+155-130) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-liveout.mir 
(+154-129) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/matrix-debug.mir 
(+80-70) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/matrix.mir (+173-144) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/mov-after-dlstp.mir 
(+58-56) 
- (modified) 
llvm/test/CodeGen/Thumb2/LowOverheadLoops/multi-block-cond-iter-count.mir 
(+132-112) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/multiple-do-loops.mir 
(+233-193) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/predicated-liveout.mir 
(+38-29) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/remove-elem-moves.mir 
(+97-79) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/skip-debug.mir (+65-56) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/skip-vpt-debug.mir 
(+71-68) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/spillingmove.mir (+7-7) 
- (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/unrolled-and-vector.mir 
(+154-130) 


``diff
diff --git a/llvm/include/llvm/CodeGen/LivePhysRegs.h 
b/llvm/include/llvm/CodeGen/LivePhysRegs.h
index 76bb34d270a26dc..1e0ee9eb9eb3203 100644
--- a/llvm/include/llvm/CodeGen/LivePhysRegs.h
+++ b/llvm/include/llvm/CodeGen/LivePhysRegs.h
@@ -31,6 +31,7 @@
 
 #include "llvm/ADT/SparseSet.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
+#include "llvm/CodeGen/MachineFunction.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/MC/MCRegister.h"
 #include "llvm/MC/MCRegisterInfo.h"
@@ -193,11 +194,31 @@ void addLiveIns(MachineBasicBlock &MBB, const 
LivePhysRegs &LiveRegs);
 void computeAndAddLiveIns(LivePhysRegs &LiveRegs,
   MachineBasicBlock &MBB);
 
-/// Convenience function for recomputing live-in's for \p MBB.
-static inline void recomputeLiveIns(MachineBasicBlock &MBB) {
+/// Function to update the live-in's for a basic block and return whether any
+/// changes were made.
+static inline bool updateBlockLiveInfo(MachineBasicBlock &MBB) {
   LivePhysRegs LPR;
+  auto oldLiveIns = MBB.getLiveIns();
+
   MBB.clearLiveIns();
   computeAndAddLiveIns(LPR, MBB);
+  MBB.sortUniqueLiveIns();
+
+  auto newLiveIns = MBB.getLiveIns();
+  return oldLiveIns != newLiveIns;
+}
+
+/// Convenience function for recomputing live-in's for the entire CFG until
+/// convergence is reached.
+static inline void recomputeLiveIns(MachineFunction &MF) {
+  bool anyChanged;
+  do {
+anyChanged = false;
+for (auto MFI = MF.rbegin(), MFE = MF.rend(); MFI != MFE; ++MFI) {
+  MachineBasicBlock &MBB = *MFI;
+  anyChanged |= updateBlock

[llvm-branch-commits] [llvm] Refactor recomputeLiveIns to operate on whole CFG (#79498) (PR #79641)

2024-01-26 Thread Oskar Wirga via llvm-branch-commits

https://github.com/oskarwirga edited 
https://github.com/llvm/llvm-project/pull/79641
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Refactor recomputeLiveIns to operate on whole CFG (#79498) (PR #79641)

2024-01-26 Thread Oskar Wirga via llvm-branch-commits

https://github.com/oskarwirga milestoned 
https://github.com/llvm/llvm-project/pull/79641
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [lld] [RISCV] Support RISC-V TLSDESC in LLD (PR #77516)

2024-01-26 Thread Paul Kirth via llvm-branch-commits

https://github.com/ilovepi closed 
https://github.com/llvm/llvm-project/pull/77516
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [lld] [RISCV] Support RISC-V TLSDESC in LLD (PR #77516)

2024-01-26 Thread Paul Kirth via llvm-branch-commits

ilovepi wrote:

Abandon in favor of https://github.com/llvm/llvm-project/pull/79239

https://github.com/llvm/llvm-project/pull/77516
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79632 (PR #79633)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: None (github-actions[bot])


Changes

resolves llvm/llvm-project#79632

---
Full diff: https://github.com/llvm/llvm-project/pull/79633.diff


9 Files Affected:

- (modified) clang/include/clang/Basic/CodeGenOptions.def (+3) 
- (modified) clang/include/clang/Driver/Options.td (+5) 
- (modified) clang/lib/CodeGen/BackendUtil.cpp (+1) 
- (modified) clang/lib/Driver/ToolChains/Clang.cpp (+3) 
- (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (+30) 
- (modified) clang/lib/Driver/ToolChains/CommonArgs.h (+3) 
- (added) clang/test/CodeGen/RISCV/tls-dialect.c (+14) 
- (added) clang/test/Driver/tls-dialect.c (+25) 
- (modified) llvm/include/llvm/TargetParser/Triple.h (+3-3) 


``diff
diff --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
index 2f2e45d5cf63dfa..7c0bfe328496147 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -369,6 +369,9 @@ ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, 
llvm::driver::VectorLibr
 /// The default TLS model to use.
 ENUM_CODEGENOPT(DefaultTLSModel, TLSModel, 2, GeneralDynamicTLSModel)
 
+/// Whether to enable TLSDESC. AArch64 enables TLSDESC regardless of this 
value.
+CODEGENOPT(EnableTLSDESC, 1, 0)
+
 /// Bit size of immediate TLS offsets (0 == use the default).
 VALUE_CODEGENOPT(TLSSize, 8, 0)
 
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 7f4fa33748facaf..773bc1dcda01d5c 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4419,6 +4419,8 @@ def mtls_size_EQ : Joined<["-"], "mtls-size=">, 
Group,
   HelpText<"Specify bit size of immediate TLS offsets (AArch64 ELF only): "
"12 (for 4KB) | 24 (for 16MB, default) | 32 (for 4GB) | 48 (for 
256TB, needs -mcmodel=large)">,
   MarshallingInfoInt>;
+def mtls_dialect_EQ : Joined<["-"], "mtls-dialect=">, Group,
+  Flags<[TargetSpecific]>, HelpText<"Which thread-local storage dialect to use 
for dynamic accesses of TLS variables">;
 def mimplicit_it_EQ : Joined<["-"], "mimplicit-it=">, Group;
 def mdefault_build_attributes : Joined<["-"], "mdefault-build-attributes">, 
Group;
 def mno_default_build_attributes : Joined<["-"], 
"mno-default-build-attributes">, Group;
@@ -7066,6 +7068,9 @@ def fexperimental_assignment_tracking_EQ : Joined<["-"], 
"fexperimental-assignme
   Values<"disabled,enabled,forced">, 
NormalizedValues<["Disabled","Enabled","Forced"]>,
   MarshallingInfoEnum, "Enabled">;
 
+def enable_tlsdesc : Flag<["-"], "enable-tlsdesc">,
+  MarshallingInfoFlag>;
+
 } // let Visibility = [CC1Option]
 
 
//===--===//
diff --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index ec203f6f28bc173..7877e20d77f7724 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -401,6 +401,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
   Options.UniqueBasicBlockSectionNames =
   CodeGenOpts.UniqueBasicBlockSectionNames;
   Options.TLSSize = CodeGenOpts.TLSSize;
+  Options.EnableTLSDESC = CodeGenOpts.EnableTLSDESC;
   Options.EmulatedTLS = CodeGenOpts.EmulatedTLS;
   Options.DebuggerTuning = CodeGenOpts.getDebuggerTuning();
   Options.EmitStackSizeSection = CodeGenOpts.StackSizeSection;
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 5dc614e11aab599..8092fc050b0ee6d 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5822,6 +5822,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
 Args.AddLastArg(CmdArgs, options::OPT_mtls_size_EQ);
   }
 
+  if (isTLSDESCEnabled(TC, Args))
+CmdArgs.push_back("-enable-tlsdesc");
+
   // Add the target cpu
   std::string CPU = getCPUName(D, Args, Triple, /*FromAs*/ false);
   if (!CPU.empty()) {
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp 
b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index fadaf3e60c6616a..ff4047298d70d52 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -729,6 +729,33 @@ bool tools::isUseSeparateSections(const llvm::Triple 
&Triple) {
   return Triple.isPS();
 }
 
+bool tools::isTLSDESCEnabled(const ToolChain &TC,
+ const llvm::opt::ArgList &Args) {
+  const llvm::Triple &Triple = TC.getEffectiveTriple();
+  Arg *A = Args.getLastArg(options::OPT_mtls_dialect_EQ);
+  if (!A)
+return Triple.hasDefaultTLSDESC();
+  StringRef V = A->getValue();
+  bool SupportedArgument = false, EnableTLSDESC = false;
+  bool Unsupported = !Triple.isOSBinFormatELF();
+  if (Triple.isRISCV()) {
+SupportedArgument = V == "desc" || V == "trad";
+EnableTLSDESC = V == "desc";
+  } else 

[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79632 (PR #79633)

2024-01-26 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/79633

>From b373a525347ab112ae4b73e747529b48b898d394 Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Fri, 26 Jan 2024 09:25:38 -0800
Subject: [PATCH] [Driver,CodeGen] Support -mtls-dialect= (#79256)

GCC supports -mtls-dialect= for several architectures to select TLSDESC.
This patch supports the following values

* x86: "gnu". "gnu2" (TLSDESC) is not supported yet.
* RISC-V: "trad" (general dynamic), "desc" (TLSDESC, see #66915)

AArch64 toolchains seem to support TLSDESC from the beginning, and the
general dynamic model has poor support. Nobody seems to use the option
-mtls-dialect= at all, so we don't bother with it.
There also seems very little interest in AArch32's TLSDESC support.

TLSDESC does not change IR, but affects object file generation. Without
a backend option the option is a no-op for in-process ThinLTO.

There seems no motivation to have fine-grained control mixing trad/desc
for TLS, so we just pass -mllvm, and don't bother with a modules flag
metadata or function attribute.

Co-authored-by: Paul Kirth 
(cherry picked from commit 36b4a9ccd9f7e04010476e6b2a311f2052a4ac20)
---
 clang/include/clang/Basic/CodeGenOptions.def |  3 ++
 clang/include/clang/Driver/Options.td|  5 
 clang/lib/CodeGen/BackendUtil.cpp|  1 +
 clang/lib/Driver/ToolChains/Clang.cpp|  3 ++
 clang/lib/Driver/ToolChains/CommonArgs.cpp   | 30 
 clang/lib/Driver/ToolChains/CommonArgs.h |  3 ++
 clang/test/CodeGen/RISCV/tls-dialect.c   | 14 +
 clang/test/Driver/tls-dialect.c  | 25 
 llvm/include/llvm/TargetParser/Triple.h  |  6 ++--
 9 files changed, 87 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGen/RISCV/tls-dialect.c
 create mode 100644 clang/test/Driver/tls-dialect.c

diff --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
index 2f2e45d5cf63dfa..7c0bfe328496147 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -369,6 +369,9 @@ ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, 
llvm::driver::VectorLibr
 /// The default TLS model to use.
 ENUM_CODEGENOPT(DefaultTLSModel, TLSModel, 2, GeneralDynamicTLSModel)
 
+/// Whether to enable TLSDESC. AArch64 enables TLSDESC regardless of this 
value.
+CODEGENOPT(EnableTLSDESC, 1, 0)
+
 /// Bit size of immediate TLS offsets (0 == use the default).
 VALUE_CODEGENOPT(TLSSize, 8, 0)
 
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 7f4fa33748facaf..773bc1dcda01d5c 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4419,6 +4419,8 @@ def mtls_size_EQ : Joined<["-"], "mtls-size=">, 
Group,
   HelpText<"Specify bit size of immediate TLS offsets (AArch64 ELF only): "
"12 (for 4KB) | 24 (for 16MB, default) | 32 (for 4GB) | 48 (for 
256TB, needs -mcmodel=large)">,
   MarshallingInfoInt>;
+def mtls_dialect_EQ : Joined<["-"], "mtls-dialect=">, Group,
+  Flags<[TargetSpecific]>, HelpText<"Which thread-local storage dialect to use 
for dynamic accesses of TLS variables">;
 def mimplicit_it_EQ : Joined<["-"], "mimplicit-it=">, Group;
 def mdefault_build_attributes : Joined<["-"], "mdefault-build-attributes">, 
Group;
 def mno_default_build_attributes : Joined<["-"], 
"mno-default-build-attributes">, Group;
@@ -7066,6 +7068,9 @@ def fexperimental_assignment_tracking_EQ : Joined<["-"], 
"fexperimental-assignme
   Values<"disabled,enabled,forced">, 
NormalizedValues<["Disabled","Enabled","Forced"]>,
   MarshallingInfoEnum, "Enabled">;
 
+def enable_tlsdesc : Flag<["-"], "enable-tlsdesc">,
+  MarshallingInfoFlag>;
+
 } // let Visibility = [CC1Option]
 
 
//===--===//
diff --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index ec203f6f28bc173..7877e20d77f7724 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -401,6 +401,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
   Options.UniqueBasicBlockSectionNames =
   CodeGenOpts.UniqueBasicBlockSectionNames;
   Options.TLSSize = CodeGenOpts.TLSSize;
+  Options.EnableTLSDESC = CodeGenOpts.EnableTLSDESC;
   Options.EmulatedTLS = CodeGenOpts.EmulatedTLS;
   Options.DebuggerTuning = CodeGenOpts.getDebuggerTuning();
   Options.EmitStackSizeSection = CodeGenOpts.StackSizeSection;
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 5dc614e11aab599..8092fc050b0ee6d 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5822,6 +5822,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
 Args.AddLastArg(CmdArgs, options::OPT_mtls_size_EQ);
   }
 
+  if (isTLSDESCEnabled(TC, Args))
+CmdAr

[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79632 (PR #79633)

2024-01-26 Thread Paul Kirth via llvm-branch-commits

https://github.com/ilovepi approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/79633
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79632 (PR #79633)

2024-01-26 Thread via llvm-branch-commits

github-actions[bot] wrote:

@ilovepi What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/79633
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79632 (PR #79633)

2024-01-26 Thread via llvm-branch-commits

https://github.com/github-actions[bot] milestoned 
https://github.com/llvm/llvm-project/pull/79633
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79632 (PR #79633)

2024-01-26 Thread via llvm-branch-commits

https://github.com/github-actions[bot] created 
https://github.com/llvm/llvm-project/pull/79633

resolves llvm/llvm-project#79632

>From 59c20ec7892a93d96094df91d9aea0c9c5304566 Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Fri, 26 Jan 2024 09:25:38 -0800
Subject: [PATCH] [Driver,CodeGen] Support -mtls-dialect= (#79256)

GCC supports -mtls-dialect= for several architectures to select TLSDESC.
This patch supports the following values

* x86: "gnu". "gnu2" (TLSDESC) is not supported yet.
* RISC-V: "trad" (general dynamic), "desc" (TLSDESC, see #66915)

AArch64 toolchains seem to support TLSDESC from the beginning, and the
general dynamic model has poor support. Nobody seems to use the option
-mtls-dialect= at all, so we don't bother with it.
There also seems very little interest in AArch32's TLSDESC support.

TLSDESC does not change IR, but affects object file generation. Without
a backend option the option is a no-op for in-process ThinLTO.

There seems no motivation to have fine-grained control mixing trad/desc
for TLS, so we just pass -mllvm, and don't bother with a modules flag
metadata or function attribute.

Co-authored-by: Paul Kirth 
(cherry picked from commit 36b4a9ccd9f7e04010476e6b2a311f2052a4ac20)
---
 clang/include/clang/Basic/CodeGenOptions.def |  3 ++
 clang/include/clang/Driver/Options.td|  5 
 clang/lib/CodeGen/BackendUtil.cpp|  1 +
 clang/lib/Driver/ToolChains/Clang.cpp|  3 ++
 clang/lib/Driver/ToolChains/CommonArgs.cpp   | 30 
 clang/lib/Driver/ToolChains/CommonArgs.h |  3 ++
 clang/test/CodeGen/RISCV/tls-dialect.c   | 14 +
 clang/test/Driver/tls-dialect.c  | 25 
 llvm/include/llvm/TargetParser/Triple.h  |  6 ++--
 9 files changed, 87 insertions(+), 3 deletions(-)
 create mode 100644 clang/test/CodeGen/RISCV/tls-dialect.c
 create mode 100644 clang/test/Driver/tls-dialect.c

diff --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
index 2f2e45d5cf63df..7c0bfe32849614 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -369,6 +369,9 @@ ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, 
llvm::driver::VectorLibr
 /// The default TLS model to use.
 ENUM_CODEGENOPT(DefaultTLSModel, TLSModel, 2, GeneralDynamicTLSModel)
 
+/// Whether to enable TLSDESC. AArch64 enables TLSDESC regardless of this 
value.
+CODEGENOPT(EnableTLSDESC, 1, 0)
+
 /// Bit size of immediate TLS offsets (0 == use the default).
 VALUE_CODEGENOPT(TLSSize, 8, 0)
 
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 7f4fa33748faca..773bc1dcda01d5 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4419,6 +4419,8 @@ def mtls_size_EQ : Joined<["-"], "mtls-size=">, 
Group,
   HelpText<"Specify bit size of immediate TLS offsets (AArch64 ELF only): "
"12 (for 4KB) | 24 (for 16MB, default) | 32 (for 4GB) | 48 (for 
256TB, needs -mcmodel=large)">,
   MarshallingInfoInt>;
+def mtls_dialect_EQ : Joined<["-"], "mtls-dialect=">, Group,
+  Flags<[TargetSpecific]>, HelpText<"Which thread-local storage dialect to use 
for dynamic accesses of TLS variables">;
 def mimplicit_it_EQ : Joined<["-"], "mimplicit-it=">, Group;
 def mdefault_build_attributes : Joined<["-"], "mdefault-build-attributes">, 
Group;
 def mno_default_build_attributes : Joined<["-"], 
"mno-default-build-attributes">, Group;
@@ -7066,6 +7068,9 @@ def fexperimental_assignment_tracking_EQ : Joined<["-"], 
"fexperimental-assignme
   Values<"disabled,enabled,forced">, 
NormalizedValues<["Disabled","Enabled","Forced"]>,
   MarshallingInfoEnum, "Enabled">;
 
+def enable_tlsdesc : Flag<["-"], "enable-tlsdesc">,
+  MarshallingInfoFlag>;
+
 } // let Visibility = [CC1Option]
 
 
//===--===//
diff --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index ec203f6f28bc17..7877e20d77f772 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -401,6 +401,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
   Options.UniqueBasicBlockSectionNames =
   CodeGenOpts.UniqueBasicBlockSectionNames;
   Options.TLSSize = CodeGenOpts.TLSSize;
+  Options.EnableTLSDESC = CodeGenOpts.EnableTLSDESC;
   Options.EmulatedTLS = CodeGenOpts.EmulatedTLS;
   Options.DebuggerTuning = CodeGenOpts.getDebuggerTuning();
   Options.EmitStackSizeSection = CodeGenOpts.StackSizeSection;
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 5dc614e11aab59..8092fc050b0ee6 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5822,6 +5822,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
 Args.AddLastArg(CmdArgs, options::OPT_mtls_size_EQ);
   }
 
+  if (

[llvm-branch-commits] [openmp] [clang] [lldb] [mlir] [libcxx] [clang-tools-extra] [libc] [flang] [llvm] [BOLT] Write and parse BF/BB hashes in BAT (PR #76907)

2024-01-26 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov converted_to_draft 
https://github.com/llvm/llvm-project/pull/76907
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [clang] [lldb] [mlir] [llvm] [libc] [flang] [clang-tools-extra] [openmp] [BOLT] Write and parse BF/BB hashes in BAT (PR #76907)

2024-01-26 Thread Amir Ayupov via llvm-branch-commits


@@ -36,9 +37,12 @@ void BoltAddressTranslation::writeEntriesForBB(MapTy &Map,
   if (BBInputOffset == BinaryBasicBlock::INVALID_OFFSET)
 return;
 
-  LLVM_DEBUG(dbgs() << "BB " << BB.getName() << "\n");
-  LLVM_DEBUG(dbgs() << "  Key: " << Twine::utohexstr(BBOutputOffset)
-<< " Val: " << Twine::utohexstr(BBInputOffset) << "\n");
+  LLVM_DEBUG(dbgs() << "BB " << BB.getName() << "\n"

aaupov wrote:

Not in this diff. Hashes will be output to YAML in follow up diffs starting 
with https://github.com/llvm/llvm-project/pull/76910.

https://github.com/llvm/llvm-project/pull/76907
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [clang] Revert "[SemaCXX] Implement CWG2137 (list-initialization from objects of the same type) (#77768)" in release/18.x (PR #79400)

2024-01-26 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic milestoned 
https://github.com/llvm/llvm-project/pull/79400
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79420 (PR #79595)

2024-01-26 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/79595
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)

2024-01-26 Thread Christian Ulmann via llvm-branch-commits

Dinistro wrote:

Here we have to wait for the build bots, as they are mandatory. 

https://github.com/llvm/llvm-project/pull/79603
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)

2024-01-26 Thread Andrei Golubev via llvm-branch-commits

andrey-golubev wrote:

LGTM. But surely we need someone with commit access (sigh). CC @Dinistro 
@zero9178

https://github.com/llvm/llvm-project/pull/79603
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)

2024-01-26 Thread via llvm-branch-commits

https://github.com/github-actions[bot] milestoned 
https://github.com/llvm/llvm-project/pull/79603
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)

2024-01-26 Thread via llvm-branch-commits

github-actions[bot] wrote:

@andrey-golubev What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/79603
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)

2024-01-26 Thread via llvm-branch-commits

https://github.com/github-actions[bot] created 
https://github.com/llvm/llvm-project/pull/79603

resolves llvm/llvm-project#79600

>From 7732d51a75a991f3e49d22e2ea6478c40025944d Mon Sep 17 00:00:00 2001
From: Andrei Golubev 
Date: Fri, 26 Jan 2024 15:27:51 +0200
Subject: [PATCH] [mlir][LLVM] Use int32_t to indirectly construct GEPArg
 (#79562)

GEPArg can only be constructed from int32_t and mlir::Value. Explicitly
cast other types (e.g. unsigned, size_t) to int32_t to avoid narrowing
conversion warnings on MSVC. Some recent examples of such are:

```
mlir\lib\Dialect\LLVMIR\Transforms\TypeConsistency.cpp: error C2398:
Element '1': conversion from 'size_t' to 'T' requires a narrowing
conversion
with
[
T=mlir::LLVM::GEPArg
]

mlir\lib\Dialect\LLVMIR\Transforms\TypeConsistency.cpp: error C2398:
Element '1': conversion from 'unsigned int' to 'T' requires a narrowing
conversion
with
[
T=mlir::LLVM::GEPArg
]
```

Co-authored-by: Nikita Kudriavtsev 
(cherry picked from commit 89cd345667a5f8f4c37c621fd8abe8d84e85c050)
---
 mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp   | 3 ++-
 mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp  | 9 +
 mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp | 9 +
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp 
b/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp
index ae2bd8e5b5405d9..73d418cb8413276 100644
--- a/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp
+++ b/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp
@@ -529,7 +529,8 @@ LogicalResult GPUPrintfOpToVPrintfLowering::matchAndRewrite(
   /*alignment=*/0);
   for (auto [index, arg] : llvm::enumerate(args)) {
 Value ptr = rewriter.create(
-loc, ptrType, structType, tempAlloc, ArrayRef{0, index});
+loc, ptrType, structType, tempAlloc,
+ArrayRef{0, static_cast(index)});
 rewriter.create(loc, arg, ptr);
   }
   std::array printfArgs = {stringStart, tempAlloc};
diff --git a/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp 
b/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
index f853d5c47b623cf..78d4e8062468720 100644
--- a/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
+++ b/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp
@@ -1041,13 +1041,14 @@ Value 
ConvertLaunchFuncOpToGpuRuntimeCallPattern::generateParamsArray(
   auto arrayPtr = builder.create(
   loc, llvmPointerType, llvmPointerType, arraySize, /*alignment=*/0);
   for (const auto &en : llvm::enumerate(arguments)) {
+const auto index = static_cast(en.index());
 Value fieldPtr =
 builder.create(loc, llvmPointerType, structType, 
structPtr,
-ArrayRef{0, en.index()});
+ArrayRef{0, index});
 builder.create(loc, en.value(), fieldPtr);
-auto elementPtr = builder.create(
-loc, llvmPointerType, llvmPointerType, arrayPtr,
-ArrayRef{en.index()});
+auto elementPtr =
+builder.create(loc, llvmPointerType, llvmPointerType,
+arrayPtr, ArrayRef{index});
 builder.create(loc, fieldPtr, elementPtr);
   }
   return arrayPtr;
diff --git a/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp 
b/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp
index 72f9295749a66ba..b25c831bc7172a3 100644
--- a/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp
+++ b/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp
@@ -488,7 +488,8 @@ static void splitVectorStore(const DataLayout &dataLayout, 
Location loc,
 // Other patterns will turn this into a type-consistent GEP.
 auto gepOp = rewriter.create(
 loc, address.getType(), rewriter.getI8Type(), address,
-ArrayRef{storeOffset + index * elementSize});
+ArrayRef{
+static_cast(storeOffset + index * elementSize)});
 
 rewriter.create(loc, extractOp, gepOp);
   }
@@ -524,9 +525,9 @@ static void splitIntegerStore(const DataLayout &dataLayout, 
Location loc,
 
 // We create an `i8` indexed GEP here as that is the easiest (offset is
 // already known). Other patterns turn this into a type-consistent GEP.
-auto gepOp =
-rewriter.create(loc, address.getType(), rewriter.getI8Type(),
-   address, ArrayRef{currentOffset});
+auto gepOp = rewriter.create(
+loc, address.getType(), rewriter.getI8Type(), address,
+ArrayRef{static_cast(currentOffset)});
 rewriter.create(loc, valueToStore, gepOp);
 
 // No need to care about padding here since we already checked previously

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [libcxx] Revert "[SemaCXX] Implement CWG2137 (list-initialization from objects of the same type) (#77768)" in release/18.x (PR #79400)

2024-01-26 Thread Erich Keane via llvm-branch-commits

https://github.com/erichkeane approved this pull request.


https://github.com/llvm/llvm-project/pull/79400
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [libcxx] Revert "[SemaCXX] Implement CWG2137 (list-initialization from objects of the same type) (#77768)" in release/18.x (PR #79400)

2024-01-26 Thread Erich Keane via llvm-branch-commits

erichkeane wrote:

Author seems to have disappeared, so approving.

https://github.com/llvm/llvm-project/pull/79400
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79479 (PR #79596)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Nikita Popov (nikic)


Changes

Resolves #79479.

---

Patch is 91.93 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/79596.diff


20 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+2) 
- (modified) clang/include/clang/AST/Type.h (+3) 
- (modified) clang/include/clang/Basic/AttrDocs.td (+4-1) 
- (modified) clang/lib/AST/ASTContext.cpp (+16-4) 
- (modified) clang/lib/AST/ItaniumMangle.cpp (+17-8) 
- (modified) clang/lib/AST/JSONNodeDumper.cpp (+3) 
- (modified) clang/lib/AST/TextNodeDumper.cpp (+3) 
- (modified) clang/lib/AST/Type.cpp (+14-1) 
- (modified) clang/lib/AST/TypePrinter.cpp (+2) 
- (modified) clang/lib/CodeGen/Targets/RISCV.cpp (+15-6) 
- (modified) clang/lib/Sema/SemaExpr.cpp (+4-2) 
- (modified) clang/lib/Sema/SemaType.cpp (+15-6) 
- (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-bitcast.c (+100) 
- (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-call.c (+74) 
- (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-cast.c (+72-4) 
- (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-codegen.c (+172) 
- (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-globals.c (+107) 
- (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-types.c (+284) 
- (modified) clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp (+72) 
- (modified) clang/test/Sema/attr-riscv-rvv-vector-bits.c (+86-2) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 060bc7669b72a5e..45d1ab34d0f9311 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1227,6 +1227,8 @@ RISC-V Support
 - Default ABI with F but without D was changed to ilp32f for RV32 and to lp64f
   for RV64.
 
+- ``__attribute__((rvv_vector_bits(N))) is now supported for RVV vbool*_t 
types.
+
 CUDA/HIP Language Changes
 ^
 
diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h
index ea425791fc97f05..6384cf9420b82e1 100644
--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -3495,6 +3495,9 @@ enum class VectorKind {
 
   /// is RISC-V RVV fixed-length data vector
   RVVFixedLengthData,
+
+  /// is RISC-V RVV fixed-length mask vector
+  RVVFixedLengthMask,
 };
 
 /// Represents a GCC generic vector type. This type is created using
diff --git a/clang/include/clang/Basic/AttrDocs.td 
b/clang/include/clang/Basic/AttrDocs.td
index 7e633f8e2635a9a..e02a1201e2ad79a 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -2424,7 +2424,10 @@ only be a power of 2 between 64 and 65536.
 For types where LMUL!=1, ``__riscv_v_fixed_vlen`` needs to be scaled by the 
LMUL
 of the type before passing to the attribute.
 
-``vbool*_t`` types are not supported at this time.
+For ``vbool*_t`` types, ``__riscv_v_fixed_vlen`` needs to be divided by the
+number from the type name. For example, ``vbool8_t`` needs to use
+``__riscv_v_fixed_vlen`` / 8. If the resulting value is not a multiple of 8,
+the type is not supported for that value of ``__riscv_v_fixed_vlen``.
 }];
 }
 
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 5eb7aa3664569dd..ab16ca10395fa83 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -1945,7 +1945,8 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const 
{
 else if (VT->getVectorKind() == VectorKind::SveFixedLengthPredicate)
   // Adjust the alignment for fixed-length SVE predicates.
   Align = 16;
-else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData)
+else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData ||
+ VT->getVectorKind() == VectorKind::RVVFixedLengthMask)
   // Adjust the alignment for fixed-length RVV vectors.
   Align = std::min(64, Width);
 break;
@@ -9416,7 +9417,9 @@ bool ASTContext::areCompatibleVectorTypes(QualType 
FirstVec,
   Second->getVectorKind() != VectorKind::SveFixedLengthData &&
   Second->getVectorKind() != VectorKind::SveFixedLengthPredicate &&
   First->getVectorKind() != VectorKind::RVVFixedLengthData &&
-  Second->getVectorKind() != VectorKind::RVVFixedLengthData)
+  Second->getVectorKind() != VectorKind::RVVFixedLengthData &&
+  First->getVectorKind() != VectorKind::RVVFixedLengthMask &&
+  Second->getVectorKind() != VectorKind::RVVFixedLengthMask)
 return true;
 
   return false;
@@ -9522,8 +9525,11 @@ static uint64_t getRVVTypeSize(ASTContext &Context, 
const BuiltinType *Ty) {
 
   ASTContext::BuiltinVectorTypeInfo Info = 
Context.getBuiltinVectorTypeInfo(Ty);
 
-  uint64_t EltSize = Context.getTypeSize(Info.ElementType);
-  uint64_t MinElts = Info.EC.getKnownMinValue();
+  unsigned EltSize = Context.getTypeSize(Info.ElementType);
+  if (Info.ElementType == Context.BoolTy)
+EltSize = 1;
+
+  unsigned MinElts = Info.EC.getKnownMinValue();
   r

[llvm-branch-commits] [clang] PR for llvm/llvm-project#79479 (PR #79596)

2024-01-26 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic created https://github.com/llvm/llvm-project/pull/79596

Resolves #79479.

>From cc135ed1df0d7573b0474d76e1d47236b30cdf36 Mon Sep 17 00:00:00 2001
From: Craig Topper 
Date: Thu, 25 Jan 2024 09:39:29 -0800
Subject: [PATCH] Recommit "[RISCV] Support __riscv_v_fixed_vlen for vbool
 types. (#76551)"

Test updated to expect i8 gep.

Original message:

This adopts a similar behavior to AArch64 SVE, where bool vectors are
represented as a vector of chars with 1/8 the number of elements. This
ensures the vector always occupies a power of 2 number of bytes.

A consequence of this is that vbool64_t, vbool32_t, and vool16_t can
only be used with a vector length that guarantees at least 8 bits.

(cherry picked from commit c92ad411f2f94d8521cd18abcb37285f9a390ecb)
---
 clang/docs/ReleaseNotes.rst   |   2 +
 clang/include/clang/AST/Type.h|   3 +
 clang/include/clang/Basic/AttrDocs.td |   5 +-
 clang/lib/AST/ASTContext.cpp  |  20 +-
 clang/lib/AST/ItaniumMangle.cpp   |  25 +-
 clang/lib/AST/JSONNodeDumper.cpp  |   3 +
 clang/lib/AST/TextNodeDumper.cpp  |   3 +
 clang/lib/AST/Type.cpp|  15 +-
 clang/lib/AST/TypePrinter.cpp |   2 +
 clang/lib/CodeGen/Targets/RISCV.cpp   |  21 +-
 clang/lib/Sema/SemaExpr.cpp   |   6 +-
 clang/lib/Sema/SemaType.cpp   |  21 +-
 .../attr-riscv-rvv-vector-bits-bitcast.c  | 100 ++
 .../CodeGen/attr-riscv-rvv-vector-bits-call.c |  74 +
 .../CodeGen/attr-riscv-rvv-vector-bits-cast.c |  76 -
 .../attr-riscv-rvv-vector-bits-codegen.c  | 172 +++
 .../attr-riscv-rvv-vector-bits-globals.c  | 107 +++
 .../attr-riscv-rvv-vector-bits-types.c| 284 ++
 .../riscv-mangle-rvv-fixed-vectors.cpp|  72 +
 clang/test/Sema/attr-riscv-rvv-vector-bits.c  |  88 +-
 20 files changed, 1065 insertions(+), 34 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 060bc7669b72a5e..45d1ab34d0f9311 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1227,6 +1227,8 @@ RISC-V Support
 - Default ABI with F but without D was changed to ilp32f for RV32 and to lp64f
   for RV64.
 
+- ``__attribute__((rvv_vector_bits(N))) is now supported for RVV vbool*_t 
types.
+
 CUDA/HIP Language Changes
 ^
 
diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h
index ea425791fc97f05..6384cf9420b82e1 100644
--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -3495,6 +3495,9 @@ enum class VectorKind {
 
   /// is RISC-V RVV fixed-length data vector
   RVVFixedLengthData,
+
+  /// is RISC-V RVV fixed-length mask vector
+  RVVFixedLengthMask,
 };
 
 /// Represents a GCC generic vector type. This type is created using
diff --git a/clang/include/clang/Basic/AttrDocs.td 
b/clang/include/clang/Basic/AttrDocs.td
index 7e633f8e2635a9a..e02a1201e2ad79a 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -2424,7 +2424,10 @@ only be a power of 2 between 64 and 65536.
 For types where LMUL!=1, ``__riscv_v_fixed_vlen`` needs to be scaled by the 
LMUL
 of the type before passing to the attribute.
 
-``vbool*_t`` types are not supported at this time.
+For ``vbool*_t`` types, ``__riscv_v_fixed_vlen`` needs to be divided by the
+number from the type name. For example, ``vbool8_t`` needs to use
+``__riscv_v_fixed_vlen`` / 8. If the resulting value is not a multiple of 8,
+the type is not supported for that value of ``__riscv_v_fixed_vlen``.
 }];
 }
 
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 5eb7aa3664569dd..ab16ca10395fa83 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -1945,7 +1945,8 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const 
{
 else if (VT->getVectorKind() == VectorKind::SveFixedLengthPredicate)
   // Adjust the alignment for fixed-length SVE predicates.
   Align = 16;
-else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData)
+else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData ||
+ VT->getVectorKind() == VectorKind::RVVFixedLengthMask)
   // Adjust the alignment for fixed-length RVV vectors.
   Align = std::min(64, Width);
 break;
@@ -9416,7 +9417,9 @@ bool ASTContext::areCompatibleVectorTypes(QualType 
FirstVec,
   Second->getVectorKind() != VectorKind::SveFixedLengthData &&
   Second->getVectorKind() != VectorKind::SveFixedLengthPredicate &&
   First->getVectorKind() != VectorKind::RVVFixedLengthData &&
-  Second->getVectorKind() != VectorKind::RVVFixedLengthData)
+  Second->getVectorKind() != VectorKind::RVVFixedLengthData &&
+  First->getVectorKind() != VectorKind::RVVFixedLengthMask &&
+  Second->getVectorKind() != VectorKind::RVVFix

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79420 (PR #79595)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-selectiondag

Author: Nikita Popov (nikic)


Changes

Resolves #79420.

---
Full diff: https://github.com/llvm/llvm-project/pull/79595.diff


2 Files Affected:

- (modified) llvm/test/TableGen/address-space-patfrags.td (+2-2) 
- (modified) llvm/utils/TableGen/DAGISelMatcherEmitter.cpp (+2-1) 


``diff
diff --git a/llvm/test/TableGen/address-space-patfrags.td 
b/llvm/test/TableGen/address-space-patfrags.td
index 4aec6ea7e0eae86..46050a70720fbe1 100644
--- a/llvm/test/TableGen/address-space-patfrags.td
+++ b/llvm/test/TableGen/address-space-patfrags.td
@@ -46,7 +46,7 @@ def inst_d : Instruction {
   let InOperandList = (ins GPR32:$src0, GPR32:$src1);
 }
 
-// SDAG: case 1: {
+// SDAG: case 0: {
 // SDAG-NEXT: // Predicate_pat_frag_b
 // SDAG-NEXT: // Predicate_truncstorei16_addrspace
 // SDAG-NEXT: SDNode *N = Node;
@@ -69,7 +69,7 @@ def : Pat <
 >;
 
 
-// SDAG: case 6: {
+// SDAG: case 4: {
 // SDAG: // Predicate_pat_frag_a
 // SDAG-NEXT: SDNode *N = Node;
 // SDAG-NEXT: (void)N;
diff --git a/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp 
b/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp
index 455183987b7b27b..50156d34528c153 100644
--- a/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp
+++ b/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp
@@ -57,7 +57,8 @@ class MatcherTableEmitter {
 
   // We de-duplicate the predicates by code string, and use this map to track
   // all the patterns with "identical" predicates.
-  StringMap> NodePredicatesByCodeToRun;
+  MapVector, StringMap>
+  NodePredicatesByCodeToRun;
 
   std::vector PatternPredicates;
 

``




https://github.com/llvm/llvm-project/pull/79595
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79420 (PR #79595)

2024-01-26 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic created https://github.com/llvm/llvm-project/pull/79595

Resolves #79420.

>From fb66a8484904a1e585c0e54553c1c8b5e5d13dd2 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Thu, 25 Jan 2024 16:16:19 +0800
Subject: [PATCH] [TableGen] Use MapVector to remove non-determinism

This fixes found non-determinism when `LLVM_REVERSE_ITERATION`
option is `ON`.

Fixes #79420.

Reviewers: ilovepi, MaskRay

Reviewed By: MaskRay

Pull Request: https://github.com/llvm/llvm-project/pull/79411

(cherry picked from commit 41fe98a6e7e5cdcab4a4e9e0d09339231f480c01)
---
 llvm/test/TableGen/address-space-patfrags.td  | 4 ++--
 llvm/utils/TableGen/DAGISelMatcherEmitter.cpp | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/llvm/test/TableGen/address-space-patfrags.td 
b/llvm/test/TableGen/address-space-patfrags.td
index 4aec6ea7e0eae86..46050a70720fbe1 100644
--- a/llvm/test/TableGen/address-space-patfrags.td
+++ b/llvm/test/TableGen/address-space-patfrags.td
@@ -46,7 +46,7 @@ def inst_d : Instruction {
   let InOperandList = (ins GPR32:$src0, GPR32:$src1);
 }
 
-// SDAG: case 1: {
+// SDAG: case 0: {
 // SDAG-NEXT: // Predicate_pat_frag_b
 // SDAG-NEXT: // Predicate_truncstorei16_addrspace
 // SDAG-NEXT: SDNode *N = Node;
@@ -69,7 +69,7 @@ def : Pat <
 >;
 
 
-// SDAG: case 6: {
+// SDAG: case 4: {
 // SDAG: // Predicate_pat_frag_a
 // SDAG-NEXT: SDNode *N = Node;
 // SDAG-NEXT: (void)N;
diff --git a/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp 
b/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp
index 455183987b7b27b..50156d34528c153 100644
--- a/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp
+++ b/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp
@@ -57,7 +57,8 @@ class MatcherTableEmitter {
 
   // We de-duplicate the predicates by code string, and use this map to track
   // all the patterns with "identical" predicates.
-  StringMap> NodePredicatesByCodeToRun;
+  MapVector, StringMap>
+  NodePredicatesByCodeToRun;
 
   std::vector PatternPredicates;
 

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [mlir] [llvm] PR for llvm/llvm-project#79293 (PR #79461)

2024-01-26 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/79461
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79571 (PR #79572)

2024-01-26 Thread via llvm-branch-commits

https://github.com/github-actions[bot] created 
https://github.com/llvm/llvm-project/pull/79572

resolves llvm/llvm-project#79571

>From 14afe95f8f9bdfb5474445e5930e38afe1ba9f60 Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Wed, 24 Jan 2024 10:15:42 +0100
Subject: [PATCH] [MSSAUpdater] Handle simplified accesses when updating phis
 (#78272)

This is a followup to #76819. After those changes, we can still run into
an assertion failure for a slight variation of the test case: When
fixing up MemoryPhis, we map the incoming access to the access of the
cloned instruction -- which may now no longer exist.

Fix this by reusing the getNewDefiningAccessForClone() helper, which
will look upwards for a new defining access in that case.

(cherry picked from commit a7a1b8b17e264fb0f2d2b4165cf9a7f5094b08b3)
---
 llvm/lib/Analysis/MemorySSAUpdater.cpp|  22 +---
 .../memssa-readnone-access.ll | 104 ++
 2 files changed, 107 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Analysis/MemorySSAUpdater.cpp 
b/llvm/lib/Analysis/MemorySSAUpdater.cpp
index e87ae7d71fffe20..aa550f0b6a7bfd6 100644
--- a/llvm/lib/Analysis/MemorySSAUpdater.cpp
+++ b/llvm/lib/Analysis/MemorySSAUpdater.cpp
@@ -692,25 +692,9 @@ void MemorySSAUpdater::updateForClonedLoop(const 
LoopBlocksRPO &LoopBlocks,
 continue;
 
   // Determine incoming value and add it as incoming from IncBB.
-  if (MemoryUseOrDef *IncMUD = dyn_cast(IncomingAccess)) {
-if (!MSSA->isLiveOnEntryDef(IncMUD)) {
-  Instruction *IncI = IncMUD->getMemoryInst();
-  assert(IncI && "Found MemoryUseOrDef with no Instruction.");
-  if (Instruction *NewIncI =
-  cast_or_null(VMap.lookup(IncI))) {
-IncMUD = MSSA->getMemoryAccess(NewIncI);
-assert(IncMUD &&
-   "MemoryUseOrDef cannot be null, all preds processed.");
-  }
-}
-NewPhi->addIncoming(IncMUD, IncBB);
-  } else {
-MemoryPhi *IncPhi = cast(IncomingAccess);
-if (MemoryAccess *NewDefPhi = MPhiMap.lookup(IncPhi))
-  NewPhi->addIncoming(NewDefPhi, IncBB);
-else
-  NewPhi->addIncoming(IncPhi, IncBB);
-  }
+  NewPhi->addIncoming(
+  getNewDefiningAccessForClone(IncomingAccess, VMap, MPhiMap, MSSA),
+  IncBB);
 }
 if (auto *SingleAccess = onlySingleValue(NewPhi)) {
   MPhiMap[Phi] = SingleAccess;
diff --git a/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll 
b/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll
index 2aaf777683e116f..c6e6608d4be383a 100644
--- a/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll
+++ b/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll
@@ -115,3 +115,107 @@ split:
 exit:
   ret void
 }
+
+; Variants of the above test with swapped branch destinations.
+
+define void @test1_swapped(i1 %c) {
+; CHECK-LABEL: define void @test1_swapped(
+; CHECK-SAME: i1 [[C:%.*]]) {
+; CHECK-NEXT:  start:
+; CHECK-NEXT:[[C_FR:%.*]] = freeze i1 [[C]]
+; CHECK-NEXT:br i1 [[C_FR]], label [[START_SPLIT_US:%.*]], label 
[[START_SPLIT:%.*]]
+; CHECK:   start.split.us:
+; CHECK-NEXT:br label [[LOOP_US:%.*]]
+; CHECK:   loop.us:
+; CHECK-NEXT:call void @foo()
+; CHECK-NEXT:br label [[LOOP_US]]
+; CHECK:   start.split:
+; CHECK-NEXT:br label [[LOOP:%.*]]
+; CHECK:   loop:
+; CHECK-NEXT:call void @foo()
+; CHECK-NEXT:br label [[EXIT:%.*]]
+; CHECK:   exit:
+; CHECK-NEXT:ret void
+;
+start:
+  br label %loop
+
+loop:
+  %fn = load ptr, ptr @vtable, align 8
+  call void %fn()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test2_swapped(i1 %c, ptr %p) {
+; CHECK-LABEL: define void @test2_swapped(
+; CHECK-SAME: i1 [[C:%.*]], ptr [[P:%.*]]) {
+; CHECK-NEXT:[[C_FR:%.*]] = freeze i1 [[C]]
+; CHECK-NEXT:br i1 [[C_FR]], label [[DOTSPLIT_US:%.*]], label 
[[DOTSPLIT:%.*]]
+; CHECK:   .split.us:
+; CHECK-NEXT:br label [[LOOP_US:%.*]]
+; CHECK:   loop.us:
+; CHECK-NEXT:call void @foo()
+; CHECK-NEXT:call void @bar()
+; CHECK-NEXT:br label [[LOOP_US]]
+; CHECK:   .split:
+; CHECK-NEXT:br label [[LOOP:%.*]]
+; CHECK:   loop:
+; CHECK-NEXT:call void @foo()
+; CHECK-NEXT:call void @bar()
+; CHECK-NEXT:br label [[EXIT:%.*]]
+; CHECK:   exit:
+; CHECK-NEXT:ret void
+;
+  br label %loop
+
+loop:
+  %fn = load ptr, ptr @vtable, align 8
+  call void %fn()
+  call void @bar()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test3_swapped(i1 %c, ptr %p) {
+; CHECK-LABEL: define void @test3_swapped(
+; CHECK-SAME: i1 [[C:%.*]], ptr [[P:%.*]]) {
+; CHECK-NEXT:[[C_FR:%.*]] = freeze i1 [[C]]
+; CHECK-NEXT:br i1 [[C_FR]], label [[DOTSPLIT_US:%.*]], label 
[[DOTSPLIT:%.*]]
+; CHECK:   .split.us:
+; CHECK-NEXT:br label [[LOOP_US:%.*]]
+; CHECK:   loop.us:
+; CHECK

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79571 (PR #79572)

2024-01-26 Thread via llvm-branch-commits

github-actions[bot] wrote:

@alinas What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/79572
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79571 (PR #79572)

2024-01-26 Thread via llvm-branch-commits

https://github.com/github-actions[bot] milestoned 
https://github.com/llvm/llvm-project/pull/79572
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [mlir] PR for llvm/llvm-project#79293 (PR #79461)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mlir-llvm

Author: None (github-actions[bot])


Changes

resolves llvm/llvm-project#79293

---

Patch is 1006.42 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/79461.diff


65 Files Affected:

- (modified) clang/include/clang/Basic/BuiltinsAMDGPU.def (+62) 
- (modified) clang/lib/CodeGen/CGBuiltin.cpp (+164-13) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl (+156) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w64.cl (+155) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w32.cl (+135) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w64.cl (+134) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32-gfx10-err.cl 
(+1-1) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32.cl (+8-9) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w64.cl (+8-10) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w32.cl (+107) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w64.cl (+104) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w32.cl (+110) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w64.cl (+109) 
- (modified) llvm/include/llvm/IR/IntrinsicsAMDGPU.td (+90-25) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUGISel.td (+24) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp (+330) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h (+10) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+213) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h (+13) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+23) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (+16) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td (+16) 
- (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+105-6) 
- (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+8) 
- (modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (+16-3) 
- (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+37) 
- (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h (+4) 
- (modified) llvm/lib/Target/AMDGPU/SIDefines.h (+3) 
- (modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+1) 
- (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+30) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrFormats.td (+5) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+8) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+11) 
- (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.td (+5) 
- (modified) llvm/lib/Target/AMDGPU/VOP3PInstructions.td (+494-6) 
- (modified) llvm/lib/Target/AMDGPU/VOPInstructions.td (+3) 
- (modified) llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll 
(+117-18) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll 
(+504) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-imm.ll (+519) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-iu-modifiers.ll 
(+309) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-swmmac-index_key.ll (+321) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32.ll (+370) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll 
(+459) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-imm.ll (+430) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-iu-modifiers.ll 
(+274) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-swmmac-index_key.ll (+472) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64.ll (+333) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll 
(+499) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-imm.ll (+431) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-iu-modifiers.ll (+309) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-swmmac-index_key.ll (+321) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32.ll (+370) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll 
(+456) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-imm.ll (+373) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-iu-modifiers.ll (+274) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-swmmac-index_key.ll (+472) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64.ll (+333) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w32.mir (+354) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w64.mir (+355) 
- (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w32.s (+1529) 
- (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w64.s (+1529) 
- (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w32.txt (+1628) 
- (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w64.txt (+1628) 
- (modified) mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td (+9-8) 
- (modified) mlir/test/Target/LLVMIR/rocdl.mlir (+12-12) 


``diff
diff --git a/clang/includ

[llvm-branch-commits] [llvm] [mlir] [clang] PR for llvm/llvm-project#79293 (PR #79461)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-clang

Author: None (github-actions[bot])


Changes

resolves llvm/llvm-project#79293

---

Patch is 1006.55 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/79461.diff


65 Files Affected:

- (modified) clang/include/clang/Basic/BuiltinsAMDGPU.def (+62) 
- (modified) clang/lib/CodeGen/CGBuiltin.cpp (+164-13) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl (+156) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w64.cl (+155) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w32.cl (+135) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w64.cl (+134) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32-gfx10-err.cl 
(+1-1) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32.cl (+8-9) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w64.cl (+8-10) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w32.cl (+107) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w64.cl (+104) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w32.cl (+110) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w64.cl (+109) 
- (modified) llvm/include/llvm/IR/IntrinsicsAMDGPU.td (+90-25) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUGISel.td (+24) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp (+330) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h (+10) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+213) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h (+13) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+23) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (+16) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td (+16) 
- (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+105-6) 
- (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+8) 
- (modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (+16-3) 
- (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+37) 
- (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h (+4) 
- (modified) llvm/lib/Target/AMDGPU/SIDefines.h (+3) 
- (modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+1) 
- (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+30) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrFormats.td (+5) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+8) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+11) 
- (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.td (+5) 
- (modified) llvm/lib/Target/AMDGPU/VOP3PInstructions.td (+494-6) 
- (modified) llvm/lib/Target/AMDGPU/VOPInstructions.td (+3) 
- (modified) llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll 
(+117-18) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll 
(+504) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-imm.ll (+519) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-iu-modifiers.ll 
(+309) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-swmmac-index_key.ll (+321) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32.ll (+370) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll 
(+459) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-imm.ll (+430) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-iu-modifiers.ll 
(+274) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-swmmac-index_key.ll (+472) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64.ll (+333) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll 
(+499) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-imm.ll (+431) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-iu-modifiers.ll (+309) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-swmmac-index_key.ll (+321) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32.ll (+370) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll 
(+456) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-imm.ll (+373) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-iu-modifiers.ll (+274) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-swmmac-index_key.ll (+472) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64.ll (+333) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w32.mir (+354) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w64.mir (+355) 
- (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w32.s (+1529) 
- (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w64.s (+1529) 
- (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w32.txt (+1628) 
- (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w64.txt (+1628) 
- (modified) mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td (+9-8) 
- (modified) mlir/test/Target/LLVMIR/rocdl.mlir (+12-12) 



[llvm-branch-commits] [llvm] [mlir] [clang] PR for llvm/llvm-project#79293 (PR #79461)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mc

Author: None (github-actions[bot])


Changes

resolves llvm/llvm-project#79293

---

Patch is 1006.55 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/79461.diff


65 Files Affected:

- (modified) clang/include/clang/Basic/BuiltinsAMDGPU.def (+62) 
- (modified) clang/lib/CodeGen/CGBuiltin.cpp (+164-13) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl (+156) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w64.cl (+155) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w32.cl (+135) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w64.cl (+134) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32-gfx10-err.cl 
(+1-1) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32.cl (+8-9) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w64.cl (+8-10) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w32.cl (+107) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w64.cl (+104) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w32.cl (+110) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w64.cl (+109) 
- (modified) llvm/include/llvm/IR/IntrinsicsAMDGPU.td (+90-25) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUGISel.td (+24) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp (+330) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h (+10) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+213) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h (+13) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+23) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (+16) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td (+16) 
- (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+105-6) 
- (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+8) 
- (modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (+16-3) 
- (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+37) 
- (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h (+4) 
- (modified) llvm/lib/Target/AMDGPU/SIDefines.h (+3) 
- (modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+1) 
- (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+30) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrFormats.td (+5) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+8) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+11) 
- (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.td (+5) 
- (modified) llvm/lib/Target/AMDGPU/VOP3PInstructions.td (+494-6) 
- (modified) llvm/lib/Target/AMDGPU/VOPInstructions.td (+3) 
- (modified) llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll 
(+117-18) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll 
(+504) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-imm.ll (+519) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-iu-modifiers.ll 
(+309) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-swmmac-index_key.ll (+321) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32.ll (+370) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll 
(+459) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-imm.ll (+430) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-iu-modifiers.ll 
(+274) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-swmmac-index_key.ll (+472) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64.ll (+333) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll 
(+499) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-imm.ll (+431) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-iu-modifiers.ll (+309) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-swmmac-index_key.ll (+321) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32.ll (+370) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll 
(+456) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-imm.ll (+373) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-iu-modifiers.ll (+274) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-swmmac-index_key.ll (+472) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64.ll (+333) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w32.mir (+354) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w64.mir (+355) 
- (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w32.s (+1529) 
- (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w64.s (+1529) 
- (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w32.txt (+1628) 
- (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w64.txt (+1628) 
- (modified) mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td (+9-8) 
- (modified) mlir/test/Target/LLVMIR/rocdl.mlir (+12-12) 


``diff
diff --git a/clang/include/clang

[llvm-branch-commits] [mlir] [llvm] [clang] PR for llvm/llvm-project#79293 (PR #79461)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-globalisel

Author: None (github-actions[bot])


Changes

resolves llvm/llvm-project#79293

---

Patch is 1006.42 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/79461.diff


65 Files Affected:

- (modified) clang/include/clang/Basic/BuiltinsAMDGPU.def (+62) 
- (modified) clang/lib/CodeGen/CGBuiltin.cpp (+164-13) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl (+156) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w64.cl (+155) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w32.cl (+135) 
- (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w64.cl (+134) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32-gfx10-err.cl 
(+1-1) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32.cl (+8-9) 
- (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w64.cl (+8-10) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w32.cl (+107) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w64.cl (+104) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w32.cl (+110) 
- (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w64.cl (+109) 
- (modified) llvm/include/llvm/IR/IntrinsicsAMDGPU.td (+90-25) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUGISel.td (+24) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp (+330) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h (+10) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+213) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h (+13) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+23) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (+16) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td (+16) 
- (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+105-6) 
- (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+8) 
- (modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (+16-3) 
- (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+37) 
- (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h (+4) 
- (modified) llvm/lib/Target/AMDGPU/SIDefines.h (+3) 
- (modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+1) 
- (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+30) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrFormats.td (+5) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+8) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+11) 
- (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.td (+5) 
- (modified) llvm/lib/Target/AMDGPU/VOP3PInstructions.td (+494-6) 
- (modified) llvm/lib/Target/AMDGPU/VOPInstructions.td (+3) 
- (modified) llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll 
(+117-18) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll 
(+504) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-imm.ll (+519) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-iu-modifiers.ll 
(+309) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-swmmac-index_key.ll (+321) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32.ll (+370) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll 
(+459) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-imm.ll (+430) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-iu-modifiers.ll 
(+274) 
- (added) 
llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-swmmac-index_key.ll (+472) 
- (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64.ll (+333) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll 
(+499) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-imm.ll (+431) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-iu-modifiers.ll (+309) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-swmmac-index_key.ll (+321) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32.ll (+370) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll 
(+456) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-imm.ll (+373) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-iu-modifiers.ll (+274) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-swmmac-index_key.ll (+472) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64.ll (+333) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w32.mir (+354) 
- (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w64.mir (+355) 
- (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w32.s (+1529) 
- (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w64.s (+1529) 
- (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w32.txt (+1628) 
- (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w64.txt (+1628) 
- (modified) mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td (+9-8) 
- (modified) mlir/test/Target/LLVMIR/rocdl.mlir (+12-12) 


``diff
diff --git a/clang/

[llvm-branch-commits] [clang] PR for llvm/llvm-project#79564 (PR #79566)

2024-01-26 Thread David Sherwood via llvm-branch-commits

https://github.com/david-arm approved this pull request.

LGTM!

https://github.com/llvm/llvm-project/pull/79566
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79564 (PR #79566)

2024-01-26 Thread via llvm-branch-commits

github-actions[bot] wrote:

@david-arm What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/79566
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79564 (PR #79566)

2024-01-26 Thread via llvm-branch-commits

https://github.com/github-actions[bot] milestoned 
https://github.com/llvm/llvm-project/pull/79566
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79564 (PR #79566)

2024-01-26 Thread via llvm-branch-commits

https://github.com/github-actions[bot] created 
https://github.com/llvm/llvm-project/pull/79566

resolves llvm/llvm-project#79564

>From 27109f3d576c946cde0162ee29f251d2ab2d0ed2 Mon Sep 17 00:00:00 2001
From: Paschalis Mpeis 
Date: Thu, 25 Jan 2024 09:29:46 +
Subject: [PATCH] [LTO] Fix Veclib flags correctly pass to LTO flags (#78749)

Flags `-fveclib=name` were not passed to LTO flags.
This pass fixes that by converting the `-fveclib` flags to their
relevant names for opt's `-vector-lib=name` flags.

For example:
`-fveclib=SLEEF` would become `-vector-library=sleefgnuabi` and passed
through the `-plugin-opt` flag.

(cherry picked from commit 03cf0e9354e7e56ff794e9efb682ed2971bc91ec)
---
 clang/lib/Driver/ToolChains/CommonArgs.cpp | 22 ++
 clang/test/Driver/fveclib.c| 18 ++
 2 files changed, 40 insertions(+)

diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp 
b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index fadaf3e60c6616a..9f1dddc47e3e053 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -783,6 +783,28 @@ void tools::addLTOOptions(const ToolChain &ToolChain, 
const ArgList &Args,
  "-generate-arange-section"));
   }
 
+  // Pass vector library arguments to LTO.
+  Arg *ArgVecLib = Args.getLastArg(options::OPT_fveclib);
+  if (ArgVecLib && ArgVecLib->getNumValues() == 1) {
+// Map the vector library names from clang front-end to opt front-end. The
+// values are taken from the TargetLibraryInfo class command line options.
+std::optional OptVal =
+llvm::StringSwitch>(ArgVecLib->getValue())
+.Case("Accelerate", "Accelerate")
+.Case("LIBMVEC", "LIBMVEC-X86")
+.Case("MASSV", "MASSV")
+.Case("SVML", "SVML")
+.Case("SLEEF", "sleefgnuabi")
+.Case("Darwin_libsystem_m", "Darwin_libsystem_m")
+.Case("ArmPL", "ArmPL")
+.Case("none", "none")
+.Default(std::nullopt);
+
+if (OptVal)
+  CmdArgs.push_back(Args.MakeArgString(
+  Twine(PluginOptPrefix) + "-vector-library=" + OptVal.value()));
+  }
+
   // Try to pass driver level flags relevant to LTO code generation down to
   // the plugin.
 
diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c
index e2a7619e9b89f7f..8a230284bcdfe4f 100644
--- a/clang/test/Driver/fveclib.c
+++ b/clang/test/Driver/fveclib.c
@@ -31,3 +31,21 @@
 
 // RUN: %clang -fveclib=Accelerate %s -nodefaultlibs -target 
arm64-apple-ios8.0.0 -### 2>&1 | FileCheck 
--check-prefix=CHECK-LINK-NODEFAULTLIBS %s
 // CHECK-LINK-NODEFAULTLIBS-NOT: "-framework" "Accelerate"
+
+
+/* Verify that the correct vector library is passed to LTO flags. */
+
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto 
%s 2>&1 | FileCheck -check-prefix CHECK-LTO-LIBMVEC %s
+// CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86"
+
+// RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto 
%s 2>&1 | FileCheck -check-prefix CHECK-LTO-MASSV %s
+// CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV"
+
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 
2>&1 | FileCheck -check-prefix CHECK-LTO-SVML %s
+// CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML"
+
+// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | 
FileCheck -check-prefix CHECK-LTO-SLEEF %s
+// CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi"
+
+// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | 
FileCheck -check-prefix CHECK-LTO-ARMPL %s
+// CHECK-LTO-ARMPL: "-plugin-opt=-vector-library=ArmPL"

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79425 (PR #79560)

2024-01-26 Thread Wang Pengcheng via llvm-branch-commits

https://github.com/wangpc-pp approved this pull request.

LGTM.
(Is this the right approach in current workflow?)

https://github.com/llvm/llvm-project/pull/79560
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79137 (PR #79561)

2024-01-26 Thread via llvm-branch-commits

github-actions[bot] wrote:

@fhahn What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/79561
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79137 (PR #79561)

2024-01-26 Thread via llvm-branch-commits

https://github.com/github-actions[bot] milestoned 
https://github.com/llvm/llvm-project/pull/79561
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79137 (PR #79561)

2024-01-26 Thread via llvm-branch-commits

https://github.com/github-actions[bot] created 
https://github.com/llvm/llvm-project/pull/79561

resolves llvm/llvm-project#79137

>From e2490885d05f95af41fda6b9f3adb20826e80483 Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Wed, 24 Jan 2024 10:45:20 +0100
Subject: [PATCH 1/2] [PhaseOrdering] Add additional test for #79161 (NFC)

(cherry picked from commit 543cf08636f3a3bb55dddba2e8cad787601647ba)
---
 .../X86/loop-vectorizer-noalias.ll| 147 ++
 1 file changed, 147 insertions(+)
 create mode 100644 
llvm/test/Transforms/PhaseOrdering/X86/loop-vectorizer-noalias.ll

diff --git a/llvm/test/Transforms/PhaseOrdering/X86/loop-vectorizer-noalias.ll 
b/llvm/test/Transforms/PhaseOrdering/X86/loop-vectorizer-noalias.ll
new file mode 100644
index 00..846787f721ba7e
--- /dev/null
+++ b/llvm/test/Transforms/PhaseOrdering/X86/loop-vectorizer-noalias.ll
@@ -0,0 +1,147 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 4
+; RUN: opt -S -O3 -mtriple=x86_64-unknown-linux-gnu < %s | FileCheck %s
+
+define internal void @acc(ptr noalias noundef %val, ptr noalias noundef %prev) 
{
+entry:
+  %0 = load i8, ptr %prev, align 1
+  %conv = zext i8 %0 to i32
+  %1 = load i8, ptr %val, align 1
+  %conv1 = zext i8 %1 to i32
+  %add = add nsw i32 %conv1, %conv
+  %conv2 = trunc i32 %add to i8
+  store i8 %conv2, ptr %val, align 1
+  ret void
+}
+
+; This loop should not get vectorized.
+; FIXME: This is a miscompile.
+define void @accsum(ptr noundef %vals, i64 noundef %num) #0 {
+; CHECK-LABEL: define void @accsum(
+; CHECK-SAME: ptr nocapture noundef [[VALS:%.*]], i64 noundef [[NUM:%.*]]) 
local_unnamed_addr #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[CMP1:%.*]] = icmp ugt i64 [[NUM]], 1
+; CHECK-NEXT:br i1 [[CMP1]], label [[ITER_CHECK:%.*]], label 
[[FOR_END:%.*]]
+; CHECK:   iter.check:
+; CHECK-NEXT:[[TMP0:%.*]] = add i64 [[NUM]], -1
+; CHECK-NEXT:[[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[NUM]], 9
+; CHECK-NEXT:br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY_PREHEADER:%.*]], 
label [[VECTOR_MAIN_LOOP_ITER_CHECK:%.*]]
+; CHECK:   vector.main.loop.iter.check:
+; CHECK-NEXT:[[MIN_ITERS_CHECK3:%.*]] = icmp ult i64 [[NUM]], 33
+; CHECK-NEXT:br i1 [[MIN_ITERS_CHECK3]], label [[VEC_EPILOG_PH:%.*]], 
label [[VECTOR_PH:%.*]]
+; CHECK:   vector.ph:
+; CHECK-NEXT:[[N_VEC:%.*]] = and i64 [[TMP0]], -32
+; CHECK-NEXT:br label [[VECTOR_BODY:%.*]]
+; CHECK:   vector.body:
+; CHECK-NEXT:[[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ 
[[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
+; CHECK-NEXT:[[OFFSET_IDX:%.*]] = or disjoint i64 [[INDEX]], 1
+; CHECK-NEXT:[[TMP1:%.*]] = getelementptr inbounds i8, ptr [[VALS]], i64 
[[OFFSET_IDX]]
+; CHECK-NEXT:[[TMP2:%.*]] = getelementptr i8, ptr [[TMP1]], i64 -1
+; CHECK-NEXT:tail call void @llvm.experimental.noalias.scope.decl(metadata 
[[META0:![0-9]+]])
+; CHECK-NEXT:tail call void @llvm.experimental.noalias.scope.decl(metadata 
[[META3:![0-9]+]])
+; CHECK-NEXT:[[TMP3:%.*]] = getelementptr i8, ptr [[TMP1]], i64 15
+; CHECK-NEXT:[[WIDE_LOAD:%.*]] = load <16 x i8>, ptr [[TMP2]], align 1, 
!alias.scope [[META3]], !noalias [[META0]]
+; CHECK-NEXT:[[WIDE_LOAD4:%.*]] = load <16 x i8>, ptr [[TMP3]], align 1, 
!alias.scope [[META3]], !noalias [[META0]]
+; CHECK-NEXT:[[TMP4:%.*]] = getelementptr inbounds i8, ptr [[TMP1]], i64 16
+; CHECK-NEXT:[[WIDE_LOAD5:%.*]] = load <16 x i8>, ptr [[TMP1]], align 1, 
!alias.scope [[META0]], !noalias [[META3]]
+; CHECK-NEXT:[[WIDE_LOAD6:%.*]] = load <16 x i8>, ptr [[TMP4]], align 1, 
!alias.scope [[META0]], !noalias [[META3]]
+; CHECK-NEXT:[[TMP5:%.*]] = add <16 x i8> [[WIDE_LOAD5]], [[WIDE_LOAD]]
+; CHECK-NEXT:[[TMP6:%.*]] = add <16 x i8> [[WIDE_LOAD6]], [[WIDE_LOAD4]]
+; CHECK-NEXT:store <16 x i8> [[TMP5]], ptr [[TMP1]], align 1, !alias.scope 
[[META0]], !noalias [[META3]]
+; CHECK-NEXT:store <16 x i8> [[TMP6]], ptr [[TMP4]], align 1, !alias.scope 
[[META0]], !noalias [[META3]]
+; CHECK-NEXT:[[INDEX_NEXT]] = add nuw i64 [[INDEX]], 32
+; CHECK-NEXT:[[TMP7:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
+; CHECK-NEXT:br i1 [[TMP7]], label [[MIDDLE_BLOCK:%.*]], label 
[[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]
+; CHECK:   middle.block:
+; CHECK-NEXT:[[CMP_N:%.*]] = icmp eq i64 [[TMP0]], [[N_VEC]]
+; CHECK-NEXT:br i1 [[CMP_N]], label [[FOR_END]], label 
[[VEC_EPILOG_ITER_CHECK:%.*]]
+; CHECK:   vec.epilog.iter.check:
+; CHECK-NEXT:[[IND_END9:%.*]] = or disjoint i64 [[N_VEC]], 1
+; CHECK-NEXT:[[N_VEC_REMAINING:%.*]] = and i64 [[TMP0]], 24
+; CHECK-NEXT:[[MIN_EPILOG_ITERS_CHECK:%.*]] = icmp eq i64 
[[N_VEC_REMAINING]], 0
+; CHECK-NEXT:br i1 [[MIN_EPILOG_ITERS_CHECK]], label 
[[FOR_BODY_PREHEADER]], label [[VEC_EPILOG_PH]]
+; CHECK:   vec.epilog.ph:
+; CHECK-NEXT:[[VEC_EPILOG_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], 
[[VEC_EPILOG_

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79425 (PR #79560)

2024-01-26 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-risc-v

Author: Nikita Popov (nikic)


Changes

Resolves #79425.

---

Patch is 24.34 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/79560.diff


12 Files Affected:

- (modified) llvm/include/llvm/Target/TargetInstrPredicate.td (+34) 
- (modified) llvm/lib/Target/RISCV/CMakeLists.txt (+1-1) 
- (modified) llvm/lib/Target/RISCV/RISCV.td (+6) 
- (modified) llvm/lib/Target/RISCV/RISCVFeatures.td (-24) 
- (removed) llvm/lib/Target/RISCV/RISCVMacroFusion.cpp (-210) 
- (removed) llvm/lib/Target/RISCV/RISCVMacroFusion.h (-28) 
- (added) llvm/lib/Target/RISCV/RISCVMacroFusion.td (+93) 
- (modified) llvm/lib/Target/RISCV/RISCVSubtarget.cpp (+6-2) 
- (modified) llvm/lib/Target/RISCV/RISCVSubtarget.h (+3-5) 
- (modified) llvm/lib/Target/RISCV/RISCVTargetMachine.cpp (+8-5) 
- (modified) llvm/utils/TableGen/PredicateExpander.cpp (+34) 
- (modified) llvm/utils/TableGen/PredicateExpander.h (+4) 


``diff
diff --git a/llvm/include/llvm/Target/TargetInstrPredicate.td 
b/llvm/include/llvm/Target/TargetInstrPredicate.td
index 82c4c7b23a49b6a..b5419cb9f3867f0 100644
--- a/llvm/include/llvm/Target/TargetInstrPredicate.td
+++ b/llvm/include/llvm/Target/TargetInstrPredicate.td
@@ -152,6 +152,34 @@ class CheckImmOperand_s : 
CheckOperandBase {
   string ImmVal = Value;
 }
 
+// Check that the operand at position `Index` is less than `Imm`.
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against `Imm`.
+class CheckImmOperandLT : CheckOperandBase {
+  int ImmVal = Imm;
+}
+
+// Check that the operand at position `Index` is greater than `Imm`.
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against `Imm`.
+class CheckImmOperandGT : CheckOperandBase {
+  int ImmVal = Imm;
+}
+
+// Check that the operand at position `Index` is less than or equal to `Imm`.
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against `Imm`.
+class CheckImmOperandLE : 
CheckNot>;
+
+// Check that the operand at position `Index` is greater than or equal to 
`Imm`.
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against `Imm`.
+class CheckImmOperandGE : 
CheckNot>;
+
 // Expands to a call to `FunctionMapper` if field `FunctionMapper` is set.
 // Otherwise, it expands to a CheckNot>.
 class CheckRegOperandSimple : CheckOperandBase;
@@ -203,6 +231,12 @@ class CheckAll Sequence>
 class CheckAny Sequence>
 : CheckPredicateSequence;
 
+// Check that the operand at position `Index` is in range [Start, End].
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against range [Start, End].
+class CheckImmOperandRange
+  : CheckAll<[CheckImmOperandGE, CheckImmOperandLE]>;
 
 // Used to expand the body of a function predicate. See the definition of
 // TIIPredicate below.
diff --git a/llvm/lib/Target/RISCV/CMakeLists.txt 
b/llvm/lib/Target/RISCV/CMakeLists.txt
index a0c3345ec1bbd7e..ac88cd49db4e4ba 100644
--- a/llvm/lib/Target/RISCV/CMakeLists.txt
+++ b/llvm/lib/Target/RISCV/CMakeLists.txt
@@ -5,6 +5,7 @@ set(LLVM_TARGET_DEFINITIONS RISCV.td)
 tablegen(LLVM RISCVGenAsmMatcher.inc -gen-asm-matcher)
 tablegen(LLVM RISCVGenAsmWriter.inc -gen-asm-writer)
 tablegen(LLVM RISCVGenCompressInstEmitter.inc -gen-compress-inst-emitter)
+tablegen(LLVM RISCVGenMacroFusion.inc -gen-macro-fusion-pred)
 tablegen(LLVM RISCVGenDAGISel.inc -gen-dag-isel)
 tablegen(LLVM RISCVGenDisassemblerTables.inc -gen-disassembler)
 tablegen(LLVM RISCVGenInstrInfo.inc -gen-instr-info)
@@ -43,7 +44,6 @@ add_llvm_target(RISCVCodeGen
   RISCVISelDAGToDAG.cpp
   RISCVISelLowering.cpp
   RISCVMachineFunctionInfo.cpp
-  RISCVMacroFusion.cpp
   RISCVMergeBaseOffset.cpp
   RISCVOptWInstrs.cpp
   RISCVPostRAExpandPseudoInsts.cpp
diff --git a/llvm/lib/Target/RISCV/RISCV.td b/llvm/lib/Target/RISCV/RISCV.td
index e6e879282241dde..27d52c16a4f39d9 100644
--- a/llvm/lib/Target/RISCV/RISCV.td
+++ b/llvm/lib/Target/RISCV/RISCV.td
@@ -30,6 +30,12 @@ include "RISCVCallingConv.td"
 include "RISCVInstrInfo.td"
 include "GISel/RISCVRegisterBanks.td"
 
+//===--===//
+// RISC-V macro fusions.
+//===--===//
+
+include "RISCVMacroFusion.td"
+
 
//===--===//
 // RISC-V Scheduling Models
 
//===--==

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79425 (PR #79560)

2024-01-26 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic created https://github.com/llvm/llvm-project/pull/79560

Resolves #79425.

>From 2c3d2c996c0b1fb929d903768612c47393528cd3 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Thu, 25 Jan 2024 15:17:31 +0800
Subject: [PATCH 1/2] [TableGen] Add predicates for immediates comparison
 (#76004)

These predicates can be used to represent `<`, `<=`, `>`, `>=`.

And a predicate for `in range` is added.

(cherry picked from commit 664a0faac464708fc061d12e5cd492fcbfea979a)
---
 .../llvm/Target/TargetInstrPredicate.td   | 34 +++
 llvm/utils/TableGen/PredicateExpander.cpp | 34 +++
 llvm/utils/TableGen/PredicateExpander.h   |  4 +++
 3 files changed, 72 insertions(+)

diff --git a/llvm/include/llvm/Target/TargetInstrPredicate.td 
b/llvm/include/llvm/Target/TargetInstrPredicate.td
index 82c4c7b23a49b6a..b5419cb9f3867f0 100644
--- a/llvm/include/llvm/Target/TargetInstrPredicate.td
+++ b/llvm/include/llvm/Target/TargetInstrPredicate.td
@@ -152,6 +152,34 @@ class CheckImmOperand_s : 
CheckOperandBase {
   string ImmVal = Value;
 }
 
+// Check that the operand at position `Index` is less than `Imm`.
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against `Imm`.
+class CheckImmOperandLT : CheckOperandBase {
+  int ImmVal = Imm;
+}
+
+// Check that the operand at position `Index` is greater than `Imm`.
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against `Imm`.
+class CheckImmOperandGT : CheckOperandBase {
+  int ImmVal = Imm;
+}
+
+// Check that the operand at position `Index` is less than or equal to `Imm`.
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against `Imm`.
+class CheckImmOperandLE : 
CheckNot>;
+
+// Check that the operand at position `Index` is greater than or equal to 
`Imm`.
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against `Imm`.
+class CheckImmOperandGE : 
CheckNot>;
+
 // Expands to a call to `FunctionMapper` if field `FunctionMapper` is set.
 // Otherwise, it expands to a CheckNot>.
 class CheckRegOperandSimple : CheckOperandBase;
@@ -203,6 +231,12 @@ class CheckAll Sequence>
 class CheckAny Sequence>
 : CheckPredicateSequence;
 
+// Check that the operand at position `Index` is in range [Start, End].
+// If field `FunctionMapper` is a non-empty string, then function
+// `FunctionMapper` is applied to the operand value, and the return value is 
then
+// compared against range [Start, End].
+class CheckImmOperandRange
+  : CheckAll<[CheckImmOperandGE, CheckImmOperandLE]>;
 
 // Used to expand the body of a function predicate. See the definition of
 // TIIPredicate below.
diff --git a/llvm/utils/TableGen/PredicateExpander.cpp 
b/llvm/utils/TableGen/PredicateExpander.cpp
index d3a73e02cd916f8..0b9b6389fe38171 100644
--- a/llvm/utils/TableGen/PredicateExpander.cpp
+++ b/llvm/utils/TableGen/PredicateExpander.cpp
@@ -59,6 +59,30 @@ void 
PredicateExpander::expandCheckImmOperandSimple(raw_ostream &OS,
 OS << ")";
 }
 
+void PredicateExpander::expandCheckImmOperandLT(raw_ostream &OS, int OpIndex,
+int ImmVal,
+StringRef FunctionMapper) {
+  if (!FunctionMapper.empty())
+OS << FunctionMapper << "(";
+  OS << "MI" << (isByRef() ? "." : "->") << "getOperand(" << OpIndex
+ << ").getImm()";
+  if (!FunctionMapper.empty())
+OS << ")";
+  OS << (shouldNegate() ? " >= " : " < ") << ImmVal;
+}
+
+void PredicateExpander::expandCheckImmOperandGT(raw_ostream &OS, int OpIndex,
+int ImmVal,
+StringRef FunctionMapper) {
+  if (!FunctionMapper.empty())
+OS << FunctionMapper << "(";
+  OS << "MI" << (isByRef() ? "." : "->") << "getOperand(" << OpIndex
+ << ").getImm()";
+  if (!FunctionMapper.empty())
+OS << ")";
+  OS << (shouldNegate() ? " <= " : " > ") << ImmVal;
+}
+
 void PredicateExpander::expandCheckRegOperand(raw_ostream &OS, int OpIndex,
   const Record *Reg,
   StringRef FunctionMapper) {
@@ -352,6 +376,16 @@ void PredicateExpander::expandPredicate(raw_ostream &OS, 
const Record *Rec) {
  Rec->getValueAsString("ImmVal"),
  Rec->getValueAsString("FunctionMapper"));
 
+  if (Rec->isSubClassOf("CheckImmOperandLT"))
+return expandCheckImmOperandLT(OS, Rec->getValueAsInt("OpIndex"),
+   R