[llvm-branch-commits] [llvm] [BPF] expand cttz, ctlz for i32, i64 (PR #73668)
inclyc wrote: @eddyz87 Could you please take a look? This has been stalled for a while :) https://github.com/llvm/llvm-project/pull/73668 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] PR for llvm/llvm-project#79507 (PR #79509)
tstellar wrote: Merged: e2521eaa1aac43b3216378d529096f08ac98cf14 3d02473ac538f542fb76c4aff0fb6504398c3f15 e9d99e51834e2bf0b39c23a60f2dba5539edd17b https://github.com/llvm/llvm-project/pull/79509 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] PR for llvm/llvm-project#79507 (PR #79509)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/79509 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] e2521ea - [ELF] Implement R_RISCV_TLSDESC for RISC-V
Author: Fangrui Song Date: 2024-01-26T21:34:49-08:00 New Revision: e2521eaa1aac43b3216378d529096f08ac98cf14 URL: https://github.com/llvm/llvm-project/commit/e2521eaa1aac43b3216378d529096f08ac98cf14 DIFF: https://github.com/llvm/llvm-project/commit/e2521eaa1aac43b3216378d529096f08ac98cf14.diff LOG: [ELF] Implement R_RISCV_TLSDESC for RISC-V Support R_RISCV_TLSDESC_HI20/R_RISCV_TLSDESC_LOAD_LO12/R_RISCV_TLSDESC_ADD_LO12/R_RISCV_TLSDESC_CALL. LOAD_LO12/ADD_LO12/CALL relocations reference a label at the HI20 location, which requires special handling. We save the value of HI20 to be reused. Two interleaved TLSDESC code sequences, which compilers do not generate, are unsupported. For -no-pie/-pie links, TLSDESC to initial-exec or local-exec optimizations are eligible. Implement the relevant hooks (R_RELAX_TLS_GD_TO_LE, R_RELAX_TLS_GD_TO_IE): the first two instructions are converted to NOP while the latter two are converted to a GOT load or a lui+addi. The first two instructions, which would be converted to NOP, are removed instead in the presence of relaxation. Relaxation is eligible as long as the R_RISCV_TLSDESC_HI20 relocation has a pairing R_RISCV_RELAX, regardless of whether the following instructions have a R_RISCV_RELAX. In addition, for the TLSDESC to LE optimization (`lui a0,; addi a0,a0,`), `lui` can be removed (i.e. use the short form) if hi20 is 0. ``` // TLSDESC to LE/IE optimization .Ltlsdesc_hi2: auipc a4, %tlsdesc_hi(c) # if relax: remove; otherwise, NOP load a5, %tlsdesc_load_lo(.Ltlsdesc_hi2)(a4) # if relax: remove; otherwise, NOP addi a0, a4, %tlsdesc_add_lo(.Ltlsdesc_hi2) # if LE && !hi20 {if relax: remove; otherwise, NOP} jalr t0, 0(a5), %tlsdesc_call(.Ltlsdesc_hi2) add a0, a0, tp ``` The implementation carefully ensures that an instruction unrelated to the current TLSDESC code sequence, if immediately follows a removable instruction (HI20 or LOAD_LO12 OR (LE-specific) ADD_LO12), is not converted to NOP. * `riscv64-tlsdesc.s` is inspired by `i386-tlsdesc-gd.s` (https://reviews.llvm.org/D112582). * `riscv64-tlsdesc-relax.s` tests linker relaxation. * `riscv-tlsdesc-gd-mixed.s` is inspired by `x86-64-tlsdesc-gd-mixed.s` (https://reviews.llvm.org/D116900). Link: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373 Reviewed By: ilovepi Pull Request: https://github.com/llvm/llvm-project/pull/79239 (cherry picked from commit 1117fdd7c16873eb389e988c6a39ad922bae0fd0) Added: lld/test/ELF/riscv-tlsdesc-gd-mixed.s lld/test/ELF/riscv-tlsdesc-relax.s lld/test/ELF/riscv-tlsdesc.s Modified: lld/ELF/Arch/RISCV.cpp lld/ELF/Relocations.cpp Removed: diff --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp index a92f7bf590c4b4..8ce92b4badfbd7 100644 --- a/lld/ELF/Arch/RISCV.cpp +++ b/lld/ELF/Arch/RISCV.cpp @@ -61,6 +61,7 @@ enum Op { AUIPC = 0x17, JALR = 0x67, LD = 0x3003, + LUI = 0x37, LW = 0x2003, SRLI = 0x5013, SUB = 0x4033, @@ -73,6 +74,7 @@ enum Reg { X_T0 = 5, X_T1 = 6, X_T2 = 7, + X_A0 = 10, X_T3 = 28, }; @@ -139,6 +141,7 @@ RISCV::RISCV() { tlsGotRel = R_RISCV_TLS_TPREL32; } gotRel = symbolicRel; + tlsDescRel = R_RISCV_TLSDESC; // .got[0] = _DYNAMIC gotHeaderEntriesNum = 1; @@ -207,6 +210,8 @@ int64_t RISCV::getImplicitAddend(const uint8_t *buf, RelType type) const { case R_RISCV_JUMP_SLOT: // These relocations are defined as not having an implicit addend. return 0; + case R_RISCV_TLSDESC: +return config->is64 ? read64le(buf + 8) : read32le(buf + 4); } } @@ -315,6 +320,12 @@ RelExpr RISCV::getRelExpr(const RelType type, const Symbol &s, case R_RISCV_PCREL_LO12_I: case R_RISCV_PCREL_LO12_S: return R_RISCV_PC_INDIRECT; + case R_RISCV_TLSDESC_HI20: + case R_RISCV_TLSDESC_LOAD_LO12: + case R_RISCV_TLSDESC_ADD_LO12: +return R_TLSDESC_PC; + case R_RISCV_TLSDESC_CALL: +return R_TLSDESC_CALL; case R_RISCV_TLS_GD_HI20: return R_TLSGD_PC; case R_RISCV_TLS_GOT_HI20: @@ -439,6 +450,7 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const { case R_RISCV_GOT_HI20: case R_RISCV_PCREL_HI20: + case R_RISCV_TLSDESC_HI20: case R_RISCV_TLS_GD_HI20: case R_RISCV_TLS_GOT_HI20: case R_RISCV_TPREL_HI20: @@ -450,6 +462,8 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const { } case R_RISCV_PCREL_LO12_I: + case R_RISCV_TLSDESC_LOAD_LO12: + case R_RISCV_TLSDESC_ADD_LO12: case R_RISCV_TPREL_LO12_I: case R_RISCV_LO12_I: { uint64_t hi = (val + 0x800) >> 12; @@ -533,8 +547,14 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const { break; case R_RISCV_RELAX: -return; // Ignored (for now) - +return; + case R_RISCV_TLSDESC: +// The addend is stored in the second word. +if (config->is64) +
[llvm-branch-commits] [lld] 3d02473 - [ELF] Fix terminology: TLS optimizations instead of TLS relaxation. NFC
Author: Fangrui Song Date: 2024-01-26T21:34:49-08:00 New Revision: 3d02473ac538f542fb76c4aff0fb6504398c3f15 URL: https://github.com/llvm/llvm-project/commit/3d02473ac538f542fb76c4aff0fb6504398c3f15 DIFF: https://github.com/llvm/llvm-project/commit/3d02473ac538f542fb76c4aff0fb6504398c3f15.diff LOG: [ELF] Fix terminology: TLS optimizations instead of TLS relaxation. NFC (cherry picked from commit 849951f8759171cb6c74d3ccbcf154506fc1f0ae) Added: Modified: lld/ELF/Relocations.cpp Removed: diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp index b6a317bc3b6d697..3490a701d7189f5 100644 --- a/lld/ELF/Relocations.cpp +++ b/lld/ELF/Relocations.cpp @@ -1286,17 +1286,16 @@ static unsigned handleTlsRelocation(RelType type, Symbol &sym, } // ARM, Hexagon, LoongArch and RISC-V do not support GD/LD to IE/LE - // relaxation. + // optimizations. // For PPC64, if the file has missing R_PPC64_TLSGD/R_PPC64_TLSLD, disable - // relaxation as well. - bool toExecRelax = !config->shared && config->emachine != EM_ARM && - config->emachine != EM_HEXAGON && - config->emachine != EM_LOONGARCH && - config->emachine != EM_RISCV && - !c.file->ppc64DisableTLSRelax; + // optimization as well. + bool execOptimize = + !config->shared && config->emachine != EM_ARM && + config->emachine != EM_HEXAGON && config->emachine != EM_LOONGARCH && + config->emachine != EM_RISCV && !c.file->ppc64DisableTLSRelax; // If we are producing an executable and the symbol is non-preemptable, it - // must be defined and the code sequence can be relaxed to use Local-Exec. + // must be defined and the code sequence can be optimized to use Local-Exec. // // ARM and RISC-V do not support any relaxations for TLS relocations, however, // we can omit the DTPMOD dynamic relocations and resolve them at link time @@ -1309,8 +1308,8 @@ static unsigned handleTlsRelocation(RelType type, Symbol &sym, // module index, with a special value of 0 for the current module. GOT[e1] is // unused. There only needs to be one module index entry. if (oneof(expr)) { -// Local-Dynamic relocs can be relaxed to Local-Exec. -if (toExecRelax) { +// Local-Dynamic relocs can be optimized to Local-Exec. +if (execOptimize) { c.addReloc({target->adjustTlsExpr(type, R_RELAX_TLS_LD_TO_LE), type, offset, addend, &sym}); return target->getTlsGdRelaxSkip(type); @@ -1322,16 +1321,17 @@ static unsigned handleTlsRelocation(RelType type, Symbol &sym, return 1; } - // Local-Dynamic relocs can be relaxed to Local-Exec. + // Local-Dynamic relocs can be optimized to Local-Exec. if (expr == R_DTPREL) { -if (toExecRelax) +if (execOptimize) expr = target->adjustTlsExpr(type, R_RELAX_TLS_LD_TO_LE); c.addReloc({expr, type, offset, addend, &sym}); return 1; } // Local-Dynamic sequence where offset of tls variable relative to dynamic - // thread pointer is stored in the got. This cannot be relaxed to Local-Exec. + // thread pointer is stored in the got. This cannot be optimized to + // Local-Exec. if (expr == R_TLSLD_GOT_OFF) { sym.setFlags(NEEDS_GOT_DTPREL); c.addReloc({expr, type, offset, addend, &sym}); @@ -1341,13 +1341,13 @@ static unsigned handleTlsRelocation(RelType type, Symbol &sym, if (oneof(expr)) { -if (!toExecRelax) { +if (!execOptimize) { sym.setFlags(NEEDS_TLSGD); c.addReloc({expr, type, offset, addend, &sym}); return 1; } -// Global-Dynamic relocs can be relaxed to Initial-Exec or Local-Exec +// Global-Dynamic/TLSDESC can be optimized to Initial-Exec or Local-Exec // depending on the symbol being locally defined or not. if (sym.isPreemptible) { sym.setFlags(NEEDS_TLSGD_TO_IE); @@ -1363,9 +1363,9 @@ static unsigned handleTlsRelocation(RelType type, Symbol &sym, if (oneof(expr)) { ctx.hasTlsIe.store(true, std::memory_order_relaxed); -// Initial-Exec relocs can be relaxed to Local-Exec if the symbol is locally -// defined. -if (toExecRelax && isLocalInExecutable) { +// Initial-Exec relocs can be optimized to Local-Exec if the symbol is +// locally defined. +if (execOptimize && isLocalInExecutable) { c.addReloc({R_RELAX_TLS_IE_TO_LE, type, offset, addend, &sym}); } else if (expr != R_TLSIE_HINT) { sym.setFlags(NEEDS_TLSIE); @@ -1463,7 +1463,7 @@ template void RelocationScanner::scanOne(RelTy *&i) { in.got->hasGotOffRel.store(true, std::memory_order_relaxed); } - // Process TLS relocations, including relaxing TLS relocations. Note that + // Process TLS relocations, including TLS optimizations. Note that // R_TPREL and R_TPREL_NEG relocations are resolved in processAux. if (sym.isTls()) { if (unsigne
[llvm-branch-commits] [lld] e9d99e5 - [ELF] Clean up R_RISCV_RELAX code. NFC
Author: Fangrui Song Date: 2024-01-26T21:34:48-08:00 New Revision: e9d99e51834e2bf0b39c23a60f2dba5539edd17b URL: https://github.com/llvm/llvm-project/commit/e9d99e51834e2bf0b39c23a60f2dba5539edd17b DIFF: https://github.com/llvm/llvm-project/commit/e9d99e51834e2bf0b39c23a60f2dba5539edd17b.diff LOG: [ELF] Clean up R_RISCV_RELAX code. NFC (cherry picked from commit ccb99f221422b8de5e1ae04d3427f15878f7cd93) Added: Modified: lld/ELF/Arch/RISCV.cpp Removed: diff --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp index d7d3d3e4781497..a92f7bf590c4b4 100644 --- a/lld/ELF/Arch/RISCV.cpp +++ b/lld/ELF/Arch/RISCV.cpp @@ -102,6 +102,26 @@ static uint32_t setLO12_S(uint32_t insn, uint32_t imm) { (extractBits(imm, 4, 0) << 7); } +namespace { +struct SymbolAnchor { + uint64_t offset; + Defined *d; + bool end; // true for the anchor of st_value+st_size +}; +} // namespace + +struct elf::RISCVRelaxAux { + // This records symbol start and end offsets which will be adjusted according + // to the nearest relocDeltas element. + SmallVector anchors; + // For relocations[i], the actual offset is + // r_offset - (i ? relocDeltas[i-1] : 0). + std::unique_ptr relocDeltas; + // For relocations[i], the actual type is relocTypes[i]. + std::unique_ptr relocTypes; + SmallVector writes; +}; + RISCV::RISCV() { copyRel = R_RISCV_COPY; pltRel = R_RISCV_JUMP_SLOT; @@ -520,14 +540,19 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const { } } +static bool relaxable(ArrayRef relocs, size_t i) { + return i + 1 != relocs.size() && relocs[i + 1].type == R_RISCV_RELAX; +} + void RISCV::relocateAlloc(InputSectionBase &sec, uint8_t *buf) const { uint64_t secAddr = sec.getOutputSection()->addr; if (auto *s = dyn_cast(&sec)) secAddr += s->outSecOff; else if (auto *ehIn = dyn_cast(&sec)) secAddr += ehIn->getParent()->outSecOff; - for (size_t i = 0, size = sec.relocs().size(); i != size; ++i) { -const Relocation &rel = sec.relocs()[i]; + const ArrayRef relocs = sec.relocs(); + for (size_t i = 0, size = relocs.size(); i != size; ++i) { +const Relocation &rel = relocs[i]; uint8_t *loc = buf + rel.offset; const uint64_t val = sec.getRelocTargetVA(sec.file, rel.type, rel.addend, @@ -538,7 +563,7 @@ void RISCV::relocateAlloc(InputSectionBase &sec, uint8_t *buf) const { break; case R_RISCV_LEB128: if (i + 1 < size) { -const Relocation &rel1 = sec.relocs()[i + 1]; +const Relocation &rel1 = relocs[i + 1]; if (rel.type == R_RISCV_SET_ULEB128 && rel1.type == R_RISCV_SUB_ULEB128 && rel.offset == rel1.offset) { auto val = rel.sym->getVA(rel.addend) - rel1.sym->getVA(rel1.addend); @@ -560,26 +585,6 @@ void RISCV::relocateAlloc(InputSectionBase &sec, uint8_t *buf) const { } } -namespace { -struct SymbolAnchor { - uint64_t offset; - Defined *d; - bool end; // true for the anchor of st_value+st_size -}; -} // namespace - -struct elf::RISCVRelaxAux { - // This records symbol start and end offsets which will be adjusted according - // to the nearest relocDeltas element. - SmallVector anchors; - // For relocations[i], the actual offset is r_offset - (i ? relocDeltas[i-1] : - // 0). - std::unique_ptr relocDeltas; - // For relocations[i], the actual type is relocTypes[i]. - std::unique_ptr relocTypes; - SmallVector writes; -}; - static void initSymbolAnchors() { SmallVector storage; for (OutputSection *osec : outputSections) { @@ -715,14 +720,15 @@ static void relaxHi20Lo12(const InputSection &sec, size_t i, uint64_t loc, static bool relax(InputSection &sec) { const uint64_t secAddr = sec.getVA(); + const MutableArrayRef relocs = sec.relocs(); auto &aux = *sec.relaxAux; bool changed = false; ArrayRef sa = ArrayRef(aux.anchors); uint64_t delta = 0; - std::fill_n(aux.relocTypes.get(), sec.relocs().size(), R_RISCV_NONE); + std::fill_n(aux.relocTypes.get(), relocs.size(), R_RISCV_NONE); aux.writes.clear(); - for (auto [i, r] : llvm::enumerate(sec.relocs())) { + for (auto [i, r] : llvm::enumerate(relocs)) { const uint64_t loc = secAddr + r.offset - delta; uint32_t &cur = aux.relocDeltas[i], remove = 0; switch (r.type) { @@ -743,23 +749,20 @@ static bool relax(InputSection &sec) { } case R_RISCV_CALL: case R_RISCV_CALL_PLT: - if (i + 1 != sec.relocs().size() && - sec.relocs()[i + 1].type == R_RISCV_RELAX) + if (relaxable(relocs, i)) relaxCall(sec, i, loc, r, remove); break; case R_RISCV_TPREL_HI20: case R_RISCV_TPREL_ADD: case R_RISCV_TPREL_LO12_I: case R_RISCV_TPREL_LO12_S: - if (i + 1 != sec.relocs().size() && - sec.relocs()[i + 1].type == R_RISCV_RELAX) + if (relaxable(relocs, i)) relaxTlsLe(sec, i, loc, r, remove
[llvm-branch-commits] [clang] PR for llvm/llvm-project#79479 (PR #79596)
tstellar wrote: @topperc This backport has some failing tests that will have to be fixed up manually and submitted in another PR. https://github.com/llvm/llvm-project/pull/79596 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79451 (PR #79457)
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/79457 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79451 (PR #79457)
llvmbot wrote: @llvm/pr-subscribers-backend-amdgpu Author: None (github-actions[bot]) Changes resolves llvm/llvm-project#79451 --- Patch is 44.33 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79457.diff 8 Files Affected: - (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+37-2) - (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+49-19) - (modified) llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h (+9-23) - (modified) llvm/test/CodeGen/AMDGPU/indirect-call-known-callees.ll (-1) - (added) llvm/test/CodeGen/AMDGPU/lower-work-group-id-intrinsics-hsa.ll (+295) - (added) llvm/test/CodeGen/AMDGPU/lower-work-group-id-intrinsics-pal.ll (+187) - (removed) llvm/test/CodeGen/AMDGPU/lower-work-group-id-intrinsics.ll (-128) - (modified) llvm/test/CodeGen/AMDGPU/workgroup-id-in-arch-sgprs.ll (+50-79) ``diff diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp index 32921bb248caf07..615685822f91eeb 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp @@ -4178,10 +4178,45 @@ bool AMDGPULegalizerInfo::loadInputValue( Register DstReg, MachineIRBuilder &B, AMDGPUFunctionArgInfo::PreloadedValue ArgType) const { const SIMachineFunctionInfo *MFI = B.getMF().getInfo(); - const ArgDescriptor *Arg; + const ArgDescriptor *Arg = nullptr; const TargetRegisterClass *ArgRC; LLT ArgTy; - std::tie(Arg, ArgRC, ArgTy) = MFI->getPreloadedValue(ArgType); + + CallingConv::ID CC = B.getMF().getFunction().getCallingConv(); + const ArgDescriptor WorkGroupIDX = + ArgDescriptor::createRegister(AMDGPU::TTMP9); + // If GridZ is not programmed in an entry function then the hardware will set + // it to all zeros, so there is no need to mask the GridY value in the low + // order bits. + const ArgDescriptor WorkGroupIDY = ArgDescriptor::createRegister( + AMDGPU::TTMP7, + AMDGPU::isEntryFunctionCC(CC) && !MFI->hasWorkGroupIDZ() ? ~0u : 0xu); + const ArgDescriptor WorkGroupIDZ = + ArgDescriptor::createRegister(AMDGPU::TTMP7, 0xu); + if (ST.hasArchitectedSGPRs() && AMDGPU::isCompute(CC)) { +switch (ArgType) { +case AMDGPUFunctionArgInfo::WORKGROUP_ID_X: + Arg = &WorkGroupIDX; + ArgRC = &AMDGPU::SReg_32RegClass; + ArgTy = LLT::scalar(32); + break; +case AMDGPUFunctionArgInfo::WORKGROUP_ID_Y: + Arg = &WorkGroupIDY; + ArgRC = &AMDGPU::SReg_32RegClass; + ArgTy = LLT::scalar(32); + break; +case AMDGPUFunctionArgInfo::WORKGROUP_ID_Z: + Arg = &WorkGroupIDZ; + ArgRC = &AMDGPU::SReg_32RegClass; + ArgTy = LLT::scalar(32); + break; +default: + break; +} + } + + if (!Arg) +std::tie(Arg, ArgRC, ArgTy) = MFI->getPreloadedValue(ArgType); if (!Arg) { if (ArgType == AMDGPUFunctionArgInfo::KERNARG_SEGMENT_PTR) { diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp index d35b76c8ad54ebc..d60f511302613e1 100644 --- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp +++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp @@ -2072,11 +2072,45 @@ SDValue SITargetLowering::getPreloadedValue(SelectionDAG &DAG, const SIMachineFunctionInfo &MFI, EVT VT, AMDGPUFunctionArgInfo::PreloadedValue PVID) const { - const ArgDescriptor *Reg; + const ArgDescriptor *Reg = nullptr; const TargetRegisterClass *RC; LLT Ty; - std::tie(Reg, RC, Ty) = MFI.getPreloadedValue(PVID); + CallingConv::ID CC = DAG.getMachineFunction().getFunction().getCallingConv(); + const ArgDescriptor WorkGroupIDX = + ArgDescriptor::createRegister(AMDGPU::TTMP9); + // If GridZ is not programmed in an entry function then the hardware will set + // it to all zeros, so there is no need to mask the GridY value in the low + // order bits. + const ArgDescriptor WorkGroupIDY = ArgDescriptor::createRegister( + AMDGPU::TTMP7, + AMDGPU::isEntryFunctionCC(CC) && !MFI.hasWorkGroupIDZ() ? ~0u : 0xu); + const ArgDescriptor WorkGroupIDZ = + ArgDescriptor::createRegister(AMDGPU::TTMP7, 0xu); + if (Subtarget->hasArchitectedSGPRs() && AMDGPU::isCompute(CC)) { +switch (PVID) { +case AMDGPUFunctionArgInfo::WORKGROUP_ID_X: + Reg = &WorkGroupIDX; + RC = &AMDGPU::SReg_32RegClass; + Ty = LLT::scalar(32); + break; +case AMDGPUFunctionArgInfo::WORKGROUP_ID_Y: + Reg = &WorkGroupIDY; + RC = &AMDGPU::SReg_32RegClass; + Ty = LLT::scalar(32); + break; +case AMDGPUFunctionArgInfo::WORKGROUP_ID_Z: + Reg = &WorkGroupIDZ; + RC = &AMDGPU::SReg_32RegClass; + Ty = LLT::scalar(32); + break; +default: + break; +} + } + + if (!Reg) +std::tie(Reg, RC, Ty) = MFI.getPreloadedValue(PVID); if (!Reg) { if (PVID == AMDGPUFunctionArgInfo::PreloadedValue::KERNARG_SEGMENT_PTR) { // It's possible for a ker
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79451 (PR #79457)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/79457 >From 585d833f346e34aa5bb9d732467d71e4177a8a50 Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Wed, 24 Jan 2024 15:06:20 + Subject: [PATCH] [AMDGPU] Move architected SGPR implementation into isel (#79120) (cherry picked from commit 70fc9703788e8965813c5b677a85cb84b66671b6) --- .../lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp | 39 ++- llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 68 ++-- .../lib/Target/AMDGPU/SIMachineFunctionInfo.h | 32 +- .../AMDGPU/indirect-call-known-callees.ll | 1 - .../lower-work-group-id-intrinsics-hsa.ll | 295 ++ .../lower-work-group-id-intrinsics-pal.ll | 187 +++ .../AMDGPU/lower-work-group-id-intrinsics.ll | 128 .../AMDGPU/workgroup-id-in-arch-sgprs.ll | 129 +++- 8 files changed, 627 insertions(+), 252 deletions(-) create mode 100644 llvm/test/CodeGen/AMDGPU/lower-work-group-id-intrinsics-hsa.ll create mode 100644 llvm/test/CodeGen/AMDGPU/lower-work-group-id-intrinsics-pal.ll delete mode 100644 llvm/test/CodeGen/AMDGPU/lower-work-group-id-intrinsics.ll diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp index 32921bb248caf07..615685822f91eeb 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp @@ -4178,10 +4178,45 @@ bool AMDGPULegalizerInfo::loadInputValue( Register DstReg, MachineIRBuilder &B, AMDGPUFunctionArgInfo::PreloadedValue ArgType) const { const SIMachineFunctionInfo *MFI = B.getMF().getInfo(); - const ArgDescriptor *Arg; + const ArgDescriptor *Arg = nullptr; const TargetRegisterClass *ArgRC; LLT ArgTy; - std::tie(Arg, ArgRC, ArgTy) = MFI->getPreloadedValue(ArgType); + + CallingConv::ID CC = B.getMF().getFunction().getCallingConv(); + const ArgDescriptor WorkGroupIDX = + ArgDescriptor::createRegister(AMDGPU::TTMP9); + // If GridZ is not programmed in an entry function then the hardware will set + // it to all zeros, so there is no need to mask the GridY value in the low + // order bits. + const ArgDescriptor WorkGroupIDY = ArgDescriptor::createRegister( + AMDGPU::TTMP7, + AMDGPU::isEntryFunctionCC(CC) && !MFI->hasWorkGroupIDZ() ? ~0u : 0xu); + const ArgDescriptor WorkGroupIDZ = + ArgDescriptor::createRegister(AMDGPU::TTMP7, 0xu); + if (ST.hasArchitectedSGPRs() && AMDGPU::isCompute(CC)) { +switch (ArgType) { +case AMDGPUFunctionArgInfo::WORKGROUP_ID_X: + Arg = &WorkGroupIDX; + ArgRC = &AMDGPU::SReg_32RegClass; + ArgTy = LLT::scalar(32); + break; +case AMDGPUFunctionArgInfo::WORKGROUP_ID_Y: + Arg = &WorkGroupIDY; + ArgRC = &AMDGPU::SReg_32RegClass; + ArgTy = LLT::scalar(32); + break; +case AMDGPUFunctionArgInfo::WORKGROUP_ID_Z: + Arg = &WorkGroupIDZ; + ArgRC = &AMDGPU::SReg_32RegClass; + ArgTy = LLT::scalar(32); + break; +default: + break; +} + } + + if (!Arg) +std::tie(Arg, ArgRC, ArgTy) = MFI->getPreloadedValue(ArgType); if (!Arg) { if (ArgType == AMDGPUFunctionArgInfo::KERNARG_SEGMENT_PTR) { diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp index d35b76c8ad54ebc..d60f511302613e1 100644 --- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp +++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp @@ -2072,11 +2072,45 @@ SDValue SITargetLowering::getPreloadedValue(SelectionDAG &DAG, const SIMachineFunctionInfo &MFI, EVT VT, AMDGPUFunctionArgInfo::PreloadedValue PVID) const { - const ArgDescriptor *Reg; + const ArgDescriptor *Reg = nullptr; const TargetRegisterClass *RC; LLT Ty; - std::tie(Reg, RC, Ty) = MFI.getPreloadedValue(PVID); + CallingConv::ID CC = DAG.getMachineFunction().getFunction().getCallingConv(); + const ArgDescriptor WorkGroupIDX = + ArgDescriptor::createRegister(AMDGPU::TTMP9); + // If GridZ is not programmed in an entry function then the hardware will set + // it to all zeros, so there is no need to mask the GridY value in the low + // order bits. + const ArgDescriptor WorkGroupIDY = ArgDescriptor::createRegister( + AMDGPU::TTMP7, + AMDGPU::isEntryFunctionCC(CC) && !MFI.hasWorkGroupIDZ() ? ~0u : 0xu); + const ArgDescriptor WorkGroupIDZ = + ArgDescriptor::createRegister(AMDGPU::TTMP7, 0xu); + if (Subtarget->hasArchitectedSGPRs() && AMDGPU::isCompute(CC)) { +switch (PVID) { +case AMDGPUFunctionArgInfo::WORKGROUP_ID_X: + Reg = &WorkGroupIDX; + RC = &AMDGPU::SReg_32RegClass; + Ty = LLT::scalar(32); + break; +case AMDGPUFunctionArgInfo::WORKGROUP_ID_Y: + Reg = &WorkGroupIDY; + RC = &AMDGPU::SReg_32RegClass; + Ty = LLT::scalar(32); + break; +case AMDGPUFunctionArgInfo::WORKGROUP_ID_Z: + Reg = &WorkGroupIDZ; + RC = &AMDGPU::SReg_32RegClass; + Ty = LLT::sca
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79425 (PR #79560)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/79560 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79425 (PR #79560)
tstellar wrote: Merged: d9e26c223bdb124e1e83b73d87d7a545925fda90 7cfa0c1b7c82135ee5de1555c72d9629e5556c69 https://github.com/llvm/llvm-project/pull/79560 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] 7cfa0c1 - [TableGen] Add predicates for immediates comparison (#76004)
Author: Wang Pengcheng Date: 2024-01-26T21:28:00-08:00 New Revision: 7cfa0c1b7c82135ee5de1555c72d9629e5556c69 URL: https://github.com/llvm/llvm-project/commit/7cfa0c1b7c82135ee5de1555c72d9629e5556c69 DIFF: https://github.com/llvm/llvm-project/commit/7cfa0c1b7c82135ee5de1555c72d9629e5556c69.diff LOG: [TableGen] Add predicates for immediates comparison (#76004) These predicates can be used to represent `<`, `<=`, `>`, `>=`. And a predicate for `in range` is added. (cherry picked from commit 664a0faac464708fc061d12e5cd492fcbfea979a) Added: Modified: llvm/include/llvm/Target/TargetInstrPredicate.td llvm/utils/TableGen/PredicateExpander.cpp llvm/utils/TableGen/PredicateExpander.h Removed: diff --git a/llvm/include/llvm/Target/TargetInstrPredicate.td b/llvm/include/llvm/Target/TargetInstrPredicate.td index 82c4c7b23a49b6a..b5419cb9f3867f0 100644 --- a/llvm/include/llvm/Target/TargetInstrPredicate.td +++ b/llvm/include/llvm/Target/TargetInstrPredicate.td @@ -152,6 +152,34 @@ class CheckImmOperand_s : CheckOperandBase { string ImmVal = Value; } +// Check that the operand at position `Index` is less than `Imm`. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against `Imm`. +class CheckImmOperandLT : CheckOperandBase { + int ImmVal = Imm; +} + +// Check that the operand at position `Index` is greater than `Imm`. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against `Imm`. +class CheckImmOperandGT : CheckOperandBase { + int ImmVal = Imm; +} + +// Check that the operand at position `Index` is less than or equal to `Imm`. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against `Imm`. +class CheckImmOperandLE : CheckNot>; + +// Check that the operand at position `Index` is greater than or equal to `Imm`. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against `Imm`. +class CheckImmOperandGE : CheckNot>; + // Expands to a call to `FunctionMapper` if field `FunctionMapper` is set. // Otherwise, it expands to a CheckNot>. class CheckRegOperandSimple : CheckOperandBase; @@ -203,6 +231,12 @@ class CheckAll Sequence> class CheckAny Sequence> : CheckPredicateSequence; +// Check that the operand at position `Index` is in range [Start, End]. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against range [Start, End]. +class CheckImmOperandRange + : CheckAll<[CheckImmOperandGE, CheckImmOperandLE]>; // Used to expand the body of a function predicate. See the definition of // TIIPredicate below. diff --git a/llvm/utils/TableGen/PredicateExpander.cpp b/llvm/utils/TableGen/PredicateExpander.cpp index d3a73e02cd916f8..0b9b6389fe38171 100644 --- a/llvm/utils/TableGen/PredicateExpander.cpp +++ b/llvm/utils/TableGen/PredicateExpander.cpp @@ -59,6 +59,30 @@ void PredicateExpander::expandCheckImmOperandSimple(raw_ostream &OS, OS << ")"; } +void PredicateExpander::expandCheckImmOperandLT(raw_ostream &OS, int OpIndex, +int ImmVal, +StringRef FunctionMapper) { + if (!FunctionMapper.empty()) +OS << FunctionMapper << "("; + OS << "MI" << (isByRef() ? "." : "->") << "getOperand(" << OpIndex + << ").getImm()"; + if (!FunctionMapper.empty()) +OS << ")"; + OS << (shouldNegate() ? " >= " : " < ") << ImmVal; +} + +void PredicateExpander::expandCheckImmOperandGT(raw_ostream &OS, int OpIndex, +int ImmVal, +StringRef FunctionMapper) { + if (!FunctionMapper.empty()) +OS << FunctionMapper << "("; + OS << "MI" << (isByRef() ? "." : "->") << "getOperand(" << OpIndex + << ").getImm()"; + if (!FunctionMapper.empty()) +OS << ")"; + OS << (shouldNegate() ? " <= " : " > ") << ImmVal; +} + void PredicateExpander::expandCheckRegOperand(raw_ostream &OS, int OpIndex, const Record *Reg, StringRef FunctionMapper) { @@ -352,6 +376,16 @@ void PredicateExpander::expandPredicate(raw_ostream &OS, const Record *Rec) { Rec->getValueAsString("ImmVal"), Rec->getValueAsString("FunctionMapper")); + if (Rec->isSubClassOf("CheckImmOperandLT")) +return expandCheckImm
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79629 (PR #79673)
https://github.com/MaskRay approved this pull request. https://github.com/llvm/llvm-project/pull/79673 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79629 (PR #79673)
MaskRay wrote: LGTM https://github.com/llvm/llvm-project/pull/79673 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] PR for llvm/llvm-project#79355 (PR #79361)
llvmbot wrote: @llvm/pr-subscribers-clang Author: None (github-actions[bot]) Changes resolves llvm/llvm-project#79355 --- Full diff: https://github.com/llvm/llvm-project/pull/79361.diff 2 Files Affected: - (modified) clang/lib/AST/TemplateBase.cpp (+2-1) - (modified) clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp (+18) ``diff diff --git a/clang/lib/AST/TemplateBase.cpp b/clang/lib/AST/TemplateBase.cpp index 2bdbeb08ef20465..3310d7dc24c59d2 100644 --- a/clang/lib/AST/TemplateBase.cpp +++ b/clang/lib/AST/TemplateBase.cpp @@ -450,7 +450,8 @@ bool TemplateArgument::structurallyEquals(const TemplateArgument &Other) const { getAsIntegral() == Other.getAsIntegral(); case StructuralValue: { -if (getStructuralValueType() != Other.getStructuralValueType()) +if (getStructuralValueType().getCanonicalType() != +Other.getStructuralValueType().getCanonicalType()) return false; llvm::FoldingSetNodeID A, B; diff --git a/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp b/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp index b5b8cadc909ce00..834174cdf6a32dc 100644 --- a/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp +++ b/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp @@ -336,3 +336,21 @@ template void bar(B b) { (b.operator Tbar(), ...); } } + +namespace ReportedRegression1 { + const char kt[] = "dummy"; + + template +class SomeTempl { }; + + template +class SomeTempl { + public: +int exit_code() const { return 0; } +}; + + int use() { +SomeTempl dummy; +return dummy.exit_code(); + } +} `` https://github.com/llvm/llvm-project/pull/79361 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] PR for llvm/llvm-project#79355 (PR #79361)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/79361 >From ffe93d6f5ca33044d4118dc589571fadf7e2fc81 Mon Sep 17 00:00:00 2001 From: erichkeane Date: Wed, 24 Jan 2024 12:07:22 -0800 Subject: [PATCH] Fix comparison of Structural Values Fixes a regression from #78041 as reported in the review. The original patch failed to compare the canonical type, which this adds. A slightly modified test of the original report is added. (cherry picked from commit e3ee3762304aa81e4a240500844bfdd003401b36) --- clang/lib/AST/TemplateBase.cpp | 3 ++- .../SemaTemplate/temp_arg_nontype_cxx20.cpp| 18 ++ 2 files changed, 20 insertions(+), 1 deletion(-) diff --git a/clang/lib/AST/TemplateBase.cpp b/clang/lib/AST/TemplateBase.cpp index 2bdbeb08ef2046..3310d7dc24c59d 100644 --- a/clang/lib/AST/TemplateBase.cpp +++ b/clang/lib/AST/TemplateBase.cpp @@ -450,7 +450,8 @@ bool TemplateArgument::structurallyEquals(const TemplateArgument &Other) const { getAsIntegral() == Other.getAsIntegral(); case StructuralValue: { -if (getStructuralValueType() != Other.getStructuralValueType()) +if (getStructuralValueType().getCanonicalType() != +Other.getStructuralValueType().getCanonicalType()) return false; llvm::FoldingSetNodeID A, B; diff --git a/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp b/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp index b5b8cadc909ce0..834174cdf6a32d 100644 --- a/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp +++ b/clang/test/SemaTemplate/temp_arg_nontype_cxx20.cpp @@ -336,3 +336,21 @@ template void bar(B b) { (b.operator Tbar(), ...); } } + +namespace ReportedRegression1 { + const char kt[] = "dummy"; + + template +class SomeTempl { }; + + template +class SomeTempl { + public: +int exit_code() const { return 0; } +}; + + int use() { +SomeTempl dummy; +return dummy.exit_code(); + } +} ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [llvm] [clang] PR for llvm/llvm-project#79293 (PR #79461)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/79461 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [clang] [llvm] PR for llvm/llvm-project#79293 (PR #79461)
tstellar wrote: Merged: ed48280f8e9dd0fc2a21c3ee24c8db3fe12702f8 https://github.com/llvm/llvm-project/pull/79461 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79277 (PR #79340)
llvmbot wrote: @llvm/pr-subscribers-mc @llvm/pr-subscribers-clang Author: None (github-actions[bot]) Changes resolves llvm/llvm-project#79277 --- Patch is 116.08 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79340.diff 34 Files Affected: - (modified) clang/test/CodeGenOpenCL/amdgpu-features.cl (+2-2) - (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-fp8.cl (+18-17) - (modified) llvm/lib/Target/AMDGPU/AMDGPU.td (+1) - (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+29-2) - (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+26) - (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+10) - (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+5-2) - (modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp (+11) - (modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h (+3) - (modified) llvm/lib/Target/AMDGPU/VOP1Instructions.td (+88-5) - (modified) llvm/lib/Target/AMDGPU/VOP3Instructions.td (+49-4) - (modified) llvm/lib/Target/AMDGPU/VOPInstructions.td (+20-9) - (modified) llvm/lib/TargetParser/TargetParser.cpp (+1) - (added) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cvt.fp8.dpp.ll (+142) - (added) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cvt.fp8.dpp.mir (+197) - (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cvt.fp8.ll (+405-61) - (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop1.s (+45) - (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop1_dpp16.s (+12) - (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop1_dpp8.s (+12) - (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3.s (+36) - (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3_dpp16.s (+108) - (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3_dpp8.s (+48) - (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3_from_vop1.s (+138) - (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3_from_vop1_dpp16.s (+12) - (modified) llvm/test/MC/AMDGPU/gfx12_asm_vop3_from_vop1_dpp8.s (+12) - (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop1.txt (+36) - (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop1_dpp16.txt (+12) - (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop1_dpp8.txt (+12) - (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3.txt (+36) - (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_dpp16.txt (+108) - (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_dpp8.txt (+48) - (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_from_vop1.txt (+36) - (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_from_vop1_dpp16.txt (+12) - (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_from_vop1_dpp8.txt (+12) ``diff diff --git a/clang/test/CodeGenOpenCL/amdgpu-features.cl b/clang/test/CodeGenOpenCL/amdgpu-features.cl index 1ba2b129f6895ae..9c8ca0bb96f6125 100644 --- a/clang/test/CodeGenOpenCL/amdgpu-features.cl +++ b/clang/test/CodeGenOpenCL/amdgpu-features.cl @@ -100,8 +100,8 @@ // GFX1103: "target-features"="+16-bit-insts,+atomic-fadd-rtn-insts,+ci-insts,+dl-insts,+dot10-insts,+dot5-insts,+dot7-insts,+dot8-insts,+dot9-insts,+dpp,+gfx10-3-insts,+gfx10-insts,+gfx11-insts,+gfx8-insts,+gfx9-insts,+wavefrontsize32" // GFX1150: "target-features"="+16-bit-insts,+atomic-fadd-rtn-insts,+ci-insts,+dl-insts,+dot10-insts,+dot5-insts,+dot7-insts,+dot8-insts,+dot9-insts,+dpp,+gfx10-3-insts,+gfx10-insts,+gfx11-insts,+gfx8-insts,+gfx9-insts,+wavefrontsize32" // GFX1151: "target-features"="+16-bit-insts,+atomic-fadd-rtn-insts,+ci-insts,+dl-insts,+dot10-insts,+dot5-insts,+dot7-insts,+dot8-insts,+dot9-insts,+dpp,+gfx10-3-insts,+gfx10-insts,+gfx11-insts,+gfx8-insts,+gfx9-insts,+wavefrontsize32" -// GFX1200: "target-features"="+16-bit-insts,+atomic-buffer-global-pk-add-f16-insts,+atomic-ds-pk-add-16-insts,+atomic-fadd-rtn-insts,+atomic-flat-pk-add-16-insts,+atomic-global-pk-add-bf16-inst,+ci-insts,+dl-insts,+dot10-insts,+dot7-insts,+dot8-insts,+dot9-insts,+dpp,+gfx10-3-insts,+gfx10-insts,+gfx11-insts,+gfx12-insts,+gfx8-insts,+gfx9-insts,+wavefrontsize32" -// GFX1201: "target-features"="+16-bit-insts,+atomic-buffer-global-pk-add-f16-insts,+atomic-ds-pk-add-16-insts,+atomic-fadd-rtn-insts,+atomic-flat-pk-add-16-insts,+atomic-global-pk-add-bf16-inst,+ci-insts,+dl-insts,+dot10-insts,+dot7-insts,+dot8-insts,+dot9-insts,+dpp,+gfx10-3-insts,+gfx10-insts,+gfx11-insts,+gfx12-insts,+gfx8-insts,+gfx9-insts,+wavefrontsize32" +// GFX1200: "target-features"="+16-bit-insts,+atomic-buffer-global-pk-add-f16-insts,+atomic-ds-pk-add-16-insts,+atomic-fadd-rtn-insts,+atomic-flat-pk-add-16-insts,+atomic-global-pk-add-bf16-inst,+ci-insts,+dl-insts,+dot10-insts,+dot7-insts,+dot8-insts,+dot9-insts,+dpp,+fp8-conversion-insts,+gfx10-3-insts,+gfx10-insts,+gfx11-insts,+gfx12-insts,+gfx8-insts,+gfx9-insts,+wavefrontsize32" +// GFX1201: "target-features"="+16-bit-insts,+atomic-buffer-global-pk-add-f16-insts,+atomic-ds-pk-add-16-insts,+atomic-fadd-rtn-insts,+atomic-flat-pk-add-16-insts,+atomic-global-pk-add-bf16-ins
[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79632 (PR #79633)
tstellar wrote: Merged: aa4cb0e313a621bb9c2e5849de82684ed3c73c29 https://github.com/llvm/llvm-project/pull/79633 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79632 (PR #79633)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/79633 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] aa4cb0e - [Driver, CodeGen] Support -mtls-dialect= (#79256)
Author: Fangrui Song Date: 2024-01-26T19:51:03-08:00 New Revision: aa4cb0e313a621bb9c2e5849de82684ed3c73c29 URL: https://github.com/llvm/llvm-project/commit/aa4cb0e313a621bb9c2e5849de82684ed3c73c29 DIFF: https://github.com/llvm/llvm-project/commit/aa4cb0e313a621bb9c2e5849de82684ed3c73c29.diff LOG: [Driver,CodeGen] Support -mtls-dialect= (#79256) GCC supports -mtls-dialect= for several architectures to select TLSDESC. This patch supports the following values * x86: "gnu". "gnu2" (TLSDESC) is not supported yet. * RISC-V: "trad" (general dynamic), "desc" (TLSDESC, see #66915) AArch64 toolchains seem to support TLSDESC from the beginning, and the general dynamic model has poor support. Nobody seems to use the option -mtls-dialect= at all, so we don't bother with it. There also seems very little interest in AArch32's TLSDESC support. TLSDESC does not change IR, but affects object file generation. Without a backend option the option is a no-op for in-process ThinLTO. There seems no motivation to have fine-grained control mixing trad/desc for TLS, so we just pass -mllvm, and don't bother with a modules flag metadata or function attribute. Co-authored-by: Paul Kirth (cherry picked from commit 36b4a9ccd9f7e04010476e6b2a311f2052a4ac20) Added: clang/test/CodeGen/RISCV/tls-dialect.c clang/test/Driver/tls-dialect.c Modified: clang/include/clang/Basic/CodeGenOptions.def clang/include/clang/Driver/Options.td clang/lib/CodeGen/BackendUtil.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/lib/Driver/ToolChains/CommonArgs.cpp clang/lib/Driver/ToolChains/CommonArgs.h llvm/include/llvm/TargetParser/Triple.h Removed: diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index 2f2e45d5cf63df..7c0bfe32849614 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -369,6 +369,9 @@ ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, llvm::driver::VectorLibr /// The default TLS model to use. ENUM_CODEGENOPT(DefaultTLSModel, TLSModel, 2, GeneralDynamicTLSModel) +/// Whether to enable TLSDESC. AArch64 enables TLSDESC regardless of this value. +CODEGENOPT(EnableTLSDESC, 1, 0) + /// Bit size of immediate TLS offsets (0 == use the default). VALUE_CODEGENOPT(TLSSize, 8, 0) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 7f4fa33748faca..773bc1dcda01d5 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -4419,6 +4419,8 @@ def mtls_size_EQ : Joined<["-"], "mtls-size=">, Group, HelpText<"Specify bit size of immediate TLS offsets (AArch64 ELF only): " "12 (for 4KB) | 24 (for 16MB, default) | 32 (for 4GB) | 48 (for 256TB, needs -mcmodel=large)">, MarshallingInfoInt>; +def mtls_dialect_EQ : Joined<["-"], "mtls-dialect=">, Group, + Flags<[TargetSpecific]>, HelpText<"Which thread-local storage dialect to use for dynamic accesses of TLS variables">; def mimplicit_it_EQ : Joined<["-"], "mimplicit-it=">, Group; def mdefault_build_attributes : Joined<["-"], "mdefault-build-attributes">, Group; def mno_default_build_attributes : Joined<["-"], "mno-default-build-attributes">, Group; @@ -7066,6 +7068,9 @@ def fexperimental_assignment_tracking_EQ : Joined<["-"], "fexperimental-assignme Values<"disabled,enabled,forced">, NormalizedValues<["Disabled","Enabled","Forced"]>, MarshallingInfoEnum, "Enabled">; +def enable_tlsdesc : Flag<["-"], "enable-tlsdesc">, + MarshallingInfoFlag>; + } // let Visibility = [CC1Option] //===--===// diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index ec203f6f28bc17..7877e20d77f772 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -401,6 +401,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags, Options.UniqueBasicBlockSectionNames = CodeGenOpts.UniqueBasicBlockSectionNames; Options.TLSSize = CodeGenOpts.TLSSize; + Options.EnableTLSDESC = CodeGenOpts.EnableTLSDESC; Options.EmulatedTLS = CodeGenOpts.EmulatedTLS; Options.DebuggerTuning = CodeGenOpts.getDebuggerTuning(); Options.EmitStackSizeSection = CodeGenOpts.StackSizeSection; diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 5dc614e11aab59..8092fc050b0ee6 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -5822,6 +5822,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, Args.AddLastArg(CmdArgs, options::OPT_mtls_size_EQ); } + if (isTLSDESCEnabled(TC, Args)) +CmdArgs.push_back("-enable-tlsdesc"); + // Add the target cpu std::string CPU = getCPUName(D, Args, Triple, /*FromAs*/
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79629 (PR #79673)
llvmbot wrote: @llvm/pr-subscribers-backend-x86 Author: None (llvmbot) Changes resolves llvm/llvm-project#79629 --- Full diff: https://github.com/llvm/llvm-project/pull/79673.diff 2 Files Affected: - (modified) llvm/lib/Target/X86/X86AsmPrinter.cpp (-1) - (added) llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll (+30) ``diff diff --git a/llvm/lib/Target/X86/X86AsmPrinter.cpp b/llvm/lib/Target/X86/X86AsmPrinter.cpp index 9f0fd4d0938e97f..87ec8aa23080e00 100644 --- a/llvm/lib/Target/X86/X86AsmPrinter.cpp +++ b/llvm/lib/Target/X86/X86AsmPrinter.cpp @@ -877,7 +877,6 @@ void X86AsmPrinter::emitStartOfAsmFile(Module &M) { OutStreamer->emitInt32(FeatureFlagsAnd);// data emitAlignment(WordSize == 4 ? Align(4) : Align(8)); // padding - OutStreamer->endSection(Nt); OutStreamer->switchSection(Cur); } } diff --git a/llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll b/llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll new file mode 100644 index 000..a0e5b4add1b386e --- /dev/null +++ b/llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll @@ -0,0 +1,30 @@ +; RUN: llc -mtriple x86_64-unknown-linux-gnu %s -o %t.o -filetype=obj +; RUN: llvm-readobj -n %t.o | FileCheck %s + +module asm ".pushsection \22.note.gnu.property\22,\22a\22,@note" +module asm " .p2align 3" +module asm " .long 1f - 0f" +module asm " .long 4f - 1f" +module asm " .long 5" +module asm "0: .asciz \22GNU\22" +module asm "1: .p2align 3" +module asm " .long 0xc0008002" +module asm " .long 3f - 2f" +module asm "2: .long ((1U << 0) | 0 | 0 | 0)" +module asm "3: .p2align 3" +module asm "4:" +module asm " .popsection" + +!llvm.module.flags = !{!0, !1} + +!0 = !{i32 4, !"cf-protection-return", i32 1} +!1 = !{i32 4, !"cf-protection-branch", i32 1} + +; CHECK: Type: NT_GNU_PROPERTY_TYPE_0 +; CHECK-NEXT: Property [ +; CHECK-NEXT: x86 feature: IBT, SHSTK +; CHECK-NEXT: ] +; CHECK: Type: NT_GNU_PROPERTY_TYPE_0 +; CHECK-NEXT: Property [ +; CHECK-NEXT: x86 ISA needed: x86-64-baseline +; CHECK-NEXT: ] `` https://github.com/llvm/llvm-project/pull/79673 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79629 (PR #79673)
llvmbot wrote: @MaskRay What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/79673 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79629 (PR #79673)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/79673 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79629 (PR #79673)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/79673 resolves llvm/llvm-project#79629 >From ee4face03101cd785a668bd1cb78e2332cc5b248 Mon Sep 17 00:00:00 2001 From: Adhemerval Zanella Date: Fri, 26 Jan 2024 10:33:47 -0800 Subject: [PATCH] [X86] Do not end 'note.gnu.property' section with -fcf-protection (#79360) The glibc now adds the required minimum ISA level for libc-nonshared.a (linked on all programs) and this is done with an inline asm along with .note.gnu.property and .pushsection/.popsection. However, the x86 backend always ends the 'note.gnu.property' section when building with -fcf-protection, leading to assert failure: llvm/llvm-project-git/llvm/lib/MC/MCStreamer.cpp:1251: virtual void llvm::MCStreamer::switchSection(llvm::MCSection*, const llvm::MCExpr*): Assertion `!Section->hasEnded() && "Section already ended"' failed. [1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86/isa-level.c;h=3f1b269848a52f994275bab6f60dded3ded6b144;hb=HEAD (cherry picked from commit a58c62fa824fd24d20fa2366e0ec8f241cb321fe) --- llvm/lib/Target/X86/X86AsmPrinter.cpp | 1 - .../X86/note-cet-property-inlineasm.ll| 30 +++ 2 files changed, 30 insertions(+), 1 deletion(-) create mode 100644 llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll diff --git a/llvm/lib/Target/X86/X86AsmPrinter.cpp b/llvm/lib/Target/X86/X86AsmPrinter.cpp index 9f0fd4d0938e97f..87ec8aa23080e00 100644 --- a/llvm/lib/Target/X86/X86AsmPrinter.cpp +++ b/llvm/lib/Target/X86/X86AsmPrinter.cpp @@ -877,7 +877,6 @@ void X86AsmPrinter::emitStartOfAsmFile(Module &M) { OutStreamer->emitInt32(FeatureFlagsAnd);// data emitAlignment(WordSize == 4 ? Align(4) : Align(8)); // padding - OutStreamer->endSection(Nt); OutStreamer->switchSection(Cur); } } diff --git a/llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll b/llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll new file mode 100644 index 000..a0e5b4add1b386e --- /dev/null +++ b/llvm/test/CodeGen/X86/note-cet-property-inlineasm.ll @@ -0,0 +1,30 @@ +; RUN: llc -mtriple x86_64-unknown-linux-gnu %s -o %t.o -filetype=obj +; RUN: llvm-readobj -n %t.o | FileCheck %s + +module asm ".pushsection \22.note.gnu.property\22,\22a\22,@note" +module asm " .p2align 3" +module asm " .long 1f - 0f" +module asm " .long 4f - 1f" +module asm " .long 5" +module asm "0: .asciz \22GNU\22" +module asm "1: .p2align 3" +module asm " .long 0xc0008002" +module asm " .long 3f - 2f" +module asm "2: .long ((1U << 0) | 0 | 0 | 0)" +module asm "3: .p2align 3" +module asm "4:" +module asm " .popsection" + +!llvm.module.flags = !{!0, !1} + +!0 = !{i32 4, !"cf-protection-return", i32 1} +!1 = !{i32 4, !"cf-protection-branch", i32 1} + +; CHECK: Type: NT_GNU_PROPERTY_TYPE_0 +; CHECK-NEXT: Property [ +; CHECK-NEXT: x86 feature: IBT, SHSTK +; CHECK-NEXT: ] +; CHECK: Type: NT_GNU_PROPERTY_TYPE_0 +; CHECK-NEXT: Property [ +; CHECK-NEXT: x86 ISA needed: x86-64-baseline +; CHECK-NEXT: ] ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] PR for llvm/llvm-project#79511 (PR #79513)
llvmbot wrote: @llvm/pr-subscribers-clang Author: None (github-actions[bot]) Changes resolves llvm/llvm-project#79511 --- Full diff: https://github.com/llvm/llvm-project/pull/79513.diff 2 Files Affected: - (modified) clang/lib/Driver/Driver.cpp (+3-3) - (modified) clang/test/Driver/fat-lto-objects.c (+9-3) ``diff diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp index 7109faa1072de5f..93cddf742d521d2 100644 --- a/clang/lib/Driver/Driver.cpp +++ b/clang/lib/Driver/Driver.cpp @@ -4764,9 +4764,9 @@ Action *Driver::ConstructPhaseAction( case phases::Backend: { if (isUsingLTO() && TargetDeviceOffloadKind == Action::OFK_None) { types::ID Output; - if (Args.hasArg(options::OPT_ffat_lto_objects)) -Output = Args.hasArg(options::OPT_emit_llvm) ? types::TY_LTO_IR - : types::TY_PP_Asm; + if (Args.hasArg(options::OPT_ffat_lto_objects) && + !Args.hasArg(options::OPT_emit_llvm)) +Output = types::TY_PP_Asm; else if (Args.hasArg(options::OPT_S)) Output = types::TY_LTO_IR; else diff --git a/clang/test/Driver/fat-lto-objects.c b/clang/test/Driver/fat-lto-objects.c index 97002db6edc51e5..d9a5ba88ea6d6f5 100644 --- a/clang/test/Driver/fat-lto-objects.c +++ b/clang/test/Driver/fat-lto-objects.c @@ -23,11 +23,17 @@ // CHECK-CC-S-EL-LTO-SAME: -emit-llvm // CHECK-CC-S-EL-LTO-SAME: -ffat-lto-objects -/// When fat LTO is enabled wihtout -S we expect native object output and -ffat-lto-object to be passed to cc1. +/// When fat LTO is enabled without -S we expect native object output and -ffat-lto-object to be passed to cc1. // RUN: %clang --target=x86_64-unknown-linux-gnu -flto -ffat-lto-objects -### %s -c 2>&1 | FileCheck %s -check-prefix=CHECK-CC-C-LTO // CHECK-CC-C-LTO: -cc1 -// CHECK-CC-C-LTO: -emit-obj -// CHECK-CC-C-LTO: -ffat-lto-objects +// CHECK-CC-C-LTO-SAME: -emit-obj +// CHECK-CC-C-LTO-SAME: -ffat-lto-objects + +/// When fat LTO is enabled with -c and -emit-llvm we expect bitcode output and -ffat-lto-object to be passed to cc1. +// RUN: %clang --target=x86_64-unknown-linux-gnu -flto -ffat-lto-objects -### %s -c -emit-llvm 2>&1 | FileCheck %s -check-prefix=CHECK-CC-C-EL-LTO +// CHECK-CC-C-EL-LTO: -cc1 +// CHECK-CC-C-EL-LTO-SAME: -emit-llvm-bc +// CHECK-CC-C-EL-LTO-SAME: -ffat-lto-objects /// Make sure we don't have a warning for -ffat-lto-objects being unused // RUN: %clang --target=x86_64-unknown-linux-gnu -ffat-lto-objects -fdriver-only -Werror -v %s -c 2>&1 | FileCheck %s -check-prefix=CHECK-CC-NOLTO `` https://github.com/llvm/llvm-project/pull/79513 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] PR for llvm/llvm-project#79511 (PR #79513)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/79513 >From 3c527f29181e349d054bdf39807a97df2e85ef02 Mon Sep 17 00:00:00 2001 From: Sean Fertile <35576261+mandle...@users.noreply.github.com> Date: Thu, 25 Jan 2024 10:50:59 -0500 Subject: [PATCH] [LTO] Fix fat-lto output for -c -emit-llvm. (#79404) Fix and add a test case for combining '-ffat-lto-objects -c -emit-llvm' options and fix a spelling mistake in same test. (cherry picked from commit f1b1611148fa533fe198fec3fa4ef8139224dc80) --- clang/lib/Driver/Driver.cpp | 6 +++--- clang/test/Driver/fat-lto-objects.c | 12 +--- 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp index 7109faa1072de5f..93cddf742d521d2 100644 --- a/clang/lib/Driver/Driver.cpp +++ b/clang/lib/Driver/Driver.cpp @@ -4764,9 +4764,9 @@ Action *Driver::ConstructPhaseAction( case phases::Backend: { if (isUsingLTO() && TargetDeviceOffloadKind == Action::OFK_None) { types::ID Output; - if (Args.hasArg(options::OPT_ffat_lto_objects)) -Output = Args.hasArg(options::OPT_emit_llvm) ? types::TY_LTO_IR - : types::TY_PP_Asm; + if (Args.hasArg(options::OPT_ffat_lto_objects) && + !Args.hasArg(options::OPT_emit_llvm)) +Output = types::TY_PP_Asm; else if (Args.hasArg(options::OPT_S)) Output = types::TY_LTO_IR; else diff --git a/clang/test/Driver/fat-lto-objects.c b/clang/test/Driver/fat-lto-objects.c index 97002db6edc51e5..d9a5ba88ea6d6f5 100644 --- a/clang/test/Driver/fat-lto-objects.c +++ b/clang/test/Driver/fat-lto-objects.c @@ -23,11 +23,17 @@ // CHECK-CC-S-EL-LTO-SAME: -emit-llvm // CHECK-CC-S-EL-LTO-SAME: -ffat-lto-objects -/// When fat LTO is enabled wihtout -S we expect native object output and -ffat-lto-object to be passed to cc1. +/// When fat LTO is enabled without -S we expect native object output and -ffat-lto-object to be passed to cc1. // RUN: %clang --target=x86_64-unknown-linux-gnu -flto -ffat-lto-objects -### %s -c 2>&1 | FileCheck %s -check-prefix=CHECK-CC-C-LTO // CHECK-CC-C-LTO: -cc1 -// CHECK-CC-C-LTO: -emit-obj -// CHECK-CC-C-LTO: -ffat-lto-objects +// CHECK-CC-C-LTO-SAME: -emit-obj +// CHECK-CC-C-LTO-SAME: -ffat-lto-objects + +/// When fat LTO is enabled with -c and -emit-llvm we expect bitcode output and -ffat-lto-object to be passed to cc1. +// RUN: %clang --target=x86_64-unknown-linux-gnu -flto -ffat-lto-objects -### %s -c -emit-llvm 2>&1 | FileCheck %s -check-prefix=CHECK-CC-C-EL-LTO +// CHECK-CC-C-EL-LTO: -cc1 +// CHECK-CC-C-EL-LTO-SAME: -emit-llvm-bc +// CHECK-CC-C-EL-LTO-SAME: -ffat-lto-objects /// Make sure we don't have a warning for -ffat-lto-objects being unused // RUN: %clang --target=x86_64-unknown-linux-gnu -ffat-lto-objects -fdriver-only -Werror -v %s -c 2>&1 | FileCheck %s -check-prefix=CHECK-CC-NOLTO ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Change check for embedded llvm version number to a regex to make test more flexible. (#79528) (PR #79642)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/79642 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] 147c623 - Change check for embedded llvm version number to a regex to make test more flexible. (#79528) (#79642)
Author: dyung Date: 2024-01-26T19:27:28-08:00 New Revision: 147c623a86b39d6bc9993293487b5773de943dad URL: https://github.com/llvm/llvm-project/commit/147c623a86b39d6bc9993293487b5773de943dad DIFF: https://github.com/llvm/llvm-project/commit/147c623a86b39d6bc9993293487b5773de943dad.diff LOG: Change check for embedded llvm version number to a regex to make test more flexible. (#79528) (#79642) This test started to fail when LLVM created the release/18.x branch and the main branch subsequently had the version number increased from 18 to 19. I investigated this failure (it was blocking our internal automation) and discovered that the CHECK statement on line 27 seemed to have the compiler version number (1800) encoded in octal that it was checking for. I don't know if this is something that explicitly needs to be checked, so I am leaving it in, but it should be more flexible so the test doesn't fail anytime the version number is changed. To accomplish that, I changed the check for the 4-digit version number to be a regex. I originally updated this test for the 18->19 transition in a01195ff5cc3d7fd084743b1f47007645bb385f4. This change makes the CHECK line more flexible so it doesn't need to be continually updated. (cherry picked from commit 45f883ed06f39fba7557dfbbff4d10595b45f874) Added: Modified: llvm/test/CodeGen/SystemZ/zos-ppa2.ll Removed: diff --git a/llvm/test/CodeGen/SystemZ/zos-ppa2.ll b/llvm/test/CodeGen/SystemZ/zos-ppa2.ll index f54f654b804a239..60580aeb6d83cc7 100644 --- a/llvm/test/CodeGen/SystemZ/zos-ppa2.ll +++ b/llvm/test/CodeGen/SystemZ/zos-ppa2.ll @@ -24,7 +24,7 @@ ; CHECK:.byte 0 ; CHECK:.byte 3 ; CHECK:.short 30 -; CHECK:.ascii "\323\323\345\324@@\361\370\360\360\361\371\367\360\360\361\360\361\360\360\360\360\360\360\360\360" +; CHECK:.ascii "\323\323\345\324@@{{((\\3[0-7]{2}){4})}}\361\371\367\360\360\361\360\361\360\360\360\360\360\360\360\360" define void @void_test() { entry: ret void ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [llvm] [ThinLTO][TypeProf] Implement vtable def import (PR #79381)
@@ -1285,46 +1285,44 @@ void annotateValueSite(Module &M, Instruction &Inst, Inst.setMetadata(LLVMContext::MD_prof, MDNode::get(Ctx, Vals)); } -bool getValueProfDataFromInst(const Instruction &Inst, - InstrProfValueKind ValueKind, - uint32_t MaxNumValueData, - InstrProfValueData ValueData[], - uint32_t &ActualNumValueData, uint64_t &TotalC, - bool GetNoICPValue) { +MDNode *mayHaveValueProfileOfKind(const Instruction &Inst, + InstrProfValueKind ValueKind) { MDNode *MD = Inst.getMetadata(LLVMContext::MD_prof); if (!MD) -return false; +return nullptr; - unsigned NOps = MD->getNumOperands(); + if (MD->getNumOperands() < 5) +return nullptr; - if (NOps < 5) -return false; - - // Operand 0 is a string tag "VP": MDString *Tag = cast(MD->getOperand(0)); - if (!Tag) -return false; - - if (!Tag->getString().equals("VP")) -return false; + if (!Tag || !Tag->getString().equals("VP")) +return nullptr; // Now check kind: ConstantInt *KindInt = mdconst::dyn_extract(MD->getOperand(1)); if (!KindInt) -return false; +return nullptr; if (KindInt->getZExtValue() != ValueKind) -return false; +return nullptr; + + return MD; +} +static bool getValueProfDataFromInst(const MDNode *const MD, minglotus-6 wrote: sounds good! Done. https://github.com/llvm/llvm-project/pull/79381 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] [llvm] [ThinLTO][TypeProf] Implement vtable def import (PR #79381)
https://github.com/minglotus-6 updated https://github.com/llvm/llvm-project/pull/79381 >From d4caa0997799b712edb11d90c5be79d0aab3c312 Mon Sep 17 00:00:00 2001 From: mingmingl Date: Thu, 25 Jan 2024 13:59:03 -0800 Subject: [PATCH 1/4] Introduce an opton to control the number of vtables to annotate and use it when generating function summaries. Created using spr 1.3.4 --- .../IndirectCallPromotionAnalysis.cpp | 3 +++ llvm/lib/Analysis/ModuleSummaryAnalysis.cpp | 12 - .../thinlto-func-summary-vtableref-pgo.ll | 25 --- 3 files changed, 24 insertions(+), 16 deletions(-) diff --git a/llvm/lib/Analysis/IndirectCallPromotionAnalysis.cpp b/llvm/lib/Analysis/IndirectCallPromotionAnalysis.cpp index ebfa1c8fc08e1c..18cb6a220e3bd0 100644 --- a/llvm/lib/Analysis/IndirectCallPromotionAnalysis.cpp +++ b/llvm/lib/Analysis/IndirectCallPromotionAnalysis.cpp @@ -45,6 +45,9 @@ static cl::opt cl::desc("Max number of promotions for a single indirect " "call callsite")); +cl::opt MaxNumVTableAnnotations("icp-max-num-vtables", cl::init(6), cl::Hidden, + cl::desc("Max number of vtables annotated for a vtable load instruction.")); + ICallPromotionAnalysis::ICallPromotionAnalysis() { ValueDataArray = std::make_unique(MaxNumPromotions); } diff --git a/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp b/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp index fc8c31de0f4501..0f0085025cc56b 100644 --- a/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp +++ b/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp @@ -82,6 +82,8 @@ static cl::opt ModuleSummaryDotFile( extern cl::opt ScalePartialSampleProfileWorkingSetSize; +extern cl::opt MaxNumVTableAnnotations; + // Walk through the operands of a given User via worklist iteration and populate // the set of GlobalValue references encountered. Invoked either on an // Instruction or a GlobalVariable (which walks its initializer). @@ -129,14 +131,10 @@ static bool findRefEdges(ModuleSummaryIndex &Index, const User *CurUser, if (I) { uint32_t ActualNumValueData = 0; uint64_t TotalCount = 0; -// 24 is the maximum number of values preserved for one instrumented site, -// defined by INSTR_PROF_DEFAULT_NUM_VAL_PER_SITE in -// compiler-rt/lib/profile/InstrProfilingValue.c; passing 24 as -// `MaxNumValueData` controls the max number of elements in the returned -// array. The actual number of values is gated by the number of ops in !prof -// metadata. +// MaxNumVTableAnnotations is the maximum number of vtables annotated on +// the instruction. auto ValueDataArray = getValueProfDataFromInst( -*I, IPVK_VTableTarget, 24 /* MaxNumValueData */, ActualNumValueData, +*I, IPVK_VTableTarget, MaxNumVTableAnnotations /* MaxNumValueData */, ActualNumValueData, TotalCount); if (ValueDataArray.get()) { diff --git a/llvm/test/Bitcode/thinlto-func-summary-vtableref-pgo.ll b/llvm/test/Bitcode/thinlto-func-summary-vtableref-pgo.ll index 28e4b1d19aef72..ba3ce9a75ee832 100644 --- a/llvm/test/Bitcode/thinlto-func-summary-vtableref-pgo.ll +++ b/llvm/test/Bitcode/thinlto-func-summary-vtableref-pgo.ll @@ -1,4 +1,8 @@ -; RUN: opt -module-summary %s -o %t.o +; Promote at most one function and annotate at most one vtable. +; As a result, only one value (of each relevant kind) shows up in the function +; summary. + +; RUN: opt -module-summary -icp-max-num-vtables=1 -icp-max-prom=1 %s -o %t.o ; RUN: llvm-bcanalyzer -dump %t.o | FileCheck %s @@ -11,15 +15,17 @@ ; CHECK-NEXT: ; The `VALUE_GUID` below represents the "_ZTV4Base" referenced by the instruction ; that loads vtable pointers. -; CHECK-NEXT: +; CHECK-NEXT: ; The `VALUE_GUID` below represents the "_ZN4Base4funcEv" referenced by the ; indirect call instruction. -; CHECK-NEXT: +; CHECK-NEXT: +; NOTE vtables and functions from Derived class is dropped because +; `-icp-max-num-vtables` and `-icp-max-prom` are both set to one. ; has the format [valueid, flags, instcount, funcflags, ; numrefs, rorefcnt, worefcnt, ; m x valueid, ; n x (valueid, hotness+tailcall)] -; CHECK-NEXT: +; CHECK-NEXT: ; CHECK-NEXT: target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" @@ -36,7 +42,6 @@ define i32 @_Z4testP4Base(ptr %0) !prof !15 { !llvm.module.flags = !{!1} - !1 = !{i32 1, !"ProfileSummary", !2} !2 = !{!3, !4, !5, !6, !7, !8, !9, !10} !3 = !{!"ProfileFormat", !"InstrProf"} @@ -53,10 +58,12 @@ define i32 @_Z4testP4Base(ptr %0) !prof !15 { !14 = !{i32 99, i64 1, i32 2} !15 = !{!"function_entry_count", i32 150} -; 1960855528937986108 is the MD5 hash of _ZTV4Base -!16 = !{!"VP", i32 2, i64 1600, i64 1960855528937986108, i64 1600} -; 5459407273543877811 is the MD5 hash of _ZN4Base4funcEv -!17 = !{!"VP", i32 0, i64 1600, i64 5459407273543877811
[llvm-branch-commits] [llvm] Refactor recomputeLiveIns to operate on whole CFG (#79498) (PR #79641)
https://github.com/yozhu approved this pull request. https://github.com/llvm/llvm-project/pull/79641 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Change check for embedded llvm version number to a regex to make test more flexible. (#79528) (PR #79642)
https://github.com/tstellar milestoned https://github.com/llvm/llvm-project/pull/79642 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Change check for embedded llvm version number to a regex to make test more flexible. (#79528) (PR #79642)
https://github.com/dyung created https://github.com/llvm/llvm-project/pull/79642 This test started to fail when LLVM created the release/18.x branch and the main branch subsequently had the version number increased from 18 to 19. I investigated this failure (it was blocking our internal automation) and discovered that the CHECK statement on line 27 seemed to have the compiler version number (1800) encoded in octal that it was checking for. I don't know if this is something that explicitly needs to be checked, so I am leaving it in, but it should be more flexible so the test doesn't fail anytime the version number is changed. To accomplish that, I changed the check for the 4-digit version number to be a regex. I originally updated this test for the 18->19 transition in a01195ff5cc3d7fd084743b1f47007645bb385f4. This change makes the CHECK line more flexible so it doesn't need to be continually updated. (cherry picked from commit 45f883ed06f39fba7557dfbbff4d10595b45f874) >From d787fef6a16dfc7e3b06bec5fc4c9e2d22180d5a Mon Sep 17 00:00:00 2001 From: dyung Date: Fri, 26 Jan 2024 09:36:20 -0800 Subject: [PATCH] Change check for embedded llvm version number to a regex to make test more flexible. (#79528) This test started to fail when LLVM created the release/18.x branch and the main branch subsequently had the version number increased from 18 to 19. I investigated this failure (it was blocking our internal automation) and discovered that the CHECK statement on line 27 seemed to have the compiler version number (1800) encoded in octal that it was checking for. I don't know if this is something that explicitly needs to be checked, so I am leaving it in, but it should be more flexible so the test doesn't fail anytime the version number is changed. To accomplish that, I changed the check for the 4-digit version number to be a regex. I originally updated this test for the 18->19 transition in a01195ff5cc3d7fd084743b1f47007645bb385f4. This change makes the CHECK line more flexible so it doesn't need to be continually updated. (cherry picked from commit 45f883ed06f39fba7557dfbbff4d10595b45f874) --- llvm/test/CodeGen/SystemZ/zos-ppa2.ll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/llvm/test/CodeGen/SystemZ/zos-ppa2.ll b/llvm/test/CodeGen/SystemZ/zos-ppa2.ll index f54f654b804a239..60580aeb6d83cc7 100644 --- a/llvm/test/CodeGen/SystemZ/zos-ppa2.ll +++ b/llvm/test/CodeGen/SystemZ/zos-ppa2.ll @@ -24,7 +24,7 @@ ; CHECK:.byte 0 ; CHECK:.byte 3 ; CHECK:.short 30 -; CHECK:.ascii "\323\323\345\324@@\361\370\360\360\361\371\367\360\360\361\360\361\360\360\360\360\360\360\360\360" +; CHECK:.ascii "\323\323\345\324@@{{((\\3[0-7]{2}){4})}}\361\371\367\360\360\361\360\361\360\360\360\360\360\360\360\360" define void @void_test() { entry: ret void ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Refactor recomputeLiveIns to operate on whole CFG (#79498) (PR #79641)
llvmbot wrote: @llvm/pr-subscribers-backend-arm Author: Oskar Wirga (oskarwirga) Changes I would like to backport these changes to the 18.x branch because that is where stack-clash-protection was first introduced and this patch fixes a subtle register allocation bug that occurs due to incorrect ordering of recomputeLiveIns. Currently, the way that recomputeLiveIns works is that it will recompute the livein registers for that MachineBasicBlock but it matters what order you call recomputeLiveIn which can result in incorrect register allocations down the line. This PR fixes that by simply recomputing the liveins for the entire CFG until convergence is achieved. This makes it harder to introduce subtle bugs which alter liveness. (cherry picked from commit 59bf60519fc30d9d36c86abd83093b068f6b1e4b) --- Patch is 297.53 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79641.diff 30 Files Affected: - (modified) llvm/include/llvm/CodeGen/LivePhysRegs.h (+23-2) - (modified) llvm/include/llvm/CodeGen/MachineBasicBlock.h (+6) - (modified) llvm/lib/CodeGen/BranchFolding.cpp (+1-2) - (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.cpp (+1-2) - (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+1-3) - (modified) llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp (+1-6) - (modified) llvm/lib/Target/PowerPC/PPCExpandAtomicPseudoInsts.cpp (+2-5) - (modified) llvm/lib/Target/PowerPC/PPCFrameLowering.cpp (+2-4) - (modified) llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp (+2-4) - (modified) llvm/lib/Target/X86/X86FrameLowering.cpp (+2-6) - (modified) llvm/test/CodeGen/AArch64/stack-probing-last-in-block.mir (+1-1) - (modified) llvm/test/CodeGen/SystemZ/branch-folder-hoist-livein.mir (+25-11) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-default.mir (+102-90) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-optsize-strd-lr.mir (+89-78) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-optsize.mir (+96-87) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/it-block-mov.mir (+87-74) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-copy-chain.mir (+138-120) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-copy-prev-iteration.mir (+155-130) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-liveout.mir (+154-129) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/matrix-debug.mir (+80-70) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/matrix.mir (+173-144) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/mov-after-dlstp.mir (+58-56) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/multi-block-cond-iter-count.mir (+132-112) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/multiple-do-loops.mir (+233-193) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/predicated-liveout.mir (+38-29) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/remove-elem-moves.mir (+97-79) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/skip-debug.mir (+65-56) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/skip-vpt-debug.mir (+71-68) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/spillingmove.mir (+7-7) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/unrolled-and-vector.mir (+154-130) ``diff diff --git a/llvm/include/llvm/CodeGen/LivePhysRegs.h b/llvm/include/llvm/CodeGen/LivePhysRegs.h index 76bb34d270a26dc..1e0ee9eb9eb3203 100644 --- a/llvm/include/llvm/CodeGen/LivePhysRegs.h +++ b/llvm/include/llvm/CodeGen/LivePhysRegs.h @@ -31,6 +31,7 @@ #include "llvm/ADT/SparseSet.h" #include "llvm/CodeGen/MachineBasicBlock.h" +#include "llvm/CodeGen/MachineFunction.h" #include "llvm/CodeGen/TargetRegisterInfo.h" #include "llvm/MC/MCRegister.h" #include "llvm/MC/MCRegisterInfo.h" @@ -193,11 +194,31 @@ void addLiveIns(MachineBasicBlock &MBB, const LivePhysRegs &LiveRegs); void computeAndAddLiveIns(LivePhysRegs &LiveRegs, MachineBasicBlock &MBB); -/// Convenience function for recomputing live-in's for \p MBB. -static inline void recomputeLiveIns(MachineBasicBlock &MBB) { +/// Function to update the live-in's for a basic block and return whether any +/// changes were made. +static inline bool updateBlockLiveInfo(MachineBasicBlock &MBB) { LivePhysRegs LPR; + auto oldLiveIns = MBB.getLiveIns(); + MBB.clearLiveIns(); computeAndAddLiveIns(LPR, MBB); + MBB.sortUniqueLiveIns(); + + auto newLiveIns = MBB.getLiveIns(); + return oldLiveIns != newLiveIns; +} + +/// Convenience function for recomputing live-in's for the entire CFG until +/// convergence is reached. +static inline void recomputeLiveIns(MachineFunction &MF) { + bool anyChanged; + do { +anyChanged = false; +for (auto MFI = MF.rbegin(), MFE = MF.rend(); MFI != MFE; ++MFI) { + MachineBasicBlock &MBB = *MFI; + anyChanged |= updateBlockLive
[llvm-branch-commits] [llvm] Refactor recomputeLiveIns to operate on whole CFG (#79498) (PR #79641)
llvmbot wrote: @llvm/pr-subscribers-backend-aarch64 Author: Oskar Wirga (oskarwirga) Changes I would like to backport these changes to the 18.x branch because that is where stack-clash-protection was first introduced and this patch fixes a subtle register allocation bug that occurs due to incorrect ordering of recomputeLiveIns. Currently, the way that recomputeLiveIns works is that it will recompute the livein registers for that MachineBasicBlock but it matters what order you call recomputeLiveIn which can result in incorrect register allocations down the line. This PR fixes that by simply recomputing the liveins for the entire CFG until convergence is achieved. This makes it harder to introduce subtle bugs which alter liveness. (cherry picked from commit 59bf60519fc30d9d36c86abd83093b068f6b1e4b) --- Patch is 297.53 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79641.diff 30 Files Affected: - (modified) llvm/include/llvm/CodeGen/LivePhysRegs.h (+23-2) - (modified) llvm/include/llvm/CodeGen/MachineBasicBlock.h (+6) - (modified) llvm/lib/CodeGen/BranchFolding.cpp (+1-2) - (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.cpp (+1-2) - (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+1-3) - (modified) llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp (+1-6) - (modified) llvm/lib/Target/PowerPC/PPCExpandAtomicPseudoInsts.cpp (+2-5) - (modified) llvm/lib/Target/PowerPC/PPCFrameLowering.cpp (+2-4) - (modified) llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp (+2-4) - (modified) llvm/lib/Target/X86/X86FrameLowering.cpp (+2-6) - (modified) llvm/test/CodeGen/AArch64/stack-probing-last-in-block.mir (+1-1) - (modified) llvm/test/CodeGen/SystemZ/branch-folder-hoist-livein.mir (+25-11) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-default.mir (+102-90) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-optsize-strd-lr.mir (+89-78) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/biquad-cascade-optsize.mir (+96-87) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/it-block-mov.mir (+87-74) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-copy-chain.mir (+138-120) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-copy-prev-iteration.mir (+155-130) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/loop-dec-liveout.mir (+154-129) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/matrix-debug.mir (+80-70) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/matrix.mir (+173-144) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/mov-after-dlstp.mir (+58-56) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/multi-block-cond-iter-count.mir (+132-112) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/multiple-do-loops.mir (+233-193) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/predicated-liveout.mir (+38-29) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/remove-elem-moves.mir (+97-79) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/skip-debug.mir (+65-56) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/skip-vpt-debug.mir (+71-68) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/spillingmove.mir (+7-7) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/unrolled-and-vector.mir (+154-130) ``diff diff --git a/llvm/include/llvm/CodeGen/LivePhysRegs.h b/llvm/include/llvm/CodeGen/LivePhysRegs.h index 76bb34d270a26dc..1e0ee9eb9eb3203 100644 --- a/llvm/include/llvm/CodeGen/LivePhysRegs.h +++ b/llvm/include/llvm/CodeGen/LivePhysRegs.h @@ -31,6 +31,7 @@ #include "llvm/ADT/SparseSet.h" #include "llvm/CodeGen/MachineBasicBlock.h" +#include "llvm/CodeGen/MachineFunction.h" #include "llvm/CodeGen/TargetRegisterInfo.h" #include "llvm/MC/MCRegister.h" #include "llvm/MC/MCRegisterInfo.h" @@ -193,11 +194,31 @@ void addLiveIns(MachineBasicBlock &MBB, const LivePhysRegs &LiveRegs); void computeAndAddLiveIns(LivePhysRegs &LiveRegs, MachineBasicBlock &MBB); -/// Convenience function for recomputing live-in's for \p MBB. -static inline void recomputeLiveIns(MachineBasicBlock &MBB) { +/// Function to update the live-in's for a basic block and return whether any +/// changes were made. +static inline bool updateBlockLiveInfo(MachineBasicBlock &MBB) { LivePhysRegs LPR; + auto oldLiveIns = MBB.getLiveIns(); + MBB.clearLiveIns(); computeAndAddLiveIns(LPR, MBB); + MBB.sortUniqueLiveIns(); + + auto newLiveIns = MBB.getLiveIns(); + return oldLiveIns != newLiveIns; +} + +/// Convenience function for recomputing live-in's for the entire CFG until +/// convergence is reached. +static inline void recomputeLiveIns(MachineFunction &MF) { + bool anyChanged; + do { +anyChanged = false; +for (auto MFI = MF.rbegin(), MFE = MF.rend(); MFI != MFE; ++MFI) { + MachineBasicBlock &MBB = *MFI; + anyChanged |= updateBlock
[llvm-branch-commits] [llvm] Refactor recomputeLiveIns to operate on whole CFG (#79498) (PR #79641)
https://github.com/oskarwirga edited https://github.com/llvm/llvm-project/pull/79641 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Refactor recomputeLiveIns to operate on whole CFG (#79498) (PR #79641)
https://github.com/oskarwirga milestoned https://github.com/llvm/llvm-project/pull/79641 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [lld] [RISCV] Support RISC-V TLSDESC in LLD (PR #77516)
https://github.com/ilovepi closed https://github.com/llvm/llvm-project/pull/77516 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [lld] [RISCV] Support RISC-V TLSDESC in LLD (PR #77516)
ilovepi wrote: Abandon in favor of https://github.com/llvm/llvm-project/pull/79239 https://github.com/llvm/llvm-project/pull/77516 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79632 (PR #79633)
llvmbot wrote: @llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-driver Author: None (github-actions[bot]) Changes resolves llvm/llvm-project#79632 --- Full diff: https://github.com/llvm/llvm-project/pull/79633.diff 9 Files Affected: - (modified) clang/include/clang/Basic/CodeGenOptions.def (+3) - (modified) clang/include/clang/Driver/Options.td (+5) - (modified) clang/lib/CodeGen/BackendUtil.cpp (+1) - (modified) clang/lib/Driver/ToolChains/Clang.cpp (+3) - (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (+30) - (modified) clang/lib/Driver/ToolChains/CommonArgs.h (+3) - (added) clang/test/CodeGen/RISCV/tls-dialect.c (+14) - (added) clang/test/Driver/tls-dialect.c (+25) - (modified) llvm/include/llvm/TargetParser/Triple.h (+3-3) ``diff diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index 2f2e45d5cf63dfa..7c0bfe328496147 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -369,6 +369,9 @@ ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, llvm::driver::VectorLibr /// The default TLS model to use. ENUM_CODEGENOPT(DefaultTLSModel, TLSModel, 2, GeneralDynamicTLSModel) +/// Whether to enable TLSDESC. AArch64 enables TLSDESC regardless of this value. +CODEGENOPT(EnableTLSDESC, 1, 0) + /// Bit size of immediate TLS offsets (0 == use the default). VALUE_CODEGENOPT(TLSSize, 8, 0) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 7f4fa33748facaf..773bc1dcda01d5c 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -4419,6 +4419,8 @@ def mtls_size_EQ : Joined<["-"], "mtls-size=">, Group, HelpText<"Specify bit size of immediate TLS offsets (AArch64 ELF only): " "12 (for 4KB) | 24 (for 16MB, default) | 32 (for 4GB) | 48 (for 256TB, needs -mcmodel=large)">, MarshallingInfoInt>; +def mtls_dialect_EQ : Joined<["-"], "mtls-dialect=">, Group, + Flags<[TargetSpecific]>, HelpText<"Which thread-local storage dialect to use for dynamic accesses of TLS variables">; def mimplicit_it_EQ : Joined<["-"], "mimplicit-it=">, Group; def mdefault_build_attributes : Joined<["-"], "mdefault-build-attributes">, Group; def mno_default_build_attributes : Joined<["-"], "mno-default-build-attributes">, Group; @@ -7066,6 +7068,9 @@ def fexperimental_assignment_tracking_EQ : Joined<["-"], "fexperimental-assignme Values<"disabled,enabled,forced">, NormalizedValues<["Disabled","Enabled","Forced"]>, MarshallingInfoEnum, "Enabled">; +def enable_tlsdesc : Flag<["-"], "enable-tlsdesc">, + MarshallingInfoFlag>; + } // let Visibility = [CC1Option] //===--===// diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index ec203f6f28bc173..7877e20d77f7724 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -401,6 +401,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags, Options.UniqueBasicBlockSectionNames = CodeGenOpts.UniqueBasicBlockSectionNames; Options.TLSSize = CodeGenOpts.TLSSize; + Options.EnableTLSDESC = CodeGenOpts.EnableTLSDESC; Options.EmulatedTLS = CodeGenOpts.EmulatedTLS; Options.DebuggerTuning = CodeGenOpts.getDebuggerTuning(); Options.EmitStackSizeSection = CodeGenOpts.StackSizeSection; diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 5dc614e11aab599..8092fc050b0ee6d 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -5822,6 +5822,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, Args.AddLastArg(CmdArgs, options::OPT_mtls_size_EQ); } + if (isTLSDESCEnabled(TC, Args)) +CmdArgs.push_back("-enable-tlsdesc"); + // Add the target cpu std::string CPU = getCPUName(D, Args, Triple, /*FromAs*/ false); if (!CPU.empty()) { diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp index fadaf3e60c6616a..ff4047298d70d52 100644 --- a/clang/lib/Driver/ToolChains/CommonArgs.cpp +++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp @@ -729,6 +729,33 @@ bool tools::isUseSeparateSections(const llvm::Triple &Triple) { return Triple.isPS(); } +bool tools::isTLSDESCEnabled(const ToolChain &TC, + const llvm::opt::ArgList &Args) { + const llvm::Triple &Triple = TC.getEffectiveTriple(); + Arg *A = Args.getLastArg(options::OPT_mtls_dialect_EQ); + if (!A) +return Triple.hasDefaultTLSDESC(); + StringRef V = A->getValue(); + bool SupportedArgument = false, EnableTLSDESC = false; + bool Unsupported = !Triple.isOSBinFormatELF(); + if (Triple.isRISCV()) { +SupportedArgument = V == "desc" || V == "trad"; +EnableTLSDESC = V == "desc"; + } else
[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79632 (PR #79633)
https://github.com/llvmbot updated https://github.com/llvm/llvm-project/pull/79633 >From b373a525347ab112ae4b73e747529b48b898d394 Mon Sep 17 00:00:00 2001 From: Fangrui Song Date: Fri, 26 Jan 2024 09:25:38 -0800 Subject: [PATCH] [Driver,CodeGen] Support -mtls-dialect= (#79256) GCC supports -mtls-dialect= for several architectures to select TLSDESC. This patch supports the following values * x86: "gnu". "gnu2" (TLSDESC) is not supported yet. * RISC-V: "trad" (general dynamic), "desc" (TLSDESC, see #66915) AArch64 toolchains seem to support TLSDESC from the beginning, and the general dynamic model has poor support. Nobody seems to use the option -mtls-dialect= at all, so we don't bother with it. There also seems very little interest in AArch32's TLSDESC support. TLSDESC does not change IR, but affects object file generation. Without a backend option the option is a no-op for in-process ThinLTO. There seems no motivation to have fine-grained control mixing trad/desc for TLS, so we just pass -mllvm, and don't bother with a modules flag metadata or function attribute. Co-authored-by: Paul Kirth (cherry picked from commit 36b4a9ccd9f7e04010476e6b2a311f2052a4ac20) --- clang/include/clang/Basic/CodeGenOptions.def | 3 ++ clang/include/clang/Driver/Options.td| 5 clang/lib/CodeGen/BackendUtil.cpp| 1 + clang/lib/Driver/ToolChains/Clang.cpp| 3 ++ clang/lib/Driver/ToolChains/CommonArgs.cpp | 30 clang/lib/Driver/ToolChains/CommonArgs.h | 3 ++ clang/test/CodeGen/RISCV/tls-dialect.c | 14 + clang/test/Driver/tls-dialect.c | 25 llvm/include/llvm/TargetParser/Triple.h | 6 ++-- 9 files changed, 87 insertions(+), 3 deletions(-) create mode 100644 clang/test/CodeGen/RISCV/tls-dialect.c create mode 100644 clang/test/Driver/tls-dialect.c diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index 2f2e45d5cf63dfa..7c0bfe328496147 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -369,6 +369,9 @@ ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, llvm::driver::VectorLibr /// The default TLS model to use. ENUM_CODEGENOPT(DefaultTLSModel, TLSModel, 2, GeneralDynamicTLSModel) +/// Whether to enable TLSDESC. AArch64 enables TLSDESC regardless of this value. +CODEGENOPT(EnableTLSDESC, 1, 0) + /// Bit size of immediate TLS offsets (0 == use the default). VALUE_CODEGENOPT(TLSSize, 8, 0) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 7f4fa33748facaf..773bc1dcda01d5c 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -4419,6 +4419,8 @@ def mtls_size_EQ : Joined<["-"], "mtls-size=">, Group, HelpText<"Specify bit size of immediate TLS offsets (AArch64 ELF only): " "12 (for 4KB) | 24 (for 16MB, default) | 32 (for 4GB) | 48 (for 256TB, needs -mcmodel=large)">, MarshallingInfoInt>; +def mtls_dialect_EQ : Joined<["-"], "mtls-dialect=">, Group, + Flags<[TargetSpecific]>, HelpText<"Which thread-local storage dialect to use for dynamic accesses of TLS variables">; def mimplicit_it_EQ : Joined<["-"], "mimplicit-it=">, Group; def mdefault_build_attributes : Joined<["-"], "mdefault-build-attributes">, Group; def mno_default_build_attributes : Joined<["-"], "mno-default-build-attributes">, Group; @@ -7066,6 +7068,9 @@ def fexperimental_assignment_tracking_EQ : Joined<["-"], "fexperimental-assignme Values<"disabled,enabled,forced">, NormalizedValues<["Disabled","Enabled","Forced"]>, MarshallingInfoEnum, "Enabled">; +def enable_tlsdesc : Flag<["-"], "enable-tlsdesc">, + MarshallingInfoFlag>; + } // let Visibility = [CC1Option] //===--===// diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index ec203f6f28bc173..7877e20d77f7724 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -401,6 +401,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags, Options.UniqueBasicBlockSectionNames = CodeGenOpts.UniqueBasicBlockSectionNames; Options.TLSSize = CodeGenOpts.TLSSize; + Options.EnableTLSDESC = CodeGenOpts.EnableTLSDESC; Options.EmulatedTLS = CodeGenOpts.EmulatedTLS; Options.DebuggerTuning = CodeGenOpts.getDebuggerTuning(); Options.EmitStackSizeSection = CodeGenOpts.StackSizeSection; diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 5dc614e11aab599..8092fc050b0ee6d 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -5822,6 +5822,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, Args.AddLastArg(CmdArgs, options::OPT_mtls_size_EQ); } + if (isTLSDESCEnabled(TC, Args)) +CmdAr
[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79632 (PR #79633)
https://github.com/ilovepi approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/79633 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79632 (PR #79633)
github-actions[bot] wrote: @ilovepi What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/79633 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79632 (PR #79633)
https://github.com/github-actions[bot] milestoned https://github.com/llvm/llvm-project/pull/79633 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79632 (PR #79633)
https://github.com/github-actions[bot] created https://github.com/llvm/llvm-project/pull/79633 resolves llvm/llvm-project#79632 >From 59c20ec7892a93d96094df91d9aea0c9c5304566 Mon Sep 17 00:00:00 2001 From: Fangrui Song Date: Fri, 26 Jan 2024 09:25:38 -0800 Subject: [PATCH] [Driver,CodeGen] Support -mtls-dialect= (#79256) GCC supports -mtls-dialect= for several architectures to select TLSDESC. This patch supports the following values * x86: "gnu". "gnu2" (TLSDESC) is not supported yet. * RISC-V: "trad" (general dynamic), "desc" (TLSDESC, see #66915) AArch64 toolchains seem to support TLSDESC from the beginning, and the general dynamic model has poor support. Nobody seems to use the option -mtls-dialect= at all, so we don't bother with it. There also seems very little interest in AArch32's TLSDESC support. TLSDESC does not change IR, but affects object file generation. Without a backend option the option is a no-op for in-process ThinLTO. There seems no motivation to have fine-grained control mixing trad/desc for TLS, so we just pass -mllvm, and don't bother with a modules flag metadata or function attribute. Co-authored-by: Paul Kirth (cherry picked from commit 36b4a9ccd9f7e04010476e6b2a311f2052a4ac20) --- clang/include/clang/Basic/CodeGenOptions.def | 3 ++ clang/include/clang/Driver/Options.td| 5 clang/lib/CodeGen/BackendUtil.cpp| 1 + clang/lib/Driver/ToolChains/Clang.cpp| 3 ++ clang/lib/Driver/ToolChains/CommonArgs.cpp | 30 clang/lib/Driver/ToolChains/CommonArgs.h | 3 ++ clang/test/CodeGen/RISCV/tls-dialect.c | 14 + clang/test/Driver/tls-dialect.c | 25 llvm/include/llvm/TargetParser/Triple.h | 6 ++-- 9 files changed, 87 insertions(+), 3 deletions(-) create mode 100644 clang/test/CodeGen/RISCV/tls-dialect.c create mode 100644 clang/test/Driver/tls-dialect.c diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index 2f2e45d5cf63df..7c0bfe32849614 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -369,6 +369,9 @@ ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, llvm::driver::VectorLibr /// The default TLS model to use. ENUM_CODEGENOPT(DefaultTLSModel, TLSModel, 2, GeneralDynamicTLSModel) +/// Whether to enable TLSDESC. AArch64 enables TLSDESC regardless of this value. +CODEGENOPT(EnableTLSDESC, 1, 0) + /// Bit size of immediate TLS offsets (0 == use the default). VALUE_CODEGENOPT(TLSSize, 8, 0) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 7f4fa33748faca..773bc1dcda01d5 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -4419,6 +4419,8 @@ def mtls_size_EQ : Joined<["-"], "mtls-size=">, Group, HelpText<"Specify bit size of immediate TLS offsets (AArch64 ELF only): " "12 (for 4KB) | 24 (for 16MB, default) | 32 (for 4GB) | 48 (for 256TB, needs -mcmodel=large)">, MarshallingInfoInt>; +def mtls_dialect_EQ : Joined<["-"], "mtls-dialect=">, Group, + Flags<[TargetSpecific]>, HelpText<"Which thread-local storage dialect to use for dynamic accesses of TLS variables">; def mimplicit_it_EQ : Joined<["-"], "mimplicit-it=">, Group; def mdefault_build_attributes : Joined<["-"], "mdefault-build-attributes">, Group; def mno_default_build_attributes : Joined<["-"], "mno-default-build-attributes">, Group; @@ -7066,6 +7068,9 @@ def fexperimental_assignment_tracking_EQ : Joined<["-"], "fexperimental-assignme Values<"disabled,enabled,forced">, NormalizedValues<["Disabled","Enabled","Forced"]>, MarshallingInfoEnum, "Enabled">; +def enable_tlsdesc : Flag<["-"], "enable-tlsdesc">, + MarshallingInfoFlag>; + } // let Visibility = [CC1Option] //===--===// diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp index ec203f6f28bc17..7877e20d77f772 100644 --- a/clang/lib/CodeGen/BackendUtil.cpp +++ b/clang/lib/CodeGen/BackendUtil.cpp @@ -401,6 +401,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags, Options.UniqueBasicBlockSectionNames = CodeGenOpts.UniqueBasicBlockSectionNames; Options.TLSSize = CodeGenOpts.TLSSize; + Options.EnableTLSDESC = CodeGenOpts.EnableTLSDESC; Options.EmulatedTLS = CodeGenOpts.EmulatedTLS; Options.DebuggerTuning = CodeGenOpts.getDebuggerTuning(); Options.EmitStackSizeSection = CodeGenOpts.StackSizeSection; diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 5dc614e11aab59..8092fc050b0ee6 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -5822,6 +5822,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, Args.AddLastArg(CmdArgs, options::OPT_mtls_size_EQ); } + if (
[llvm-branch-commits] [openmp] [clang] [lldb] [mlir] [libcxx] [clang-tools-extra] [libc] [flang] [llvm] [BOLT] Write and parse BF/BB hashes in BAT (PR #76907)
https://github.com/aaupov converted_to_draft https://github.com/llvm/llvm-project/pull/76907 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [clang] [lldb] [mlir] [llvm] [libc] [flang] [clang-tools-extra] [openmp] [BOLT] Write and parse BF/BB hashes in BAT (PR #76907)
@@ -36,9 +37,12 @@ void BoltAddressTranslation::writeEntriesForBB(MapTy &Map, if (BBInputOffset == BinaryBasicBlock::INVALID_OFFSET) return; - LLVM_DEBUG(dbgs() << "BB " << BB.getName() << "\n"); - LLVM_DEBUG(dbgs() << " Key: " << Twine::utohexstr(BBOutputOffset) -<< " Val: " << Twine::utohexstr(BBInputOffset) << "\n"); + LLVM_DEBUG(dbgs() << "BB " << BB.getName() << "\n" aaupov wrote: Not in this diff. Hashes will be output to YAML in follow up diffs starting with https://github.com/llvm/llvm-project/pull/76910. https://github.com/llvm/llvm-project/pull/76907 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [clang] Revert "[SemaCXX] Implement CWG2137 (list-initialization from objects of the same type) (#77768)" in release/18.x (PR #79400)
https://github.com/nikic milestoned https://github.com/llvm/llvm-project/pull/79400 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79420 (PR #79595)
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/79595 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)
Dinistro wrote: Here we have to wait for the build bots, as they are mandatory. https://github.com/llvm/llvm-project/pull/79603 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)
andrey-golubev wrote: LGTM. But surely we need someone with commit access (sigh). CC @Dinistro @zero9178 https://github.com/llvm/llvm-project/pull/79603 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)
https://github.com/github-actions[bot] milestoned https://github.com/llvm/llvm-project/pull/79603 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)
github-actions[bot] wrote: @andrey-golubev What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/79603 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)
https://github.com/github-actions[bot] created https://github.com/llvm/llvm-project/pull/79603 resolves llvm/llvm-project#79600 >From 7732d51a75a991f3e49d22e2ea6478c40025944d Mon Sep 17 00:00:00 2001 From: Andrei Golubev Date: Fri, 26 Jan 2024 15:27:51 +0200 Subject: [PATCH] [mlir][LLVM] Use int32_t to indirectly construct GEPArg (#79562) GEPArg can only be constructed from int32_t and mlir::Value. Explicitly cast other types (e.g. unsigned, size_t) to int32_t to avoid narrowing conversion warnings on MSVC. Some recent examples of such are: ``` mlir\lib\Dialect\LLVMIR\Transforms\TypeConsistency.cpp: error C2398: Element '1': conversion from 'size_t' to 'T' requires a narrowing conversion with [ T=mlir::LLVM::GEPArg ] mlir\lib\Dialect\LLVMIR\Transforms\TypeConsistency.cpp: error C2398: Element '1': conversion from 'unsigned int' to 'T' requires a narrowing conversion with [ T=mlir::LLVM::GEPArg ] ``` Co-authored-by: Nikita Kudriavtsev (cherry picked from commit 89cd345667a5f8f4c37c621fd8abe8d84e85c050) --- mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp | 3 ++- mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp | 9 + mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp | 9 + 3 files changed, 12 insertions(+), 9 deletions(-) diff --git a/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp b/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp index ae2bd8e5b5405d9..73d418cb8413276 100644 --- a/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp +++ b/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp @@ -529,7 +529,8 @@ LogicalResult GPUPrintfOpToVPrintfLowering::matchAndRewrite( /*alignment=*/0); for (auto [index, arg] : llvm::enumerate(args)) { Value ptr = rewriter.create( -loc, ptrType, structType, tempAlloc, ArrayRef{0, index}); +loc, ptrType, structType, tempAlloc, +ArrayRef{0, static_cast(index)}); rewriter.create(loc, arg, ptr); } std::array printfArgs = {stringStart, tempAlloc}; diff --git a/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp b/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp index f853d5c47b623cf..78d4e8062468720 100644 --- a/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp +++ b/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp @@ -1041,13 +1041,14 @@ Value ConvertLaunchFuncOpToGpuRuntimeCallPattern::generateParamsArray( auto arrayPtr = builder.create( loc, llvmPointerType, llvmPointerType, arraySize, /*alignment=*/0); for (const auto &en : llvm::enumerate(arguments)) { +const auto index = static_cast(en.index()); Value fieldPtr = builder.create(loc, llvmPointerType, structType, structPtr, -ArrayRef{0, en.index()}); +ArrayRef{0, index}); builder.create(loc, en.value(), fieldPtr); -auto elementPtr = builder.create( -loc, llvmPointerType, llvmPointerType, arrayPtr, -ArrayRef{en.index()}); +auto elementPtr = +builder.create(loc, llvmPointerType, llvmPointerType, +arrayPtr, ArrayRef{index}); builder.create(loc, fieldPtr, elementPtr); } return arrayPtr; diff --git a/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp b/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp index 72f9295749a66ba..b25c831bc7172a3 100644 --- a/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp +++ b/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp @@ -488,7 +488,8 @@ static void splitVectorStore(const DataLayout &dataLayout, Location loc, // Other patterns will turn this into a type-consistent GEP. auto gepOp = rewriter.create( loc, address.getType(), rewriter.getI8Type(), address, -ArrayRef{storeOffset + index * elementSize}); +ArrayRef{ +static_cast(storeOffset + index * elementSize)}); rewriter.create(loc, extractOp, gepOp); } @@ -524,9 +525,9 @@ static void splitIntegerStore(const DataLayout &dataLayout, Location loc, // We create an `i8` indexed GEP here as that is the easiest (offset is // already known). Other patterns turn this into a type-consistent GEP. -auto gepOp = -rewriter.create(loc, address.getType(), rewriter.getI8Type(), - address, ArrayRef{currentOffset}); +auto gepOp = rewriter.create( +loc, address.getType(), rewriter.getI8Type(), address, +ArrayRef{static_cast(currentOffset)}); rewriter.create(loc, valueToStore, gepOp); // No need to care about padding here since we already checked previously ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [libcxx] Revert "[SemaCXX] Implement CWG2137 (list-initialization from objects of the same type) (#77768)" in release/18.x (PR #79400)
https://github.com/erichkeane approved this pull request. https://github.com/llvm/llvm-project/pull/79400 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [libcxx] Revert "[SemaCXX] Implement CWG2137 (list-initialization from objects of the same type) (#77768)" in release/18.x (PR #79400)
erichkeane wrote: Author seems to have disappeared, so approving. https://github.com/llvm/llvm-project/pull/79400 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] PR for llvm/llvm-project#79479 (PR #79596)
llvmbot wrote: @llvm/pr-subscribers-clang Author: Nikita Popov (nikic) Changes Resolves #79479. --- Patch is 91.93 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79596.diff 20 Files Affected: - (modified) clang/docs/ReleaseNotes.rst (+2) - (modified) clang/include/clang/AST/Type.h (+3) - (modified) clang/include/clang/Basic/AttrDocs.td (+4-1) - (modified) clang/lib/AST/ASTContext.cpp (+16-4) - (modified) clang/lib/AST/ItaniumMangle.cpp (+17-8) - (modified) clang/lib/AST/JSONNodeDumper.cpp (+3) - (modified) clang/lib/AST/TextNodeDumper.cpp (+3) - (modified) clang/lib/AST/Type.cpp (+14-1) - (modified) clang/lib/AST/TypePrinter.cpp (+2) - (modified) clang/lib/CodeGen/Targets/RISCV.cpp (+15-6) - (modified) clang/lib/Sema/SemaExpr.cpp (+4-2) - (modified) clang/lib/Sema/SemaType.cpp (+15-6) - (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-bitcast.c (+100) - (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-call.c (+74) - (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-cast.c (+72-4) - (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-codegen.c (+172) - (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-globals.c (+107) - (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-types.c (+284) - (modified) clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp (+72) - (modified) clang/test/Sema/attr-riscv-rvv-vector-bits.c (+86-2) ``diff diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 060bc7669b72a5e..45d1ab34d0f9311 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -1227,6 +1227,8 @@ RISC-V Support - Default ABI with F but without D was changed to ilp32f for RV32 and to lp64f for RV64. +- ``__attribute__((rvv_vector_bits(N))) is now supported for RVV vbool*_t types. + CUDA/HIP Language Changes ^ diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h index ea425791fc97f05..6384cf9420b82e1 100644 --- a/clang/include/clang/AST/Type.h +++ b/clang/include/clang/AST/Type.h @@ -3495,6 +3495,9 @@ enum class VectorKind { /// is RISC-V RVV fixed-length data vector RVVFixedLengthData, + + /// is RISC-V RVV fixed-length mask vector + RVVFixedLengthMask, }; /// Represents a GCC generic vector type. This type is created using diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td index 7e633f8e2635a9a..e02a1201e2ad79a 100644 --- a/clang/include/clang/Basic/AttrDocs.td +++ b/clang/include/clang/Basic/AttrDocs.td @@ -2424,7 +2424,10 @@ only be a power of 2 between 64 and 65536. For types where LMUL!=1, ``__riscv_v_fixed_vlen`` needs to be scaled by the LMUL of the type before passing to the attribute. -``vbool*_t`` types are not supported at this time. +For ``vbool*_t`` types, ``__riscv_v_fixed_vlen`` needs to be divided by the +number from the type name. For example, ``vbool8_t`` needs to use +``__riscv_v_fixed_vlen`` / 8. If the resulting value is not a multiple of 8, +the type is not supported for that value of ``__riscv_v_fixed_vlen``. }]; } diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp index 5eb7aa3664569dd..ab16ca10395fa83 100644 --- a/clang/lib/AST/ASTContext.cpp +++ b/clang/lib/AST/ASTContext.cpp @@ -1945,7 +1945,8 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const { else if (VT->getVectorKind() == VectorKind::SveFixedLengthPredicate) // Adjust the alignment for fixed-length SVE predicates. Align = 16; -else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData) +else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData || + VT->getVectorKind() == VectorKind::RVVFixedLengthMask) // Adjust the alignment for fixed-length RVV vectors. Align = std::min(64, Width); break; @@ -9416,7 +9417,9 @@ bool ASTContext::areCompatibleVectorTypes(QualType FirstVec, Second->getVectorKind() != VectorKind::SveFixedLengthData && Second->getVectorKind() != VectorKind::SveFixedLengthPredicate && First->getVectorKind() != VectorKind::RVVFixedLengthData && - Second->getVectorKind() != VectorKind::RVVFixedLengthData) + Second->getVectorKind() != VectorKind::RVVFixedLengthData && + First->getVectorKind() != VectorKind::RVVFixedLengthMask && + Second->getVectorKind() != VectorKind::RVVFixedLengthMask) return true; return false; @@ -9522,8 +9525,11 @@ static uint64_t getRVVTypeSize(ASTContext &Context, const BuiltinType *Ty) { ASTContext::BuiltinVectorTypeInfo Info = Context.getBuiltinVectorTypeInfo(Ty); - uint64_t EltSize = Context.getTypeSize(Info.ElementType); - uint64_t MinElts = Info.EC.getKnownMinValue(); + unsigned EltSize = Context.getTypeSize(Info.ElementType); + if (Info.ElementType == Context.BoolTy) +EltSize = 1; + + unsigned MinElts = Info.EC.getKnownMinValue(); r
[llvm-branch-commits] [clang] PR for llvm/llvm-project#79479 (PR #79596)
https://github.com/nikic created https://github.com/llvm/llvm-project/pull/79596 Resolves #79479. >From cc135ed1df0d7573b0474d76e1d47236b30cdf36 Mon Sep 17 00:00:00 2001 From: Craig Topper Date: Thu, 25 Jan 2024 09:39:29 -0800 Subject: [PATCH] Recommit "[RISCV] Support __riscv_v_fixed_vlen for vbool types. (#76551)" Test updated to expect i8 gep. Original message: This adopts a similar behavior to AArch64 SVE, where bool vectors are represented as a vector of chars with 1/8 the number of elements. This ensures the vector always occupies a power of 2 number of bytes. A consequence of this is that vbool64_t, vbool32_t, and vool16_t can only be used with a vector length that guarantees at least 8 bits. (cherry picked from commit c92ad411f2f94d8521cd18abcb37285f9a390ecb) --- clang/docs/ReleaseNotes.rst | 2 + clang/include/clang/AST/Type.h| 3 + clang/include/clang/Basic/AttrDocs.td | 5 +- clang/lib/AST/ASTContext.cpp | 20 +- clang/lib/AST/ItaniumMangle.cpp | 25 +- clang/lib/AST/JSONNodeDumper.cpp | 3 + clang/lib/AST/TextNodeDumper.cpp | 3 + clang/lib/AST/Type.cpp| 15 +- clang/lib/AST/TypePrinter.cpp | 2 + clang/lib/CodeGen/Targets/RISCV.cpp | 21 +- clang/lib/Sema/SemaExpr.cpp | 6 +- clang/lib/Sema/SemaType.cpp | 21 +- .../attr-riscv-rvv-vector-bits-bitcast.c | 100 ++ .../CodeGen/attr-riscv-rvv-vector-bits-call.c | 74 + .../CodeGen/attr-riscv-rvv-vector-bits-cast.c | 76 - .../attr-riscv-rvv-vector-bits-codegen.c | 172 +++ .../attr-riscv-rvv-vector-bits-globals.c | 107 +++ .../attr-riscv-rvv-vector-bits-types.c| 284 ++ .../riscv-mangle-rvv-fixed-vectors.cpp| 72 + clang/test/Sema/attr-riscv-rvv-vector-bits.c | 88 +- 20 files changed, 1065 insertions(+), 34 deletions(-) diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 060bc7669b72a5e..45d1ab34d0f9311 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -1227,6 +1227,8 @@ RISC-V Support - Default ABI with F but without D was changed to ilp32f for RV32 and to lp64f for RV64. +- ``__attribute__((rvv_vector_bits(N))) is now supported for RVV vbool*_t types. + CUDA/HIP Language Changes ^ diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h index ea425791fc97f05..6384cf9420b82e1 100644 --- a/clang/include/clang/AST/Type.h +++ b/clang/include/clang/AST/Type.h @@ -3495,6 +3495,9 @@ enum class VectorKind { /// is RISC-V RVV fixed-length data vector RVVFixedLengthData, + + /// is RISC-V RVV fixed-length mask vector + RVVFixedLengthMask, }; /// Represents a GCC generic vector type. This type is created using diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td index 7e633f8e2635a9a..e02a1201e2ad79a 100644 --- a/clang/include/clang/Basic/AttrDocs.td +++ b/clang/include/clang/Basic/AttrDocs.td @@ -2424,7 +2424,10 @@ only be a power of 2 between 64 and 65536. For types where LMUL!=1, ``__riscv_v_fixed_vlen`` needs to be scaled by the LMUL of the type before passing to the attribute. -``vbool*_t`` types are not supported at this time. +For ``vbool*_t`` types, ``__riscv_v_fixed_vlen`` needs to be divided by the +number from the type name. For example, ``vbool8_t`` needs to use +``__riscv_v_fixed_vlen`` / 8. If the resulting value is not a multiple of 8, +the type is not supported for that value of ``__riscv_v_fixed_vlen``. }]; } diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp index 5eb7aa3664569dd..ab16ca10395fa83 100644 --- a/clang/lib/AST/ASTContext.cpp +++ b/clang/lib/AST/ASTContext.cpp @@ -1945,7 +1945,8 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const { else if (VT->getVectorKind() == VectorKind::SveFixedLengthPredicate) // Adjust the alignment for fixed-length SVE predicates. Align = 16; -else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData) +else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData || + VT->getVectorKind() == VectorKind::RVVFixedLengthMask) // Adjust the alignment for fixed-length RVV vectors. Align = std::min(64, Width); break; @@ -9416,7 +9417,9 @@ bool ASTContext::areCompatibleVectorTypes(QualType FirstVec, Second->getVectorKind() != VectorKind::SveFixedLengthData && Second->getVectorKind() != VectorKind::SveFixedLengthPredicate && First->getVectorKind() != VectorKind::RVVFixedLengthData && - Second->getVectorKind() != VectorKind::RVVFixedLengthData) + Second->getVectorKind() != VectorKind::RVVFixedLengthData && + First->getVectorKind() != VectorKind::RVVFixedLengthMask && + Second->getVectorKind() != VectorKind::RVVFix
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79420 (PR #79595)
llvmbot wrote: @llvm/pr-subscribers-llvm-selectiondag Author: Nikita Popov (nikic) Changes Resolves #79420. --- Full diff: https://github.com/llvm/llvm-project/pull/79595.diff 2 Files Affected: - (modified) llvm/test/TableGen/address-space-patfrags.td (+2-2) - (modified) llvm/utils/TableGen/DAGISelMatcherEmitter.cpp (+2-1) ``diff diff --git a/llvm/test/TableGen/address-space-patfrags.td b/llvm/test/TableGen/address-space-patfrags.td index 4aec6ea7e0eae86..46050a70720fbe1 100644 --- a/llvm/test/TableGen/address-space-patfrags.td +++ b/llvm/test/TableGen/address-space-patfrags.td @@ -46,7 +46,7 @@ def inst_d : Instruction { let InOperandList = (ins GPR32:$src0, GPR32:$src1); } -// SDAG: case 1: { +// SDAG: case 0: { // SDAG-NEXT: // Predicate_pat_frag_b // SDAG-NEXT: // Predicate_truncstorei16_addrspace // SDAG-NEXT: SDNode *N = Node; @@ -69,7 +69,7 @@ def : Pat < >; -// SDAG: case 6: { +// SDAG: case 4: { // SDAG: // Predicate_pat_frag_a // SDAG-NEXT: SDNode *N = Node; // SDAG-NEXT: (void)N; diff --git a/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp b/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp index 455183987b7b27b..50156d34528c153 100644 --- a/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp +++ b/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp @@ -57,7 +57,8 @@ class MatcherTableEmitter { // We de-duplicate the predicates by code string, and use this map to track // all the patterns with "identical" predicates. - StringMap> NodePredicatesByCodeToRun; + MapVector, StringMap> + NodePredicatesByCodeToRun; std::vector PatternPredicates; `` https://github.com/llvm/llvm-project/pull/79595 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79420 (PR #79595)
https://github.com/nikic created https://github.com/llvm/llvm-project/pull/79595 Resolves #79420. >From fb66a8484904a1e585c0e54553c1c8b5e5d13dd2 Mon Sep 17 00:00:00 2001 From: Wang Pengcheng Date: Thu, 25 Jan 2024 16:16:19 +0800 Subject: [PATCH] [TableGen] Use MapVector to remove non-determinism This fixes found non-determinism when `LLVM_REVERSE_ITERATION` option is `ON`. Fixes #79420. Reviewers: ilovepi, MaskRay Reviewed By: MaskRay Pull Request: https://github.com/llvm/llvm-project/pull/79411 (cherry picked from commit 41fe98a6e7e5cdcab4a4e9e0d09339231f480c01) --- llvm/test/TableGen/address-space-patfrags.td | 4 ++-- llvm/utils/TableGen/DAGISelMatcherEmitter.cpp | 3 ++- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/llvm/test/TableGen/address-space-patfrags.td b/llvm/test/TableGen/address-space-patfrags.td index 4aec6ea7e0eae86..46050a70720fbe1 100644 --- a/llvm/test/TableGen/address-space-patfrags.td +++ b/llvm/test/TableGen/address-space-patfrags.td @@ -46,7 +46,7 @@ def inst_d : Instruction { let InOperandList = (ins GPR32:$src0, GPR32:$src1); } -// SDAG: case 1: { +// SDAG: case 0: { // SDAG-NEXT: // Predicate_pat_frag_b // SDAG-NEXT: // Predicate_truncstorei16_addrspace // SDAG-NEXT: SDNode *N = Node; @@ -69,7 +69,7 @@ def : Pat < >; -// SDAG: case 6: { +// SDAG: case 4: { // SDAG: // Predicate_pat_frag_a // SDAG-NEXT: SDNode *N = Node; // SDAG-NEXT: (void)N; diff --git a/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp b/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp index 455183987b7b27b..50156d34528c153 100644 --- a/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp +++ b/llvm/utils/TableGen/DAGISelMatcherEmitter.cpp @@ -57,7 +57,8 @@ class MatcherTableEmitter { // We de-duplicate the predicates by code string, and use this map to track // all the patterns with "identical" predicates. - StringMap> NodePredicatesByCodeToRun; + MapVector, StringMap> + NodePredicatesByCodeToRun; std::vector PatternPredicates; ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [mlir] [llvm] PR for llvm/llvm-project#79293 (PR #79461)
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/79461 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79571 (PR #79572)
https://github.com/github-actions[bot] created https://github.com/llvm/llvm-project/pull/79572 resolves llvm/llvm-project#79571 >From 14afe95f8f9bdfb5474445e5930e38afe1ba9f60 Mon Sep 17 00:00:00 2001 From: Nikita Popov Date: Wed, 24 Jan 2024 10:15:42 +0100 Subject: [PATCH] [MSSAUpdater] Handle simplified accesses when updating phis (#78272) This is a followup to #76819. After those changes, we can still run into an assertion failure for a slight variation of the test case: When fixing up MemoryPhis, we map the incoming access to the access of the cloned instruction -- which may now no longer exist. Fix this by reusing the getNewDefiningAccessForClone() helper, which will look upwards for a new defining access in that case. (cherry picked from commit a7a1b8b17e264fb0f2d2b4165cf9a7f5094b08b3) --- llvm/lib/Analysis/MemorySSAUpdater.cpp| 22 +--- .../memssa-readnone-access.ll | 104 ++ 2 files changed, 107 insertions(+), 19 deletions(-) diff --git a/llvm/lib/Analysis/MemorySSAUpdater.cpp b/llvm/lib/Analysis/MemorySSAUpdater.cpp index e87ae7d71fffe20..aa550f0b6a7bfd6 100644 --- a/llvm/lib/Analysis/MemorySSAUpdater.cpp +++ b/llvm/lib/Analysis/MemorySSAUpdater.cpp @@ -692,25 +692,9 @@ void MemorySSAUpdater::updateForClonedLoop(const LoopBlocksRPO &LoopBlocks, continue; // Determine incoming value and add it as incoming from IncBB. - if (MemoryUseOrDef *IncMUD = dyn_cast(IncomingAccess)) { -if (!MSSA->isLiveOnEntryDef(IncMUD)) { - Instruction *IncI = IncMUD->getMemoryInst(); - assert(IncI && "Found MemoryUseOrDef with no Instruction."); - if (Instruction *NewIncI = - cast_or_null(VMap.lookup(IncI))) { -IncMUD = MSSA->getMemoryAccess(NewIncI); -assert(IncMUD && - "MemoryUseOrDef cannot be null, all preds processed."); - } -} -NewPhi->addIncoming(IncMUD, IncBB); - } else { -MemoryPhi *IncPhi = cast(IncomingAccess); -if (MemoryAccess *NewDefPhi = MPhiMap.lookup(IncPhi)) - NewPhi->addIncoming(NewDefPhi, IncBB); -else - NewPhi->addIncoming(IncPhi, IncBB); - } + NewPhi->addIncoming( + getNewDefiningAccessForClone(IncomingAccess, VMap, MPhiMap, MSSA), + IncBB); } if (auto *SingleAccess = onlySingleValue(NewPhi)) { MPhiMap[Phi] = SingleAccess; diff --git a/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll b/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll index 2aaf777683e116f..c6e6608d4be383a 100644 --- a/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll +++ b/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll @@ -115,3 +115,107 @@ split: exit: ret void } + +; Variants of the above test with swapped branch destinations. + +define void @test1_swapped(i1 %c) { +; CHECK-LABEL: define void @test1_swapped( +; CHECK-SAME: i1 [[C:%.*]]) { +; CHECK-NEXT: start: +; CHECK-NEXT:[[C_FR:%.*]] = freeze i1 [[C]] +; CHECK-NEXT:br i1 [[C_FR]], label [[START_SPLIT_US:%.*]], label [[START_SPLIT:%.*]] +; CHECK: start.split.us: +; CHECK-NEXT:br label [[LOOP_US:%.*]] +; CHECK: loop.us: +; CHECK-NEXT:call void @foo() +; CHECK-NEXT:br label [[LOOP_US]] +; CHECK: start.split: +; CHECK-NEXT:br label [[LOOP:%.*]] +; CHECK: loop: +; CHECK-NEXT:call void @foo() +; CHECK-NEXT:br label [[EXIT:%.*]] +; CHECK: exit: +; CHECK-NEXT:ret void +; +start: + br label %loop + +loop: + %fn = load ptr, ptr @vtable, align 8 + call void %fn() + br i1 %c, label %loop, label %exit + +exit: + ret void +} + +define void @test2_swapped(i1 %c, ptr %p) { +; CHECK-LABEL: define void @test2_swapped( +; CHECK-SAME: i1 [[C:%.*]], ptr [[P:%.*]]) { +; CHECK-NEXT:[[C_FR:%.*]] = freeze i1 [[C]] +; CHECK-NEXT:br i1 [[C_FR]], label [[DOTSPLIT_US:%.*]], label [[DOTSPLIT:%.*]] +; CHECK: .split.us: +; CHECK-NEXT:br label [[LOOP_US:%.*]] +; CHECK: loop.us: +; CHECK-NEXT:call void @foo() +; CHECK-NEXT:call void @bar() +; CHECK-NEXT:br label [[LOOP_US]] +; CHECK: .split: +; CHECK-NEXT:br label [[LOOP:%.*]] +; CHECK: loop: +; CHECK-NEXT:call void @foo() +; CHECK-NEXT:call void @bar() +; CHECK-NEXT:br label [[EXIT:%.*]] +; CHECK: exit: +; CHECK-NEXT:ret void +; + br label %loop + +loop: + %fn = load ptr, ptr @vtable, align 8 + call void %fn() + call void @bar() + br i1 %c, label %loop, label %exit + +exit: + ret void +} + +define void @test3_swapped(i1 %c, ptr %p) { +; CHECK-LABEL: define void @test3_swapped( +; CHECK-SAME: i1 [[C:%.*]], ptr [[P:%.*]]) { +; CHECK-NEXT:[[C_FR:%.*]] = freeze i1 [[C]] +; CHECK-NEXT:br i1 [[C_FR]], label [[DOTSPLIT_US:%.*]], label [[DOTSPLIT:%.*]] +; CHECK: .split.us: +; CHECK-NEXT:br label [[LOOP_US:%.*]] +; CHECK: loop.us: +; CHECK
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79571 (PR #79572)
github-actions[bot] wrote: @alinas What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/79572 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79571 (PR #79572)
https://github.com/github-actions[bot] milestoned https://github.com/llvm/llvm-project/pull/79572 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [mlir] PR for llvm/llvm-project#79293 (PR #79461)
llvmbot wrote: @llvm/pr-subscribers-mlir-llvm Author: None (github-actions[bot]) Changes resolves llvm/llvm-project#79293 --- Patch is 1006.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79461.diff 65 Files Affected: - (modified) clang/include/clang/Basic/BuiltinsAMDGPU.def (+62) - (modified) clang/lib/CodeGen/CGBuiltin.cpp (+164-13) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl (+156) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w64.cl (+155) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w32.cl (+135) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w64.cl (+134) - (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32-gfx10-err.cl (+1-1) - (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32.cl (+8-9) - (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w64.cl (+8-10) - (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w32.cl (+107) - (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w64.cl (+104) - (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w32.cl (+110) - (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w64.cl (+109) - (modified) llvm/include/llvm/IR/IntrinsicsAMDGPU.td (+90-25) - (modified) llvm/lib/Target/AMDGPU/AMDGPUGISel.td (+24) - (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp (+330) - (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h (+10) - (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+213) - (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h (+13) - (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+23) - (modified) llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (+16) - (modified) llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td (+16) - (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+105-6) - (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+8) - (modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (+16-3) - (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+37) - (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h (+4) - (modified) llvm/lib/Target/AMDGPU/SIDefines.h (+3) - (modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+1) - (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+30) - (modified) llvm/lib/Target/AMDGPU/SIInstrFormats.td (+5) - (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+8) - (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+11) - (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.td (+5) - (modified) llvm/lib/Target/AMDGPU/VOP3PInstructions.td (+494-6) - (modified) llvm/lib/Target/AMDGPU/VOPInstructions.td (+3) - (modified) llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll (+117-18) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll (+504) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-imm.ll (+519) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-iu-modifiers.ll (+309) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-swmmac-index_key.ll (+321) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32.ll (+370) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll (+459) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-imm.ll (+430) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-iu-modifiers.ll (+274) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-swmmac-index_key.ll (+472) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64.ll (+333) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll (+499) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-imm.ll (+431) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-iu-modifiers.ll (+309) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-swmmac-index_key.ll (+321) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32.ll (+370) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll (+456) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-imm.ll (+373) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-iu-modifiers.ll (+274) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-swmmac-index_key.ll (+472) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64.ll (+333) - (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w32.mir (+354) - (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w64.mir (+355) - (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w32.s (+1529) - (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w64.s (+1529) - (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w32.txt (+1628) - (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w64.txt (+1628) - (modified) mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td (+9-8) - (modified) mlir/test/Target/LLVMIR/rocdl.mlir (+12-12) ``diff diff --git a/clang/includ
[llvm-branch-commits] [llvm] [mlir] [clang] PR for llvm/llvm-project#79293 (PR #79461)
llvmbot wrote: @llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-clang Author: None (github-actions[bot]) Changes resolves llvm/llvm-project#79293 --- Patch is 1006.55 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79461.diff 65 Files Affected: - (modified) clang/include/clang/Basic/BuiltinsAMDGPU.def (+62) - (modified) clang/lib/CodeGen/CGBuiltin.cpp (+164-13) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl (+156) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w64.cl (+155) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w32.cl (+135) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w64.cl (+134) - (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32-gfx10-err.cl (+1-1) - (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32.cl (+8-9) - (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w64.cl (+8-10) - (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w32.cl (+107) - (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w64.cl (+104) - (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w32.cl (+110) - (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w64.cl (+109) - (modified) llvm/include/llvm/IR/IntrinsicsAMDGPU.td (+90-25) - (modified) llvm/lib/Target/AMDGPU/AMDGPUGISel.td (+24) - (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp (+330) - (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h (+10) - (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+213) - (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h (+13) - (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+23) - (modified) llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (+16) - (modified) llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td (+16) - (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+105-6) - (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+8) - (modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (+16-3) - (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+37) - (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h (+4) - (modified) llvm/lib/Target/AMDGPU/SIDefines.h (+3) - (modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+1) - (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+30) - (modified) llvm/lib/Target/AMDGPU/SIInstrFormats.td (+5) - (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+8) - (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+11) - (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.td (+5) - (modified) llvm/lib/Target/AMDGPU/VOP3PInstructions.td (+494-6) - (modified) llvm/lib/Target/AMDGPU/VOPInstructions.td (+3) - (modified) llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll (+117-18) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll (+504) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-imm.ll (+519) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-iu-modifiers.ll (+309) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-swmmac-index_key.ll (+321) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32.ll (+370) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll (+459) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-imm.ll (+430) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-iu-modifiers.ll (+274) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-swmmac-index_key.ll (+472) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64.ll (+333) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll (+499) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-imm.ll (+431) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-iu-modifiers.ll (+309) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-swmmac-index_key.ll (+321) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32.ll (+370) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll (+456) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-imm.ll (+373) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-iu-modifiers.ll (+274) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-swmmac-index_key.ll (+472) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64.ll (+333) - (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w32.mir (+354) - (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w64.mir (+355) - (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w32.s (+1529) - (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w64.s (+1529) - (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w32.txt (+1628) - (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w64.txt (+1628) - (modified) mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td (+9-8) - (modified) mlir/test/Target/LLVMIR/rocdl.mlir (+12-12)
[llvm-branch-commits] [llvm] [mlir] [clang] PR for llvm/llvm-project#79293 (PR #79461)
llvmbot wrote: @llvm/pr-subscribers-mc Author: None (github-actions[bot]) Changes resolves llvm/llvm-project#79293 --- Patch is 1006.55 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79461.diff 65 Files Affected: - (modified) clang/include/clang/Basic/BuiltinsAMDGPU.def (+62) - (modified) clang/lib/CodeGen/CGBuiltin.cpp (+164-13) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl (+156) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w64.cl (+155) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w32.cl (+135) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w64.cl (+134) - (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32-gfx10-err.cl (+1-1) - (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32.cl (+8-9) - (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w64.cl (+8-10) - (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w32.cl (+107) - (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w64.cl (+104) - (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w32.cl (+110) - (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w64.cl (+109) - (modified) llvm/include/llvm/IR/IntrinsicsAMDGPU.td (+90-25) - (modified) llvm/lib/Target/AMDGPU/AMDGPUGISel.td (+24) - (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp (+330) - (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h (+10) - (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+213) - (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h (+13) - (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+23) - (modified) llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (+16) - (modified) llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td (+16) - (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+105-6) - (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+8) - (modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (+16-3) - (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+37) - (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h (+4) - (modified) llvm/lib/Target/AMDGPU/SIDefines.h (+3) - (modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+1) - (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+30) - (modified) llvm/lib/Target/AMDGPU/SIInstrFormats.td (+5) - (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+8) - (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+11) - (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.td (+5) - (modified) llvm/lib/Target/AMDGPU/VOP3PInstructions.td (+494-6) - (modified) llvm/lib/Target/AMDGPU/VOPInstructions.td (+3) - (modified) llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll (+117-18) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll (+504) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-imm.ll (+519) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-iu-modifiers.ll (+309) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-swmmac-index_key.ll (+321) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32.ll (+370) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll (+459) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-imm.ll (+430) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-iu-modifiers.ll (+274) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-swmmac-index_key.ll (+472) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64.ll (+333) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll (+499) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-imm.ll (+431) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-iu-modifiers.ll (+309) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-swmmac-index_key.ll (+321) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32.ll (+370) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll (+456) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-imm.ll (+373) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-iu-modifiers.ll (+274) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-swmmac-index_key.ll (+472) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64.ll (+333) - (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w32.mir (+354) - (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w64.mir (+355) - (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w32.s (+1529) - (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w64.s (+1529) - (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w32.txt (+1628) - (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w64.txt (+1628) - (modified) mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td (+9-8) - (modified) mlir/test/Target/LLVMIR/rocdl.mlir (+12-12) ``diff diff --git a/clang/include/clang
[llvm-branch-commits] [mlir] [llvm] [clang] PR for llvm/llvm-project#79293 (PR #79461)
llvmbot wrote: @llvm/pr-subscribers-llvm-globalisel Author: None (github-actions[bot]) Changes resolves llvm/llvm-project#79293 --- Patch is 1006.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79461.diff 65 Files Affected: - (modified) clang/include/clang/Basic/BuiltinsAMDGPU.def (+62) - (modified) clang/lib/CodeGen/CGBuiltin.cpp (+164-13) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w32.cl (+156) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12-wmma-w64.cl (+155) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w32.cl (+135) - (added) clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w64.cl (+134) - (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32-gfx10-err.cl (+1-1) - (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w32.cl (+8-9) - (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wmma-w64.cl (+8-10) - (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w32.cl (+107) - (added) cross-project-tests/amdgpu/builtins-amdgcn-gfx12-wmma-w64.cl (+104) - (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w32.cl (+110) - (added) cross-project-tests/amdgpu/builtins-amdgcn-swmmac-w64.cl (+109) - (modified) llvm/include/llvm/IR/IntrinsicsAMDGPU.td (+90-25) - (modified) llvm/lib/Target/AMDGPU/AMDGPUGISel.td (+24) - (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp (+330) - (modified) llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h (+10) - (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+213) - (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h (+13) - (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+23) - (modified) llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp (+16) - (modified) llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td (+16) - (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+105-6) - (modified) llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp (+8) - (modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (+16-3) - (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+37) - (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h (+4) - (modified) llvm/lib/Target/AMDGPU/SIDefines.h (+3) - (modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+1) - (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+30) - (modified) llvm/lib/Target/AMDGPU/SIInstrFormats.td (+5) - (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+8) - (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+11) - (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.td (+5) - (modified) llvm/lib/Target/AMDGPU/VOP3PInstructions.td (+494-6) - (modified) llvm/lib/Target/AMDGPU/VOPInstructions.td (+3) - (modified) llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll (+117-18) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll (+504) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-imm.ll (+519) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-iu-modifiers.ll (+309) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32-swmmac-index_key.ll (+321) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w32.ll (+370) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll (+459) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-imm.ll (+430) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-iu-modifiers.ll (+274) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64-swmmac-index_key.ll (+472) - (added) llvm/test/CodeGen/AMDGPU/GlobalISel/wmma-gfx12-w64.ll (+333) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll (+499) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-imm.ll (+431) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-iu-modifiers.ll (+309) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-swmmac-index_key.ll (+321) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32.ll (+370) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-f16-f32-matrix-modifiers.ll (+456) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-imm.ll (+373) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-iu-modifiers.ll (+274) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64-swmmac-index_key.ll (+472) - (added) llvm/test/CodeGen/AMDGPU/wmma-gfx12-w64.ll (+333) - (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w32.mir (+354) - (added) llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx12-w64.mir (+355) - (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w32.s (+1529) - (added) llvm/test/MC/AMDGPU/gfx12_asm_wmma_w64.s (+1529) - (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w32.txt (+1628) - (added) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_wmma_w64.txt (+1628) - (modified) mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td (+9-8) - (modified) mlir/test/Target/LLVMIR/rocdl.mlir (+12-12) ``diff diff --git a/clang/
[llvm-branch-commits] [clang] PR for llvm/llvm-project#79564 (PR #79566)
https://github.com/david-arm approved this pull request. LGTM! https://github.com/llvm/llvm-project/pull/79566 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] PR for llvm/llvm-project#79564 (PR #79566)
github-actions[bot] wrote: @david-arm What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/79566 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] PR for llvm/llvm-project#79564 (PR #79566)
https://github.com/github-actions[bot] milestoned https://github.com/llvm/llvm-project/pull/79566 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] PR for llvm/llvm-project#79564 (PR #79566)
https://github.com/github-actions[bot] created https://github.com/llvm/llvm-project/pull/79566 resolves llvm/llvm-project#79564 >From 27109f3d576c946cde0162ee29f251d2ab2d0ed2 Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Thu, 25 Jan 2024 09:29:46 + Subject: [PATCH] [LTO] Fix Veclib flags correctly pass to LTO flags (#78749) Flags `-fveclib=name` were not passed to LTO flags. This pass fixes that by converting the `-fveclib` flags to their relevant names for opt's `-vector-lib=name` flags. For example: `-fveclib=SLEEF` would become `-vector-library=sleefgnuabi` and passed through the `-plugin-opt` flag. (cherry picked from commit 03cf0e9354e7e56ff794e9efb682ed2971bc91ec) --- clang/lib/Driver/ToolChains/CommonArgs.cpp | 22 ++ clang/test/Driver/fveclib.c| 18 ++ 2 files changed, 40 insertions(+) diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp index fadaf3e60c6616a..9f1dddc47e3e053 100644 --- a/clang/lib/Driver/ToolChains/CommonArgs.cpp +++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp @@ -783,6 +783,28 @@ void tools::addLTOOptions(const ToolChain &ToolChain, const ArgList &Args, "-generate-arange-section")); } + // Pass vector library arguments to LTO. + Arg *ArgVecLib = Args.getLastArg(options::OPT_fveclib); + if (ArgVecLib && ArgVecLib->getNumValues() == 1) { +// Map the vector library names from clang front-end to opt front-end. The +// values are taken from the TargetLibraryInfo class command line options. +std::optional OptVal = +llvm::StringSwitch>(ArgVecLib->getValue()) +.Case("Accelerate", "Accelerate") +.Case("LIBMVEC", "LIBMVEC-X86") +.Case("MASSV", "MASSV") +.Case("SVML", "SVML") +.Case("SLEEF", "sleefgnuabi") +.Case("Darwin_libsystem_m", "Darwin_libsystem_m") +.Case("ArmPL", "ArmPL") +.Case("none", "none") +.Default(std::nullopt); + +if (OptVal) + CmdArgs.push_back(Args.MakeArgString( + Twine(PluginOptPrefix) + "-vector-library=" + OptVal.value())); + } + // Try to pass driver level flags relevant to LTO code generation down to // the plugin. diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c index e2a7619e9b89f7f..8a230284bcdfe4f 100644 --- a/clang/test/Driver/fveclib.c +++ b/clang/test/Driver/fveclib.c @@ -31,3 +31,21 @@ // RUN: %clang -fveclib=Accelerate %s -nodefaultlibs -target arm64-apple-ios8.0.0 -### 2>&1 | FileCheck --check-prefix=CHECK-LINK-NODEFAULTLIBS %s // CHECK-LINK-NODEFAULTLIBS-NOT: "-framework" "Accelerate" + + +/* Verify that the correct vector library is passed to LTO flags. */ + +// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck -check-prefix CHECK-LTO-LIBMVEC %s +// CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86" + +// RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck -check-prefix CHECK-LTO-MASSV %s +// CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV" + +// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck -check-prefix CHECK-LTO-SVML %s +// CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML" + +// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck -check-prefix CHECK-LTO-SLEEF %s +// CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi" + +// RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck -check-prefix CHECK-LTO-ARMPL %s +// CHECK-LTO-ARMPL: "-plugin-opt=-vector-library=ArmPL" ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79425 (PR #79560)
https://github.com/wangpc-pp approved this pull request. LGTM. (Is this the right approach in current workflow?) https://github.com/llvm/llvm-project/pull/79560 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79137 (PR #79561)
github-actions[bot] wrote: @fhahn What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/79561 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79137 (PR #79561)
https://github.com/github-actions[bot] milestoned https://github.com/llvm/llvm-project/pull/79561 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79137 (PR #79561)
https://github.com/github-actions[bot] created https://github.com/llvm/llvm-project/pull/79561 resolves llvm/llvm-project#79137 >From e2490885d05f95af41fda6b9f3adb20826e80483 Mon Sep 17 00:00:00 2001 From: Nikita Popov Date: Wed, 24 Jan 2024 10:45:20 +0100 Subject: [PATCH 1/2] [PhaseOrdering] Add additional test for #79161 (NFC) (cherry picked from commit 543cf08636f3a3bb55dddba2e8cad787601647ba) --- .../X86/loop-vectorizer-noalias.ll| 147 ++ 1 file changed, 147 insertions(+) create mode 100644 llvm/test/Transforms/PhaseOrdering/X86/loop-vectorizer-noalias.ll diff --git a/llvm/test/Transforms/PhaseOrdering/X86/loop-vectorizer-noalias.ll b/llvm/test/Transforms/PhaseOrdering/X86/loop-vectorizer-noalias.ll new file mode 100644 index 00..846787f721ba7e --- /dev/null +++ b/llvm/test/Transforms/PhaseOrdering/X86/loop-vectorizer-noalias.ll @@ -0,0 +1,147 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4 +; RUN: opt -S -O3 -mtriple=x86_64-unknown-linux-gnu < %s | FileCheck %s + +define internal void @acc(ptr noalias noundef %val, ptr noalias noundef %prev) { +entry: + %0 = load i8, ptr %prev, align 1 + %conv = zext i8 %0 to i32 + %1 = load i8, ptr %val, align 1 + %conv1 = zext i8 %1 to i32 + %add = add nsw i32 %conv1, %conv + %conv2 = trunc i32 %add to i8 + store i8 %conv2, ptr %val, align 1 + ret void +} + +; This loop should not get vectorized. +; FIXME: This is a miscompile. +define void @accsum(ptr noundef %vals, i64 noundef %num) #0 { +; CHECK-LABEL: define void @accsum( +; CHECK-SAME: ptr nocapture noundef [[VALS:%.*]], i64 noundef [[NUM:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] { +; CHECK-NEXT: entry: +; CHECK-NEXT:[[CMP1:%.*]] = icmp ugt i64 [[NUM]], 1 +; CHECK-NEXT:br i1 [[CMP1]], label [[ITER_CHECK:%.*]], label [[FOR_END:%.*]] +; CHECK: iter.check: +; CHECK-NEXT:[[TMP0:%.*]] = add i64 [[NUM]], -1 +; CHECK-NEXT:[[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[NUM]], 9 +; CHECK-NEXT:br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY_PREHEADER:%.*]], label [[VECTOR_MAIN_LOOP_ITER_CHECK:%.*]] +; CHECK: vector.main.loop.iter.check: +; CHECK-NEXT:[[MIN_ITERS_CHECK3:%.*]] = icmp ult i64 [[NUM]], 33 +; CHECK-NEXT:br i1 [[MIN_ITERS_CHECK3]], label [[VEC_EPILOG_PH:%.*]], label [[VECTOR_PH:%.*]] +; CHECK: vector.ph: +; CHECK-NEXT:[[N_VEC:%.*]] = and i64 [[TMP0]], -32 +; CHECK-NEXT:br label [[VECTOR_BODY:%.*]] +; CHECK: vector.body: +; CHECK-NEXT:[[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ] +; CHECK-NEXT:[[OFFSET_IDX:%.*]] = or disjoint i64 [[INDEX]], 1 +; CHECK-NEXT:[[TMP1:%.*]] = getelementptr inbounds i8, ptr [[VALS]], i64 [[OFFSET_IDX]] +; CHECK-NEXT:[[TMP2:%.*]] = getelementptr i8, ptr [[TMP1]], i64 -1 +; CHECK-NEXT:tail call void @llvm.experimental.noalias.scope.decl(metadata [[META0:![0-9]+]]) +; CHECK-NEXT:tail call void @llvm.experimental.noalias.scope.decl(metadata [[META3:![0-9]+]]) +; CHECK-NEXT:[[TMP3:%.*]] = getelementptr i8, ptr [[TMP1]], i64 15 +; CHECK-NEXT:[[WIDE_LOAD:%.*]] = load <16 x i8>, ptr [[TMP2]], align 1, !alias.scope [[META3]], !noalias [[META0]] +; CHECK-NEXT:[[WIDE_LOAD4:%.*]] = load <16 x i8>, ptr [[TMP3]], align 1, !alias.scope [[META3]], !noalias [[META0]] +; CHECK-NEXT:[[TMP4:%.*]] = getelementptr inbounds i8, ptr [[TMP1]], i64 16 +; CHECK-NEXT:[[WIDE_LOAD5:%.*]] = load <16 x i8>, ptr [[TMP1]], align 1, !alias.scope [[META0]], !noalias [[META3]] +; CHECK-NEXT:[[WIDE_LOAD6:%.*]] = load <16 x i8>, ptr [[TMP4]], align 1, !alias.scope [[META0]], !noalias [[META3]] +; CHECK-NEXT:[[TMP5:%.*]] = add <16 x i8> [[WIDE_LOAD5]], [[WIDE_LOAD]] +; CHECK-NEXT:[[TMP6:%.*]] = add <16 x i8> [[WIDE_LOAD6]], [[WIDE_LOAD4]] +; CHECK-NEXT:store <16 x i8> [[TMP5]], ptr [[TMP1]], align 1, !alias.scope [[META0]], !noalias [[META3]] +; CHECK-NEXT:store <16 x i8> [[TMP6]], ptr [[TMP4]], align 1, !alias.scope [[META0]], !noalias [[META3]] +; CHECK-NEXT:[[INDEX_NEXT]] = add nuw i64 [[INDEX]], 32 +; CHECK-NEXT:[[TMP7:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]] +; CHECK-NEXT:br i1 [[TMP7]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]] +; CHECK: middle.block: +; CHECK-NEXT:[[CMP_N:%.*]] = icmp eq i64 [[TMP0]], [[N_VEC]] +; CHECK-NEXT:br i1 [[CMP_N]], label [[FOR_END]], label [[VEC_EPILOG_ITER_CHECK:%.*]] +; CHECK: vec.epilog.iter.check: +; CHECK-NEXT:[[IND_END9:%.*]] = or disjoint i64 [[N_VEC]], 1 +; CHECK-NEXT:[[N_VEC_REMAINING:%.*]] = and i64 [[TMP0]], 24 +; CHECK-NEXT:[[MIN_EPILOG_ITERS_CHECK:%.*]] = icmp eq i64 [[N_VEC_REMAINING]], 0 +; CHECK-NEXT:br i1 [[MIN_EPILOG_ITERS_CHECK]], label [[FOR_BODY_PREHEADER]], label [[VEC_EPILOG_PH]] +; CHECK: vec.epilog.ph: +; CHECK-NEXT:[[VEC_EPILOG_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[VEC_EPILOG_
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79425 (PR #79560)
llvmbot wrote: @llvm/pr-subscribers-backend-risc-v Author: Nikita Popov (nikic) Changes Resolves #79425. --- Patch is 24.34 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79560.diff 12 Files Affected: - (modified) llvm/include/llvm/Target/TargetInstrPredicate.td (+34) - (modified) llvm/lib/Target/RISCV/CMakeLists.txt (+1-1) - (modified) llvm/lib/Target/RISCV/RISCV.td (+6) - (modified) llvm/lib/Target/RISCV/RISCVFeatures.td (-24) - (removed) llvm/lib/Target/RISCV/RISCVMacroFusion.cpp (-210) - (removed) llvm/lib/Target/RISCV/RISCVMacroFusion.h (-28) - (added) llvm/lib/Target/RISCV/RISCVMacroFusion.td (+93) - (modified) llvm/lib/Target/RISCV/RISCVSubtarget.cpp (+6-2) - (modified) llvm/lib/Target/RISCV/RISCVSubtarget.h (+3-5) - (modified) llvm/lib/Target/RISCV/RISCVTargetMachine.cpp (+8-5) - (modified) llvm/utils/TableGen/PredicateExpander.cpp (+34) - (modified) llvm/utils/TableGen/PredicateExpander.h (+4) ``diff diff --git a/llvm/include/llvm/Target/TargetInstrPredicate.td b/llvm/include/llvm/Target/TargetInstrPredicate.td index 82c4c7b23a49b6a..b5419cb9f3867f0 100644 --- a/llvm/include/llvm/Target/TargetInstrPredicate.td +++ b/llvm/include/llvm/Target/TargetInstrPredicate.td @@ -152,6 +152,34 @@ class CheckImmOperand_s : CheckOperandBase { string ImmVal = Value; } +// Check that the operand at position `Index` is less than `Imm`. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against `Imm`. +class CheckImmOperandLT : CheckOperandBase { + int ImmVal = Imm; +} + +// Check that the operand at position `Index` is greater than `Imm`. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against `Imm`. +class CheckImmOperandGT : CheckOperandBase { + int ImmVal = Imm; +} + +// Check that the operand at position `Index` is less than or equal to `Imm`. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against `Imm`. +class CheckImmOperandLE : CheckNot>; + +// Check that the operand at position `Index` is greater than or equal to `Imm`. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against `Imm`. +class CheckImmOperandGE : CheckNot>; + // Expands to a call to `FunctionMapper` if field `FunctionMapper` is set. // Otherwise, it expands to a CheckNot>. class CheckRegOperandSimple : CheckOperandBase; @@ -203,6 +231,12 @@ class CheckAll Sequence> class CheckAny Sequence> : CheckPredicateSequence; +// Check that the operand at position `Index` is in range [Start, End]. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against range [Start, End]. +class CheckImmOperandRange + : CheckAll<[CheckImmOperandGE, CheckImmOperandLE]>; // Used to expand the body of a function predicate. See the definition of // TIIPredicate below. diff --git a/llvm/lib/Target/RISCV/CMakeLists.txt b/llvm/lib/Target/RISCV/CMakeLists.txt index a0c3345ec1bbd7e..ac88cd49db4e4ba 100644 --- a/llvm/lib/Target/RISCV/CMakeLists.txt +++ b/llvm/lib/Target/RISCV/CMakeLists.txt @@ -5,6 +5,7 @@ set(LLVM_TARGET_DEFINITIONS RISCV.td) tablegen(LLVM RISCVGenAsmMatcher.inc -gen-asm-matcher) tablegen(LLVM RISCVGenAsmWriter.inc -gen-asm-writer) tablegen(LLVM RISCVGenCompressInstEmitter.inc -gen-compress-inst-emitter) +tablegen(LLVM RISCVGenMacroFusion.inc -gen-macro-fusion-pred) tablegen(LLVM RISCVGenDAGISel.inc -gen-dag-isel) tablegen(LLVM RISCVGenDisassemblerTables.inc -gen-disassembler) tablegen(LLVM RISCVGenInstrInfo.inc -gen-instr-info) @@ -43,7 +44,6 @@ add_llvm_target(RISCVCodeGen RISCVISelDAGToDAG.cpp RISCVISelLowering.cpp RISCVMachineFunctionInfo.cpp - RISCVMacroFusion.cpp RISCVMergeBaseOffset.cpp RISCVOptWInstrs.cpp RISCVPostRAExpandPseudoInsts.cpp diff --git a/llvm/lib/Target/RISCV/RISCV.td b/llvm/lib/Target/RISCV/RISCV.td index e6e879282241dde..27d52c16a4f39d9 100644 --- a/llvm/lib/Target/RISCV/RISCV.td +++ b/llvm/lib/Target/RISCV/RISCV.td @@ -30,6 +30,12 @@ include "RISCVCallingConv.td" include "RISCVInstrInfo.td" include "GISel/RISCVRegisterBanks.td" +//===--===// +// RISC-V macro fusions. +//===--===// + +include "RISCVMacroFusion.td" + //===--===// // RISC-V Scheduling Models //===--==
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79425 (PR #79560)
https://github.com/nikic created https://github.com/llvm/llvm-project/pull/79560 Resolves #79425. >From 2c3d2c996c0b1fb929d903768612c47393528cd3 Mon Sep 17 00:00:00 2001 From: Wang Pengcheng Date: Thu, 25 Jan 2024 15:17:31 +0800 Subject: [PATCH 1/2] [TableGen] Add predicates for immediates comparison (#76004) These predicates can be used to represent `<`, `<=`, `>`, `>=`. And a predicate for `in range` is added. (cherry picked from commit 664a0faac464708fc061d12e5cd492fcbfea979a) --- .../llvm/Target/TargetInstrPredicate.td | 34 +++ llvm/utils/TableGen/PredicateExpander.cpp | 34 +++ llvm/utils/TableGen/PredicateExpander.h | 4 +++ 3 files changed, 72 insertions(+) diff --git a/llvm/include/llvm/Target/TargetInstrPredicate.td b/llvm/include/llvm/Target/TargetInstrPredicate.td index 82c4c7b23a49b6a..b5419cb9f3867f0 100644 --- a/llvm/include/llvm/Target/TargetInstrPredicate.td +++ b/llvm/include/llvm/Target/TargetInstrPredicate.td @@ -152,6 +152,34 @@ class CheckImmOperand_s : CheckOperandBase { string ImmVal = Value; } +// Check that the operand at position `Index` is less than `Imm`. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against `Imm`. +class CheckImmOperandLT : CheckOperandBase { + int ImmVal = Imm; +} + +// Check that the operand at position `Index` is greater than `Imm`. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against `Imm`. +class CheckImmOperandGT : CheckOperandBase { + int ImmVal = Imm; +} + +// Check that the operand at position `Index` is less than or equal to `Imm`. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against `Imm`. +class CheckImmOperandLE : CheckNot>; + +// Check that the operand at position `Index` is greater than or equal to `Imm`. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against `Imm`. +class CheckImmOperandGE : CheckNot>; + // Expands to a call to `FunctionMapper` if field `FunctionMapper` is set. // Otherwise, it expands to a CheckNot>. class CheckRegOperandSimple : CheckOperandBase; @@ -203,6 +231,12 @@ class CheckAll Sequence> class CheckAny Sequence> : CheckPredicateSequence; +// Check that the operand at position `Index` is in range [Start, End]. +// If field `FunctionMapper` is a non-empty string, then function +// `FunctionMapper` is applied to the operand value, and the return value is then +// compared against range [Start, End]. +class CheckImmOperandRange + : CheckAll<[CheckImmOperandGE, CheckImmOperandLE]>; // Used to expand the body of a function predicate. See the definition of // TIIPredicate below. diff --git a/llvm/utils/TableGen/PredicateExpander.cpp b/llvm/utils/TableGen/PredicateExpander.cpp index d3a73e02cd916f8..0b9b6389fe38171 100644 --- a/llvm/utils/TableGen/PredicateExpander.cpp +++ b/llvm/utils/TableGen/PredicateExpander.cpp @@ -59,6 +59,30 @@ void PredicateExpander::expandCheckImmOperandSimple(raw_ostream &OS, OS << ")"; } +void PredicateExpander::expandCheckImmOperandLT(raw_ostream &OS, int OpIndex, +int ImmVal, +StringRef FunctionMapper) { + if (!FunctionMapper.empty()) +OS << FunctionMapper << "("; + OS << "MI" << (isByRef() ? "." : "->") << "getOperand(" << OpIndex + << ").getImm()"; + if (!FunctionMapper.empty()) +OS << ")"; + OS << (shouldNegate() ? " >= " : " < ") << ImmVal; +} + +void PredicateExpander::expandCheckImmOperandGT(raw_ostream &OS, int OpIndex, +int ImmVal, +StringRef FunctionMapper) { + if (!FunctionMapper.empty()) +OS << FunctionMapper << "("; + OS << "MI" << (isByRef() ? "." : "->") << "getOperand(" << OpIndex + << ").getImm()"; + if (!FunctionMapper.empty()) +OS << ")"; + OS << (shouldNegate() ? " <= " : " > ") << ImmVal; +} + void PredicateExpander::expandCheckRegOperand(raw_ostream &OS, int OpIndex, const Record *Reg, StringRef FunctionMapper) { @@ -352,6 +376,16 @@ void PredicateExpander::expandPredicate(raw_ostream &OS, const Record *Rec) { Rec->getValueAsString("ImmVal"), Rec->getValueAsString("FunctionMapper")); + if (Rec->isSubClassOf("CheckImmOperandLT")) +return expandCheckImmOperandLT(OS, Rec->getValueAsInt("OpIndex"), + R