[llvm-branch-commits] [llvm] Reland "RegisterCoalescer: Add implicit-def of super register when coalescing SUBREG_TO_REG" (PR #134408)
sdesmalen-arm wrote: Gentle ping @arsenm and @qcolombet. I know that @arsenm is in favour of moving away from `SUBREG_TO_REG` entirely, but at the moment it is still used in many places by multiple targets, and this PR fixes a genuine bug that is exposed with sub-reg liveness tracking. https://github.com/llvm/llvm-project/pull/134408 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/21.x: [AArch64, TTI] Disable RealUse check for vector insert/extract costs and Apple CPUs. (#146526) (PR #149815)
https://github.com/davemgreen approved this pull request. Thanks. The commit message could now do with an adjustment. Otherwise LGTM. https://github.com/llvm/llvm-project/pull/149815
[llvm-branch-commits] [llvm] [AMDGPU][NPM] Add isRequired to passes missing it (PR #148115)
https://github.com/vikramRH updated https://github.com/llvm/llvm-project/pull/148115 >From fe653178dc8c6cfd0929d5ca5dc7c16e224c696a Mon Sep 17 00:00:00 2001 From: vikhegde Date: Thu, 10 Jul 2025 18:53:39 +0530 Subject: [PATCH] [AMDGPU][NPM] Add isRequired to passes missing it --- llvm/include/llvm/Transforms/Scalar/StructurizeCFG.h | 1 + llvm/include/llvm/Transforms/Utils/LoopSimplify.h | 1 + llvm/lib/Target/AMDGPU/AMDGPU.h| 3 +++ llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h| 1 + llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.h | 1 + llvm/lib/Target/AMDGPU/GCNNSAReassign.h| 1 + llvm/lib/Target/AMDGPU/GCNPreRALongBranchReg.h | 1 + llvm/lib/Target/AMDGPU/GCNRewritePartialRegUses.h | 1 + llvm/lib/Target/AMDGPU/SIFixSGPRCopies.h | 1 + llvm/lib/Target/AMDGPU/SIFixVGPRCopies.h | 1 + llvm/lib/Target/AMDGPU/SILowerControlFlow.h| 1 + llvm/lib/Target/AMDGPU/SILowerSGPRSpills.h | 1 + llvm/lib/Target/AMDGPU/SILowerWWMCopies.h | 1 + llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.h | 1 + llvm/lib/Target/AMDGPU/SIWholeQuadMode.h | 1 + llvm/test/Feature/optnone-opt.ll | 1 - 16 files changed, 17 insertions(+), 1 deletion(-) diff --git a/llvm/include/llvm/Transforms/Scalar/StructurizeCFG.h b/llvm/include/llvm/Transforms/Scalar/StructurizeCFG.h index f68067d935458..f50511c9c0972 100644 --- a/llvm/include/llvm/Transforms/Scalar/StructurizeCFG.h +++ b/llvm/include/llvm/Transforms/Scalar/StructurizeCFG.h @@ -23,6 +23,7 @@ struct StructurizeCFGPass : PassInfoMixin { function_ref MapClassName2PassName); PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); + static bool isRequired() { return true; } }; } // namespace llvm diff --git a/llvm/include/llvm/Transforms/Utils/LoopSimplify.h b/llvm/include/llvm/Transforms/Utils/LoopSimplify.h index 8f3fa1f2b18ef..d179002fd6a27 100644 --- a/llvm/include/llvm/Transforms/Utils/LoopSimplify.h +++ b/llvm/include/llvm/Transforms/Utils/LoopSimplify.h @@ -54,6 +54,7 @@ class ScalarEvolution; class LoopSimplifyPass : public PassInfoMixin { 
public: LLVM_ABI PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); + static bool isRequired() { return true; } }; /// Simplify each loop in a loop nest recursively. diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h index 007b481f84960..10507aab9132e 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPU.h +++ b/llvm/lib/Target/AMDGPU/AMDGPU.h @@ -90,6 +90,7 @@ class SILowerI1CopiesPass : public PassInfoMixin { SILowerI1CopiesPass() = default; PreservedAnalyses run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM); + static bool isRequired() { return true; } }; void initializeAMDGPUDAGToDAGISelLegacyPass(PassRegistry &); @@ -368,6 +369,7 @@ class SIModeRegisterPass : public PassInfoMixin { public: SIModeRegisterPass() {} PreservedAnalyses run(MachineFunction &F, MachineFunctionAnalysisManager &AM); + static bool isRequired() { return true; } }; class SIMemoryLegalizerPass : public PassInfoMixin { @@ -480,6 +482,7 @@ class SIAnnotateControlFlowPass public: SIAnnotateControlFlowPass(const AMDGPUTargetMachine &TM) : TM(TM) {} PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); + static bool isRequired() { return true; } }; void initializeSIAnnotateControlFlowLegacyPass(PassRegistry &); diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h index 6123d75d7b616..38fde6ee2f4a5 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h +++ b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h @@ -304,6 +304,7 @@ class AMDGPUISelDAGToDAGPass : public SelectionDAGISelPass { PreservedAnalyses run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM); + static bool isRequired() { return true; } }; class AMDGPUDAGToDAGISelLegacy : public SelectionDAGISelLegacy { diff --git a/llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.h b/llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.h index 2fd98a2ee1a93..d6fb0e53e1169 100644 --- 
a/llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.h +++ b/llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.h @@ -29,6 +29,7 @@ class AMDGPUUnifyDivergentExitNodesPass : public PassInfoMixin { public: PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM); + static bool isRequired() { return true; } }; } // end namespace llvm diff --git a/llvm/lib/Target/AMDGPU/GCNNSAReassign.h b/llvm/lib/Target/AMDGPU/GCNNSAReassign.h index 97a72e7ddbb24..4f2abe0dd0086 100644 --- a/llvm/lib/Target/AMDGPU/GCNNSAReassign.h +++ b/llvm/lib/Target/AMDGPU/GCNNSAReassign.h @@ -16,6 +16,7 @@ class GCNNSAReassignPass : public PassInfoMixin { public: PreservedAnalyses run(MachineFunctio
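The patch above adds a static `isRequired()` member returning `true` to each listed pass. A rough, self-contained sketch of why that member matters follows. Every type and function in it (`OptionalPass`, `RequiredPass`, `runUnlessSkipped`, `runPipeline`) is a hypothetical stand-in, not LLVM's pass manager code; it only models the idea that a pass without `isRequired()` is treated as optional and may be skipped, for example when optional passes are being suppressed.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in passes; these are NOT LLVM classes.
struct OptionalPass {
  void run(std::vector<std::string> &Log) { Log.push_back("optional"); }
};

struct RequiredPass {
  void run(std::vector<std::string> &Log) { Log.push_back("required"); }
  // The member the patch adds to the real passes.
  static bool isRequired() { return true; }
};

// Detect whether PassT declares a static isRequired() member.
template <typename PassT, typename = void>
struct HasIsRequired : std::false_type {};

template <typename PassT>
struct HasIsRequired<PassT, std::void_t<decltype(PassT::isRequired())>>
    : std::true_type {};

// A pass that does not declare isRequired() is treated as optional.
template <typename PassT> bool passIsRequired() {
  if constexpr (HasIsRequired<PassT>::value)
    return PassT::isRequired();
  else
    return false;
}

// Run the pass unless optional passes are being skipped and it is optional.
template <typename PassT>
void runUnlessSkipped(PassT Pass, bool SkipOptional,
                      std::vector<std::string> &Log) {
  if (SkipOptional && !passIsRequired<PassT>())
    return;
  Pass.run(Log);
}

// Hypothetical two-pass pipeline; returns the names of the passes that ran.
std::vector<std::string> runPipeline(bool SkipOptional) {
  std::vector<std::string> Log;
  runUnlessSkipped(OptionalPass{}, SkipOptional, Log);
  runUnlessSkipped(RequiredPass{}, SkipOptional, Log);
  return Log;
}
```

Calling `runPipeline(true)` in this model runs only `RequiredPass`, while `runPipeline(false)` runs both passes.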
[llvm-branch-commits] [clang] [llvm] [DirectX] Add Range Overlap validation to `DXILPostOptimizationValidation.cpp` (PR #148919)
https://github.com/joaosaffran updated https://github.com/llvm/llvm-project/pull/148919 >From 831dc1cab2662151e0c4a95883f6fb73afc595d4 Mon Sep 17 00:00:00 2001 From: joaosaffran Date: Tue, 15 Jul 2025 01:59:47 + Subject: [PATCH 1/6] adding validation --- .../DXILPostOptimizationValidation.cpp| 152 -- ...signature-validation-fail-overlap-range.ll | 16 ++ 2 files changed, 153 insertions(+), 15 deletions(-) create mode 100644 llvm/test/CodeGen/DirectX/rootsignature-validation-fail-overlap-range.ll diff --git a/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp b/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp index a09c5ac353fed..e42d2bef62ba7 100644 --- a/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp +++ b/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp @@ -13,6 +13,7 @@ #include "llvm/ADT/SmallVector.h" #include "llvm/Analysis/DXILMetadataAnalysis.h" #include "llvm/Analysis/DXILResource.h" +#include "llvm/Frontend/HLSL/RootSignatureValidations.h" #include "llvm/IR/DiagnosticInfo.h" #include "llvm/IR/Instructions.h" #include "llvm/IR/IntrinsicsDirectX.h" @@ -209,6 +210,123 @@ getRootSignature(RootSignatureBindingInfo &RSBI, return RootSigDesc; } +static void +reportOverlappingRegisters(Module &M, + llvm::hlsl::rootsig::OverlappingRanges Overlap) { + const llvm::hlsl::rootsig::RangeInfo *Info = Overlap.A; + const llvm::hlsl::rootsig::RangeInfo *OInfo = Overlap.B; + SmallString<128> Message; + raw_svector_ostream OS(Message); + auto ResourceClassToString = + [](llvm::dxil::ResourceClass Class) -> const char * { +switch (Class) { + +case ResourceClass::SRV: + return "SRV"; +case ResourceClass::UAV: + return "UAV"; +case ResourceClass::CBuffer: + return "CBuffer"; +case ResourceClass::Sampler: + return "Sampler"; + break; +} + }; + OS << "register " << ResourceClassToString(Info->Class) + << " (space=" << Info->Space << ", register=" << Info->LowerBound << ")" + << " is overlapping with" + << " register " << 
ResourceClassToString(OInfo->Class) + << " (space=" << OInfo->Space << ", register=" << OInfo->LowerBound << ")" + << ", verify your root signature definition"; + + M.getContext().diagnose(DiagnosticInfoGeneric(Message)); +} + +static bool reportOverlappingRanges(Module &M, +const mcdxbc::RootSignatureDesc &RSD) { + using namespace llvm::hlsl::rootsig; + + llvm::SmallVector Infos; + // Helper to map DescriptorRangeType to ResourceClass + auto RangeToResourceClass = [](uint32_t RangeType) -> ResourceClass { +using namespace dxbc; +switch (static_cast(RangeType)) { +case DescriptorRangeType::SRV: + return ResourceClass::SRV; +case DescriptorRangeType::UAV: + return ResourceClass::UAV; +case DescriptorRangeType::CBV: + return ResourceClass::CBuffer; +case DescriptorRangeType::Sampler: + return ResourceClass::Sampler; +} + }; + + // Helper to map RootParameterType to ResourceClass + auto ParameterToResourceClass = [](uint32_t Type) -> ResourceClass { +using namespace dxbc; +switch (static_cast(Type)) { +case RootParameterType::SRV: + return ResourceClass::SRV; +case RootParameterType::UAV: + return ResourceClass::UAV; +case RootParameterType::CBV: + return ResourceClass::CBuffer; +default: + llvm_unreachable("Unknown RootParameterType"); +} + }; + + for (size_t I = 0; I < RSD.ParametersContainer.size(); I++) { +const auto &[Type, Loc] = +RSD.ParametersContainer.getTypeAndLocForParameter(I); +const auto &Header = RSD.ParametersContainer.getHeader(I); +switch (Type) { +case llvm::to_underlying(dxbc::RootParameterType::SRV): +case llvm::to_underlying(dxbc::RootParameterType::UAV): +case llvm::to_underlying(dxbc::RootParameterType::CBV): { + dxbc::RTS0::v2::RootDescriptor Desc = + RSD.ParametersContainer.getRootDescriptor(Loc); + + RangeInfo Info; + Info.Space = Desc.RegisterSpace; + Info.LowerBound = Desc.ShaderRegister; + Info.UpperBound = Info.LowerBound; + Info.Class = ParameterToResourceClass(Type); + Info.Visibility = (dxbc::ShaderVisibility)Header.ShaderVisibility; 
+ + Infos.push_back(Info); + break; +} +case llvm::to_underlying(dxbc::RootParameterType::DescriptorTable): { + const mcdxbc::DescriptorTable &Table = + RSD.ParametersContainer.getDescriptorTable(Loc); + + for (const dxbc::RTS0::v2::DescriptorRange &Range : Table.Ranges) { +RangeInfo Info; +Info.Space = Range.RegisterSpace; +Info.LowerBound = Range.BaseShaderRegister; +Info.UpperBound = Info.LowerBound + ((Range.NumDescriptors == ~0U) + ? Range.NumDescriptors + : Range.NumD
[llvm-branch-commits] [clang] [llvm] [DirectX] Add Range Overlap validation to `DXILPostOptimizationValidation.cpp` (PR #148919)
@@ -0,0 +1,15 @@
+; RUN: not opt -S -passes='dxil-post-optimization-validation' -mtriple=dxil-pc-shadermodel6.6-compute %s 2>&1 | FileCheck %s
+; CHECK: error: register CBuffer (space=0, register=0) is overlapping with register CBuffer (space=0, register=2), verify your root signature definition
+
+define void @CSMain() "hlsl.shader"="compute" {
+entry:
+  ret void
+}
+
+; RootConstants(num32BitConstants=4, b2), DescriptorTable(CBV(b0, numDescriptors=3))

joaosaffran wrote:

I've updated the test: I've added one for descriptor tables and one for root descriptors. Hopefully that makes it clearer.

https://github.com/llvm/llvm-project/pull/148919
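To make the expected diagnostic in this test easier to follow, here is a minimal sketch of an inclusive-range overlap check. It is not the patch's implementation: `RegisterRange` and `findOverlap` are hypothetical names, and the sketch ignores resource class and shader visibility. It also assumes that `CBV(b0, numDescriptors=3)` covers registers 0 through 2 and that the root constants at `b2` occupy the single register 2.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the range record built by the patch.
struct RegisterRange {
  uint32_t Space;
  uint32_t LowerBound;
  uint32_t UpperBound; // inclusive
};

// Returns the first overlapping pair found, ordered by lower bound, or
// std::nullopt when no two ranges in the same space share a register.
std::optional<std::pair<RegisterRange, RegisterRange>>
findOverlap(std::vector<RegisterRange> Ranges) {
  if (Ranges.empty())
    return std::nullopt;
  for (const RegisterRange &R : Ranges)
    assert(R.LowerBound <= R.UpperBound && "malformed range");

  // Sort by register space first, then by lower bound.
  std::sort(Ranges.begin(), Ranges.end(),
            [](const RegisterRange &A, const RegisterRange &B) {
              if (A.Space != B.Space)
                return A.Space < B.Space;
              return A.LowerBound < B.LowerBound;
            });

  // Widest is the range reaching the highest register seen so far in the
  // current space; any later range starting at or below that register overlaps.
  RegisterRange Widest = Ranges[0];
  for (std::size_t I = 1; I < Ranges.size(); ++I) {
    const RegisterRange &Current = Ranges[I];
    if (Current.Space == Widest.Space &&
        Current.LowerBound <= Widest.UpperBound)
      return std::make_pair(Widest, Current);
    if (Current.Space != Widest.Space ||
        Current.UpperBound > Widest.UpperBound)
      Widest = Current;
  }
  return std::nullopt;
}
```

Under those assumptions, passing the ranges `{0, 2, 2}` and `{0, 0, 2}` yields a pair whose lower bounds are 0 and 2, which lines up with the `register=0` and `register=2` values in the CHECK line.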
[llvm-branch-commits] [clang] [llvm] [DirectX] Add Range Overlap validation to `DXILPostOptimizationValidation.cpp` (PR #148919)
https://github.com/joaosaffran updated https://github.com/llvm/llvm-project/pull/148919 >From 831dc1cab2662151e0c4a95883f6fb73afc595d4 Mon Sep 17 00:00:00 2001 From: joaosaffran Date: Tue, 15 Jul 2025 01:59:47 + Subject: [PATCH 1/7] adding validation --- .../DXILPostOptimizationValidation.cpp| 152 -- ...signature-validation-fail-overlap-range.ll | 16 ++ 2 files changed, 153 insertions(+), 15 deletions(-) create mode 100644 llvm/test/CodeGen/DirectX/rootsignature-validation-fail-overlap-range.ll diff --git a/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp b/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp index a09c5ac353fed..e42d2bef62ba7 100644 --- a/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp +++ b/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp @@ -13,6 +13,7 @@ #include "llvm/ADT/SmallVector.h" #include "llvm/Analysis/DXILMetadataAnalysis.h" #include "llvm/Analysis/DXILResource.h" +#include "llvm/Frontend/HLSL/RootSignatureValidations.h" #include "llvm/IR/DiagnosticInfo.h" #include "llvm/IR/Instructions.h" #include "llvm/IR/IntrinsicsDirectX.h" @@ -209,6 +210,123 @@ getRootSignature(RootSignatureBindingInfo &RSBI, return RootSigDesc; } +static void +reportOverlappingRegisters(Module &M, + llvm::hlsl::rootsig::OverlappingRanges Overlap) { + const llvm::hlsl::rootsig::RangeInfo *Info = Overlap.A; + const llvm::hlsl::rootsig::RangeInfo *OInfo = Overlap.B; + SmallString<128> Message; + raw_svector_ostream OS(Message); + auto ResourceClassToString = + [](llvm::dxil::ResourceClass Class) -> const char * { +switch (Class) { + +case ResourceClass::SRV: + return "SRV"; +case ResourceClass::UAV: + return "UAV"; +case ResourceClass::CBuffer: + return "CBuffer"; +case ResourceClass::Sampler: + return "Sampler"; + break; +} + }; + OS << "register " << ResourceClassToString(Info->Class) + << " (space=" << Info->Space << ", register=" << Info->LowerBound << ")" + << " is overlapping with" + << " register " << 
ResourceClassToString(OInfo->Class) + << " (space=" << OInfo->Space << ", register=" << OInfo->LowerBound << ")" + << ", verify your root signature definition"; + + M.getContext().diagnose(DiagnosticInfoGeneric(Message)); +} + +static bool reportOverlappingRanges(Module &M, +const mcdxbc::RootSignatureDesc &RSD) { + using namespace llvm::hlsl::rootsig; + + llvm::SmallVector Infos; + // Helper to map DescriptorRangeType to ResourceClass + auto RangeToResourceClass = [](uint32_t RangeType) -> ResourceClass { +using namespace dxbc; +switch (static_cast(RangeType)) { +case DescriptorRangeType::SRV: + return ResourceClass::SRV; +case DescriptorRangeType::UAV: + return ResourceClass::UAV; +case DescriptorRangeType::CBV: + return ResourceClass::CBuffer; +case DescriptorRangeType::Sampler: + return ResourceClass::Sampler; +} + }; + + // Helper to map RootParameterType to ResourceClass + auto ParameterToResourceClass = [](uint32_t Type) -> ResourceClass { +using namespace dxbc; +switch (static_cast(Type)) { +case RootParameterType::SRV: + return ResourceClass::SRV; +case RootParameterType::UAV: + return ResourceClass::UAV; +case RootParameterType::CBV: + return ResourceClass::CBuffer; +default: + llvm_unreachable("Unknown RootParameterType"); +} + }; + + for (size_t I = 0; I < RSD.ParametersContainer.size(); I++) { +const auto &[Type, Loc] = +RSD.ParametersContainer.getTypeAndLocForParameter(I); +const auto &Header = RSD.ParametersContainer.getHeader(I); +switch (Type) { +case llvm::to_underlying(dxbc::RootParameterType::SRV): +case llvm::to_underlying(dxbc::RootParameterType::UAV): +case llvm::to_underlying(dxbc::RootParameterType::CBV): { + dxbc::RTS0::v2::RootDescriptor Desc = + RSD.ParametersContainer.getRootDescriptor(Loc); + + RangeInfo Info; + Info.Space = Desc.RegisterSpace; + Info.LowerBound = Desc.ShaderRegister; + Info.UpperBound = Info.LowerBound; + Info.Class = ParameterToResourceClass(Type); + Info.Visibility = (dxbc::ShaderVisibility)Header.ShaderVisibility; 
+ + Infos.push_back(Info); + break; +} +case llvm::to_underlying(dxbc::RootParameterType::DescriptorTable): { + const mcdxbc::DescriptorTable &Table = + RSD.ParametersContainer.getDescriptorTable(Loc); + + for (const dxbc::RTS0::v2::DescriptorRange &Range : Table.Ranges) { +RangeInfo Info; +Info.Space = Range.RegisterSpace; +Info.LowerBound = Range.BaseShaderRegister; +Info.UpperBound = Info.LowerBound + ((Range.NumDescriptors == ~0U) + ? Range.NumDescriptors + : Range.NumD
[llvm-branch-commits] [llvm] ecd793c - [AMDGPU] Add v_fma_mix_f32_f16 as an alias of v_fma_mix_f32 on gfx1250 (#150502)
Author: Changpeng Fang
Date: 2025-07-24T12:42:30-07:00
New Revision: ecd793cbb1888507850b806699e97fc978d15dd7

URL: https://github.com/llvm/llvm-project/commit/ecd793cbb1888507850b806699e97fc978d15dd7
DIFF: https://github.com/llvm/llvm-project/commit/ecd793cbb1888507850b806699e97fc978d15dd7.diff

LOG: [AMDGPU] Add v_fma_mix_f32_f16 as an alias of v_fma_mix_f32 on gfx1250 (#150502)

Co-authored-by: Jay Foad

Added:
    llvm/test/MC/AMDGPU/gfx1250_asm_vop3p_alias.s

Modified:
    llvm/lib/Target/AMDGPU/VOP3PInstructions.td

Removed:


diff --git a/llvm/lib/Target/AMDGPU/VOP3PInstructions.td b/llvm/lib/Target/AMDGPU/VOP3PInstructions.td
index 7017da9dc3521..c812dc9850514 100644
--- a/llvm/lib/Target/AMDGPU/VOP3PInstructions.td
+++ b/llvm/lib/Target/AMDGPU/VOP3PInstructions.td
@@ -2277,6 +2277,9 @@ defm V_FMA_MIX_F32_BF16 : VOP3P_Realtriple;
 defm V_FMA_MIXLO_BF16 : VOP3P_Realtriple;
 defm V_FMA_MIXHI_BF16 : VOP3P_Realtriple;
 
+let AssemblerPredicate = isGFX1250Plus in
+def : AMDGPUMnemonicAlias<"v_fma_mix_f32_f16", "v_fma_mix_f32">;
+
 defm V_PK_MINIMUM_F16 : VOP3P_Real_gfx12<0x1d>;
 defm V_PK_MAXIMUM_F16 : VOP3P_Real_gfx12<0x1e>;
 
diff --git a/llvm/test/MC/AMDGPU/gfx1250_asm_vop3p_alias.s b/llvm/test/MC/AMDGPU/gfx1250_asm_vop3p_alias.s
new file mode 100644
index 0..8d5c11482f909
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/gfx1250_asm_vop3p_alias.s
@@ -0,0 +1,5 @@
+// NOTE: Assertions have been autogenerated by utils/update_mc_test_checks.py UTC_ARGS: --version 5
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1250 -show-encoding < %s | FileCheck --check-prefix=GFX1250 %s
+
+v_fma_mix_f32_f16 v5, v1, v2, s3
+// GFX1250: v_fma_mix_f32 v5, v1, v2, s3; encoding: [0x05,0x00,0x20,0xcc,0x01,0x05,0x0e,0x00]
[llvm-branch-commits] [clang-tools-extra] d69ea93 - Merge branch 'main' into revert-143441-atomic-control-frontend
Author: Kiran Chandramohan Date: 2025-07-24T20:43:47+01:00 New Revision: d69ea933c6f243a17d37609d4ac29712dd0b20c6 URL: https://github.com/llvm/llvm-project/commit/d69ea933c6f243a17d37609d4ac29712dd0b20c6 DIFF: https://github.com/llvm/llvm-project/commit/d69ea933c6f243a17d37609d4ac29712dd0b20c6.diff LOG: Merge branch 'main' into revert-143441-atomic-control-frontend Added: llvm/test/MC/AMDGPU/gfx1250_asm_vop3p_alias.s Modified: clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp clang-tools-extra/docs/ReleaseNotes.rst clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print-absl.cpp clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print-custom.cpp llvm/lib/Target/AMDGPU/VOP3PInstructions.td Removed: diff --git a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp index 7f4ccca84faa5..e1c1bee97f6d4 100644 --- a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp +++ b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp @@ -207,13 +207,9 @@ FormatStringConverter::FormatStringConverter( ArgsOffset(FormatArgOffset + 1), LangOpts(LO) { assert(ArgsOffset <= NumArgs); FormatExpr = llvm::dyn_cast( - Args[FormatArgOffset]->IgnoreImplicitAsWritten()); + Args[FormatArgOffset]->IgnoreUnlessSpelledInSource()); - if (!FormatExpr || !FormatExpr->isOrdinary()) { -// Function must have a narrow string literal as its first argument. 
-conversionNotPossible("first argument is not a narrow string literal"); -return; - } + assert(FormatExpr && FormatExpr->isOrdinary()); if (const std::optional MaybeMacroName = formatStringContainsUnreplaceableMacro(Call, FormatExpr, SM, PP); diff --git a/clang-tools-extra/docs/ReleaseNotes.rst b/clang-tools-extra/docs/ReleaseNotes.rst index bde4ddec50ff3..cc77a422b97a6 100644 --- a/clang-tools-extra/docs/ReleaseNotes.rst +++ b/clang-tools-extra/docs/ReleaseNotes.rst @@ -124,6 +124,16 @@ Changes in existing checks - Improved :doc:`misc-header-include-cycle ` check performance. +- Improved :doc:`modernize-use-std-format + ` check to correctly match + when the format string is converted to a different type by an implicit + constructor call. + +- Improved :doc:`modernize-use-std-print + ` check to correctly match + when the format string is converted to a different type by an implicit + constructor call. + - Improved :doc:`portability-template-virtual-member-function ` check to avoid false positives on pure virtual member functions. 
diff --git a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp index 7da0bb02ad766..0f3458e61856a 100644 --- a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp +++ b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp @@ -2,7 +2,7 @@ // RUN: -std=c++20 %s modernize-use-std-format %t -- \ // RUN: -config="{CheckOptions: { \ // RUN: modernize-use-std-format.StrictMode: true, \ -// RUN: modernize-use-std-format.StrFormatLikeFunctions: '::strprintf; mynamespace::strprintf2; bad_format_type_strprintf', \ +// RUN: modernize-use-std-format.StrFormatLikeFunctions: '::strprintf; mynamespace::strprintf2; any_format_type_strprintf', \ // RUN: modernize-use-std-format.ReplacementFormatFunction: 'fmt::format', \ // RUN: modernize-use-std-format.FormatHeader: '' \ // RUN:}}" \ @@ -10,7 +10,7 @@ // RUN: %check_clang_tidy -check-suffixes=,NOTSTRICT\ // RUN: -std=c++20 %s modernize-use-std-format %t -- \ // RUN: -config="{CheckOptions: { \ -// RUN: modernize-use-std-format.StrFormatLikeFunctions: '::strprintf; mynamespace::strprintf2; bad_format_type_strprintf', \ +// RUN: modernize-use-std-format.StrFormatLikeFunctions: '::strprintf; mynamespace::strprintf2; any_format_type_strprintf', \ // RUN: modernize-use-std-format.ReplacementFormatFunction: 'fmt::format', \ // RUN: modernize-use-std-format.FormatHeader: '' \ // RUN:}}" \ @@ -56,12 +56,17 @@ std::string A(const std::string &in) struct S { S(...); }; -std::string bad_format_type_strprintf(const S &, ...); +std::string any_format_type_strprintf(const S &, ...); -std::string unsupported
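The release-note entries in this diff say the checks now match when the format string is converted to a different type by an implicit constructor call. The sketch below shows one call of that shape, as an illustration only. `FormatString`, `strprintfLike` and `describeAnswer` are hypothetical names rather than code from the patch or its tests, and the sketch makes no claim about what replacement the check would suggest.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical wrapper type that is implicitly constructible from a literal.
struct FormatString {
  FormatString(const char *Text) : Text(Text) {} // intentionally not explicit
  const char *Text;
};

// Hypothetical printf-style helper whose first parameter is the wrapper type
// rather than a plain const char *.
template <typename... Args>
std::string strprintfLike(const FormatString &Format, Args... Values) {
  // First call measures the output, second call writes it.
  int Length = std::snprintf(nullptr, 0, Format.Text, Values...);
  if (Length < 0)
    return std::string();
  std::string Result(static_cast<std::size_t>(Length) + 1, '\0');
  std::snprintf(&Result[0], Result.size(), Format.Text, Values...);
  Result.resize(static_cast<std::size_t>(Length)); // drop the trailing NUL
  return Result;
}

std::string describeAnswer(int Answer) {
  // The string literal reaches strprintfLike through an implicit
  // FormatString constructor call, the situation the release note describes.
  return strprintfLike("answer=%d", Answer);
}
```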
[llvm-branch-commits] [llvm] Fix FileCheck prefix in the histogram test. (PR #150506)
llvmbot wrote: @llvm/pr-subscribers-pgo Author: Snehasish Kumar (snehasish) Changes The test is fine though it seems the checks weren't being enforced because of the typo. --- Full diff: https://github.com/llvm/llvm-project/pull/150506.diff 1 Files Affected: - (modified) llvm/test/tools/llvm-profdata/memprof-padding-histogram.test (+76-76) ``diff diff --git a/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test b/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test index 79521f3aceb6d..2d0346e7cb259 100644 --- a/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test +++ b/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test @@ -21,79 +21,79 @@ CHECK-NEXT: Offset: 0x{{[[:xdigit:]]+}} CHECK-NEXT: - CHECK: Records: -CHEC-NEXTFunctionGUID: {{[0-9]+}} -CHEC-NEXTAllocSites: -CHEC-NEXT- -CHEC-NEXT Callstack: -CHEC-NEXT - -CHEC-NEXTFunction: {{[0-9]+}} -CHEC-NEXTSymbolName: main -CHEC-NEXTLineOffset: 3 -CHEC-NEXTColumn: 10 -CHEC-NEXTInline: 0 -CHEC-NEXT MemInfoBlock: -CHEC-NEXTAllocCount: 1 -CHEC-NEXTTotalAccessCount: 5 -CHEC-NEXTMinAccessCount: 5 -CHEC-NEXTMaxAccessCount: 5 -CHEC-NEXTTotalSize: 24 -CHEC-NEXTMinSize: 24 -CHEC-NEXTMaxSize: 24 -CHEC-NEXTAllocTimestamp: {{[0-9]+}} -CHEC-NEXTDeallocTimestamp: {{[0-9]+}} -CHEC-NEXTTotalLifetime: 0 -CHEC-NEXTMinLifetime: 0 -CHEC-NEXTMaxLifetime: 0 -CHEC-NEXTAllocCpuId: 11 -CHEC-NEXTDeallocCpuId: 11 -CHEC-NEXTNumMigratedCpu: 0 -CHEC-NEXTNumLifetimeOverlaps: 0 -CHEC-NEXTNumSameAllocCpu: 0 -CHEC-NEXTNumSameDeallocCpu: 0 -CHEC-NEXTDataTypeId: 0 -CHEC-NEXTTotalAccessDensity: 20 -CHEC-NEXTMinAccessDensity: 20 -CHEC-NEXTMaxAccessDensity: 20 -CHEC-NEXTTotalLifetimeAccessDensity: 2 -CHEC-NEXTMinLifetimeAccessDensity: 2 -CHEC-NEXTMaxLifetimeAccessDensity: 2 -CHEC-NEXTAccessHistogramSize: 3 -CHEC-NEXTAccessHistogram: {{[0-9]+}} -CHEC-NEXTAccessHistogramValues: -2 -1 -2 -CHEC-NEXT- -CHEC-NEXT Callstack: -CHEC-NEXT - -CHEC-NEXTFunction: {{[0-9]+}} -CHEC-NEXTSymbolName: main -CHEC-NEXTLineOffset: 10 
-CHEC-NEXTColumn: 10 -CHEC-NEXTInline: 0 -CHEC-NEXT MemInfoBlock: -CHEC-NEXTAllocCount: 1 -CHEC-NEXTTotalAccessCount: 4 -CHEC-NEXTMinAccessCount: 4 -CHEC-NEXTMaxAccessCount: 4 -CHEC-NEXTTotalSize: 48 -CHEC-NEXTMinSize: 48 -CHEC-NEXTMaxSize: 48 -CHEC-NEXTAllocTimestamp: {{[0-9]+}} -CHEC-NEXTDeallocTimestamp: {{[0-9]+}} -CHEC-NEXTTotalLifetime: 0 -CHEC-NEXTMinLifetime: 0 -CHEC-NEXTMaxLifetime: 0 -CHEC-NEXTAllocCpuId: 11 -CHEC-NEXTDeallocCpuId: 11 -CHEC-NEXTNumMigratedCpu: 0 -CHEC-NEXTNumLifetimeOverlaps: 0 -CHEC-NEXTNumSameAllocCpu: 0 -CHEC-NEXTNumSameDeallocCpu: 0 -CHEC-NEXTDataTypeId: 0 -CHEC-NEXTTotalAccessDensity: 8 -CHEC-NEXTMinAccessDensity: 8 -CHEC-NEXTMaxAccessDensity: 8 -CHEC-NEXTTotalLifetimeAccessDensity: 8000 -CHEC-NEXTMinLifetimeAccessDensity: 8000 -CHEC-NEXTMaxLifetimeAccessDensity: 8000 -CHEC-NEXTAccessHistogramSize: 6 -CHEC-NEXTAccessHistogram: {{[0-9]+}} -CHEC-NEXTAccessHistogramValues: -2 -0 -0 -0 -1 -1 +CHECK-NEXTFunctionGUID: {{[0-9]+}} +CHECK-NEXTAllocSites: +CHECK-NEXT- +CHECK-NEXT Callstack: +CHECK-NEXT - +CHECK-NEXTFunction: {{[0-9]+}} +CHECK-NEXTSymbolName: main +CHECK-NEXTLineOffset: 3 +CHECK-NEXTColumn: 10 +CHECK-NEXTInline: 0 +CHECK-NEXT MemInfoBlock: +CHECK-NEXTAllocCount: 1 +CHECK-NEXTTotalAccessCount: 5 +CHECK-NEXTMinAccessCount: 5 +CHECK-NEXTMaxAccessCount: 5 +CHECK-NEXTTotalSize: 24 +CHECK-NEXTMinSize: 24 +CHECK-NEXTMaxSize: 24 +CHECK-NEXTAllocTimestamp: {{[0-9]+}} +CHECK-NEXTDeallocTimestamp: {{[0-9]+}} +CHECK-NEXTTotalLifetime: 0 +CHECK-NEXTMinLifetime: 0 +CHECK-NEXTMaxLifetime: 0 +CHECK-NEXTAllocCpuId: 11 +CHECK-NEXTDeallocCpuId: 11 +CHECK-NEXTNumMigratedCpu: 0 +CHECK-NEXTNumLifetimeOverlaps: 0 +CHECK-NEXTNumSameAllocCpu: 0 +CHECK-NEXTNumSameDeallocCpu: 0 +CHECK-NEXTDataTypeId: 0 +CHECK-NEXTTotalAccessDensity: 20 +CHECK-NEXTMinAccessDensity: 20 +CHECK-NEXTMaxAccessDensity: 20 +CHECK-NEXTTotalLifetimeAccessDensity: 2 +CHECK-NEXTMinLifetimeAccessDensity: 2 +CHECK-NEXTMaxLifetimeAccessDensity: 2 +CHECK-NEXTAccessHistogramSize: 3 
+CHECK-NEXTAccessHistogram: {{[0-9]+}} +CHECK-NEXTAccessHistogramValues: -2 -1 -2 +CHECK-NE
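The PR description says the checks were not being enforced because of the typo. A deliberately simplified model of that effect is sketched below: a line only counts as a directive when it starts with the configured prefix. This is an assumption-laden toy, not FileCheck's real matching logic (which, among other things, also finds directives in the middle of a line), and `isDirective` and `countDirectives` are hypothetical helpers.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>
#include <vector>

// Toy model: a line is a directive if it begins with Prefix followed by ':'
// or by one of a few well-known suffixes and ':'.
bool isDirective(const std::string &Line, const std::string &Prefix) {
  if (Line.compare(0, Prefix.size(), Prefix) != 0)
    return false;
  const std::string Rest = Line.substr(Prefix.size());
  for (const char *Suffix :
       {":", "-NEXT:", "-NOT:", "-SAME:", "-DAG:", "-LABEL:", "-EMPTY:"}) {
    if (Rest.rfind(Suffix, 0) == 0) // Rest starts with Suffix
      return true;
  }
  return false;
}

// Counts how many lines of a test file would be treated as checks.
std::size_t countDirectives(const std::vector<std::string> &Lines,
                            const std::string &Prefix) {
  std::size_t Count = 0;
  for (const std::string &Line : Lines)
    if (isDirective(Line, Prefix))
      ++Count;
  return Count;
}
```

In this model, lines beginning with `CHEC-NEXT:` are not counted under the prefix `CHECK`, so they are silently treated as ordinary text; after the rename to `CHECK-NEXT:` they are counted.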
[llvm-branch-commits] [llvm] [BOLT] Require CFG in BAT mode (PR #150488)
https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/150488
[llvm-branch-commits] [llvm] [LoopPeel] Fix branch weights' effect on block frequencies (PR #128785)
https://github.com/jdenny-ornl edited https://github.com/llvm/llvm-project/pull/128785
[llvm-branch-commits] [clang-tools-extra] [clang-doc] generate comments for functions (PR #150468)
https://github.com/evelez7 updated https://github.com/llvm/llvm-project/pull/150468 >From b388252f5857e5004cfd26ab05037f13df66657b Mon Sep 17 00:00:00 2001 From: Erick Velez Date: Fri, 18 Jul 2025 13:03:07 -0700 Subject: [PATCH] [clang-doc] generate comments for functions Change the function partial to enable comments to be generated for functions. This only enables the brief comments in the basic project. --- .../assets/function-template.mustache | 4 +- .../clang-doc/basic-project.mustache.test | 302 +- 2 files changed, 153 insertions(+), 153 deletions(-) diff --git a/clang-tools-extra/clang-doc/assets/function-template.mustache b/clang-tools-extra/clang-doc/assets/function-template.mustache index 6683afa03ea43..2510a4de2cd68 100644 --- a/clang-tools-extra/clang-doc/assets/function-template.mustache +++ b/clang-tools-extra/clang-doc/assets/function-template.mustache @@ -14,10 +14,10 @@ {{! Function Comments }} -{{#FunctionComments}} +{{#Description}} {{>Comments}} -{{/FunctionComments}} +{{/Description}} diff --git a/clang-tools-extra/test/clang-doc/basic-project.mustache.test b/clang-tools-extra/test/clang-doc/basic-project.mustache.test index 7cc32b9d8f08a..4cf8bad32fd9d 100644 --- a/clang-tools-extra/test/clang-doc/basic-project.mustache.test +++ b/clang-tools-extra/test/clang-doc/basic-project.mustache.test @@ -83,17 +83,17 @@ HTML-SHAPE: HTML-SHAPE: double area () HTML-SHAPE: HTML-SHAPE: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: Calculates the area of the shape. -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: Calculates the area of the shape. 
+HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: HTML-SHAPE: HTML-SHAPE: HTML-SHAPE: @@ -103,17 +103,17 @@ HTML-SHAPE: HTML-SHAPE: double perimeter () HTML-SHAPE: HTML-SHAPE: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: Calculates the perimeter of the shape. -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: Calculates the perimeter of the shape. +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: HTML-SHAPE: HTML-SHAPE: HTML-SHAPE: @@ -123,14 +123,14 @@ HTML-SHAPE: HTML-SHAPE: void ~Shape () HTML-SHAPE: HTML-SHAPE: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: Virtual destructor. -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: -HTML-SHAPE-NOT: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: Virtual destructor. +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: +HTML-SHAPE: HTML-SHAPE: HTML-SHAPE: HTML-SHAPE: @@ -250,17 +250,17 @@ HTML-CALC: HTML-CALC: int add (int a, int b) HTML-CALC: HTML-CALC: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: Adds two integers. -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT:
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Precommit param comment test changes (PR #150469)
https://github.com/evelez7 updated https://github.com/llvm/llvm-project/pull/150469 >From 6f213799caf42bb3ba0c00822cef55a2e2948cb4 Mon Sep 17 00:00:00 2001 From: Erick Velez Date: Tue, 22 Jul 2025 21:49:57 -0700 Subject: [PATCH] [clang-doc] Precommit param comment test changes --- .../clang-doc/basic-project.mustache.test | 92 ++- 1 file changed, 90 insertions(+), 2 deletions(-) diff --git a/clang-tools-extra/test/clang-doc/basic-project.mustache.test b/clang-tools-extra/test/clang-doc/basic-project.mustache.test index 4cf8bad32fd9d..807ba1319e393 100644 --- a/clang-tools-extra/test/clang-doc/basic-project.mustache.test +++ b/clang-tools-extra/test/clang-doc/basic-project.mustache.test @@ -259,7 +259,24 @@ HTML-CALC: HTML-CALC: HTML-CALC: HTML-CALC: -HTML-CALC: +HTML-CALC-NOT: +HTML-CALC-NOT:Parameters +HTML-CALC-NOT: +HTML-CALC-NOT:a +HTML-CALC-NOT: First integer. +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT:b +HTML-CALC-NOT: Second integer. +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: HTML-CALC: HTML-CALC: HTML-CALC: @@ -299,7 +316,24 @@ HTML-CALC: HTML-CALC: HTML-CALC: HTML-CALC: -HTML-CALC: +HTML-CALC-NOT: +HTML-CALC-NOT:Parameters +HTML-CALC-NOT: +HTML-CALC-NOT:a +HTML-CALC-NOT: First integer. +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT:b +HTML-CALC-NOT: Second integer. +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: HTML-CALC: HTML-CALC: HTML-CALC: @@ -319,6 +353,23 @@ HTML-CALC: HTML-CALC: HTML-CALC: HTML-CALC: +HTML-CALC-NOT: +HTML-CALC-NOT:Parameters +HTML-CALC-NOT: +HTML-CALC-NOT:a +HTML-CALC-NOT: First integer. +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT:b +HTML-CALC-NOT: Second integer. 
+HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: HTML-CALC: HTML-CALC: HTML-CALC: @@ -339,6 +390,23 @@ HTML-CALC: HTML-CALC: HTML-CALC: HTML-CALC: +HTML-CALC-NOT: +HTML-CALC-NOT:Parameters +HTML-CALC-NOT: +HTML-CALC-NOT:a +HTML-CALC-NOT: First integer. +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT:b +HTML-CALC-NOT: Second integer. +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: +HTML-CALC-NOT: HTML-CALC: HTML-CALC: HTML-CALC: @@ -438,6 +506,20 @@ HTML-RECTANGLE: HTML-RECTANGLE: HTML-RECTANGLE: HTML-RECTANGLE: +HTML-RECTANGLE-NOT: +HTML-RECTANGLE-NOT:Parameters +HTML-RECTANGLE-NOT: +HTML-RECTANGLE-NOT:width +HTML-RECTANGLE-NOT: Width of the rectan
[llvm-branch-commits] [clang-tools-extra] [clang-doc] add param comments to comment template (PR #150470)
https://github.com/evelez7 updated https://github.com/llvm/llvm-project/pull/150470 >From 98172493abfb2c93caefe2424dd17b93d32c17a0 Mon Sep 17 00:00:00 2001 From: Erick Velez Date: Tue, 22 Jul 2025 21:15:36 -0700 Subject: [PATCH] [clang-doc] add param comments to comment template --- clang-tools-extra/clang-doc/JSONGenerator.cpp | 6 +- .../assets/comment-template.mustache | 8 + .../clang-doc/basic-project.mustache.test | 180 +- 3 files changed, 102 insertions(+), 92 deletions(-) diff --git a/clang-tools-extra/clang-doc/JSONGenerator.cpp b/clang-tools-extra/clang-doc/JSONGenerator.cpp index 92a4117c4e534..5fc28406ee870 100644 --- a/clang-tools-extra/clang-doc/JSONGenerator.cpp +++ b/clang-tools-extra/clang-doc/JSONGenerator.cpp @@ -147,8 +147,10 @@ static Object serializeComment(const CommentInfo &I, Object &Description) { Child.insert({"ParamName", I.ParamName}); Child.insert({"Direction", I.Direction}); Child.insert({"Explicit", I.Explicit}); -Child.insert({"Children", ChildArr}); -Obj.insert({commentKindToString(I.Kind), ChildVal}); +auto TextCommentsArray = extractTextComments(CARef.front().getAsObject()); +Child.insert({"Children", TextCommentsArray}); +if (I.Kind == CommentKind::CK_ParamCommandComment) + insertComment(Description, ChildVal, "ParamComments"); return Obj; } diff --git a/clang-tools-extra/clang-doc/assets/comment-template.mustache b/clang-tools-extra/clang-doc/assets/comment-template.mustache index f2edb1b2eb9ac..d55a53194ee5c 100644 --- a/clang-tools-extra/clang-doc/assets/comment-template.mustache +++ b/clang-tools-extra/clang-doc/assets/comment-template.mustache @@ -24,6 +24,14 @@ {{>Comments}} {{/Children}} {{/ParagraphComment}} +{{#HasParamComments}} +Parameters +{{#ParamComments}} + +{{ParamName}} {{#Explicit}}{{Direction}}{{/Explicit}} {{#Children}}{{>Comments}}{{/Children}} + +{{/ParamComments}} +{{/HasParamComments}} {{#BlockCommandComment}} diff --git a/clang-tools-extra/test/clang-doc/basic-project.mustache.test 
b/clang-tools-extra/test/clang-doc/basic-project.mustache.test index 807ba1319e393..b55e0abe2cdef 100644 --- a/clang-tools-extra/test/clang-doc/basic-project.mustache.test +++ b/clang-tools-extra/test/clang-doc/basic-project.mustache.test @@ -259,24 +259,24 @@ HTML-CALC: HTML-CALC: HTML-CALC: HTML-CALC: -HTML-CALC-NOT: -HTML-CALC-NOT:Parameters -HTML-CALC-NOT: -HTML-CALC-NOT:a -HTML-CALC-NOT: First integer. -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT:b -HTML-CALC-NOT: Second integer. -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: +HTML-CALC: +HTML-CALC:Parameters +HTML-CALC: +HTML-CALC:a +HTML-CALC: First integer. +HTML-CALC: +HTML-CALC: +HTML-CALC: +HTML-CALC: +HTML-CALC: +HTML-CALC: +HTML-CALC:b +HTML-CALC: Second integer. +HTML-CALC: +HTML-CALC: +HTML-CALC: +HTML-CALC: +HTML-CALC: HTML-CALC: HTML-CALC: HTML-CALC: @@ -316,24 +316,24 @@ HTML-CALC: HTML-CALC: HTML-CALC: HTML-CALC: -HTML-CALC-NOT: -HTML-CALC-NOT:Parameters -HTML-CALC-NOT: -HTML-CALC-NOT:a -HTML-CALC-NOT: First integer. -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT:b -HTML-CALC-NOT: Second integer. -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: -HTML-CALC-NOT: +HTML-CALC: +HTML-CALC:Parameters +HTML-CALC: +HTML-CALC:
[llvm-branch-commits] [llvm] release/21.x: [X86] getTargetConstantBitsFromNode - early-out if the element bitsize doesn't align with the source bitsize (#150184) (PR #150478)
phoebewang wrote: > @phoebewang What do you think about merging this PR to the release branch? LGTM. https://github.com/llvm/llvm-project/pull/150478
[llvm-branch-commits] [llvm] release/21.x: [X86] getTargetConstantBitsFromNode - early-out if the element bitsize doesn't align with the source bitsize (#150184) (PR #150478)
https://github.com/phoebewang approved this pull request. https://github.com/llvm/llvm-project/pull/150478
[llvm-branch-commits] [llvm] release/21.x: [X86] Fix misassemble due to not storing registers to state machine on RParen (#150252) (PR #150402)
https://github.com/phoebewang approved this pull request. https://github.com/llvm/llvm-project/pull/150402
[llvm-branch-commits] [llvm] Fix FileCheck prefix in the histogram test. (PR #150506)
https://github.com/snehasish created https://github.com/llvm/llvm-project/pull/150506 None >From f57f3845aa1a6f03a605096e57e5345ebf3131b5 Mon Sep 17 00:00:00 2001 From: Snehasish Kumar Date: Thu, 24 Jul 2025 06:25:00 + Subject: [PATCH] Fix FileCheck prefix in the histogram test. --- .../memprof-padding-histogram.test| 152 +- 1 file changed, 76 insertions(+), 76 deletions(-) diff --git a/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test b/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test index 79521f3aceb6d..2d0346e7cb259 100644 --- a/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test +++ b/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test @@ -21,79 +21,79 @@ CHECK-NEXT: Offset: 0x{{[[:xdigit:]]+}} CHECK-NEXT: - CHECK: Records: -CHEC-NEXTFunctionGUID: {{[0-9]+}} -CHEC-NEXTAllocSites: -CHEC-NEXT- -CHEC-NEXT Callstack: -CHEC-NEXT - -CHEC-NEXTFunction: {{[0-9]+}} -CHEC-NEXTSymbolName: main -CHEC-NEXTLineOffset: 3 -CHEC-NEXTColumn: 10 -CHEC-NEXTInline: 0 -CHEC-NEXT MemInfoBlock: -CHEC-NEXTAllocCount: 1 -CHEC-NEXTTotalAccessCount: 5 -CHEC-NEXTMinAccessCount: 5 -CHEC-NEXTMaxAccessCount: 5 -CHEC-NEXTTotalSize: 24 -CHEC-NEXTMinSize: 24 -CHEC-NEXTMaxSize: 24 -CHEC-NEXTAllocTimestamp: {{[0-9]+}} -CHEC-NEXTDeallocTimestamp: {{[0-9]+}} -CHEC-NEXTTotalLifetime: 0 -CHEC-NEXTMinLifetime: 0 -CHEC-NEXTMaxLifetime: 0 -CHEC-NEXTAllocCpuId: 11 -CHEC-NEXTDeallocCpuId: 11 -CHEC-NEXTNumMigratedCpu: 0 -CHEC-NEXTNumLifetimeOverlaps: 0 -CHEC-NEXTNumSameAllocCpu: 0 -CHEC-NEXTNumSameDeallocCpu: 0 -CHEC-NEXTDataTypeId: 0 -CHEC-NEXTTotalAccessDensity: 20 -CHEC-NEXTMinAccessDensity: 20 -CHEC-NEXTMaxAccessDensity: 20 -CHEC-NEXTTotalLifetimeAccessDensity: 2 -CHEC-NEXTMinLifetimeAccessDensity: 2 -CHEC-NEXTMaxLifetimeAccessDensity: 2 -CHEC-NEXTAccessHistogramSize: 3 -CHEC-NEXTAccessHistogram: {{[0-9]+}} -CHEC-NEXTAccessHistogramValues: -2 -1 -2 -CHEC-NEXT- -CHEC-NEXT Callstack: -CHEC-NEXT - -CHEC-NEXTFunction: {{[0-9]+}} -CHEC-NEXTSymbolName: main 
-CHEC-NEXTLineOffset: 10 -CHEC-NEXTColumn: 10 -CHEC-NEXTInline: 0 -CHEC-NEXT MemInfoBlock: -CHEC-NEXTAllocCount: 1 -CHEC-NEXTTotalAccessCount: 4 -CHEC-NEXTMinAccessCount: 4 -CHEC-NEXTMaxAccessCount: 4 -CHEC-NEXTTotalSize: 48 -CHEC-NEXTMinSize: 48 -CHEC-NEXTMaxSize: 48 -CHEC-NEXTAllocTimestamp: {{[0-9]+}} -CHEC-NEXTDeallocTimestamp: {{[0-9]+}} -CHEC-NEXTTotalLifetime: 0 -CHEC-NEXTMinLifetime: 0 -CHEC-NEXTMaxLifetime: 0 -CHEC-NEXTAllocCpuId: 11 -CHEC-NEXTDeallocCpuId: 11 -CHEC-NEXTNumMigratedCpu: 0 -CHEC-NEXTNumLifetimeOverlaps: 0 -CHEC-NEXTNumSameAllocCpu: 0 -CHEC-NEXTNumSameDeallocCpu: 0 -CHEC-NEXTDataTypeId: 0 -CHEC-NEXTTotalAccessDensity: 8 -CHEC-NEXTMinAccessDensity: 8 -CHEC-NEXTMaxAccessDensity: 8 -CHEC-NEXTTotalLifetimeAccessDensity: 8000 -CHEC-NEXTMinLifetimeAccessDensity: 8000 -CHEC-NEXTMaxLifetimeAccessDensity: 8000 -CHEC-NEXTAccessHistogramSize: 6 -CHEC-NEXTAccessHistogram: {{[0-9]+}} -CHEC-NEXTAccessHistogramValues: -2 -0 -0 -0 -1 -1 +CHECK-NEXTFunctionGUID: {{[0-9]+}} +CHECK-NEXTAllocSites: +CHECK-NEXT- +CHECK-NEXT Callstack: +CHECK-NEXT - +CHECK-NEXTFunction: {{[0-9]+}} +CHECK-NEXTSymbolName: main +CHECK-NEXTLineOffset: 3 +CHECK-NEXTColumn: 10 +CHECK-NEXTInline: 0 +CHECK-NEXT MemInfoBlock: +CHECK-NEXTAllocCount: 1 +CHECK-NEXTTotalAccessCount: 5 +CHECK-NEXTMinAccessCount: 5 +CHECK-NEXTMaxAccessCount: 5 +CHECK-NEXTTotalSize: 24 +CHECK-NEXTMinSize: 24 +CHECK-NEXTMaxSize: 24 +CHECK-NEXTAllocTimestamp: {{[0-9]+}} +CHECK-NEXTDeallocTimestamp: {{[0-9]+}} +CHECK-NEXTTotalLifetime: 0 +CHECK-NEXTMinLifetime: 0 +CHECK-NEXTMaxLifetime: 0 +CHECK-NEXTAllocCpuId: 11 +CHECK-NEXTDeallocCpuId: 11 +CHECK-NEXTNumMigratedCpu: 0 +CHECK-NEXTNumLifetimeOverlaps: 0 +CHECK-NEXTNumSameAllocCpu: 0 +CHECK-NEXTNumSameDeallocCpu: 0 +CHECK-NEXTDataTypeId: 0 +CHECK-NEXTTotalAccessDensity: 20 +CHECK-NEXTMinAccessDensity: 20 +CHECK-NEXTMaxAccessDensity: 20 +CHECK-NEXTTotalLifetimeAccessDensity: 2 +CHECK-NEXTMinLifetimeAccessDensity: 2 +CHECK-NEXTMaxLifetimeAccessDensity: 2 
+CHECK-NEXTAccessHistogramSize: 3 +CHECK-NEXTAccessHistogram: {{[0-9]+}} +CHECK-NEXT
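Note on the patch above: FileCheck only acts on lines that contain one of its configured check prefixes (`CHECK` by default), so a misspelled prefix such as `CHEC-NEXT` is treated as ordinary text and the expectations after `CHECK: Records:` were never verified. The following is a minimal C++ sketch of that behavior, not FileCheck's actual implementation. The function name `isCheckDirective`, the reduced suffix list, and the example strings (reconstructed with a `: ` separator) are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Simplified model of directive recognition. Real FileCheck supports more
// suffixes (for example COUNT) and applies stricter word-boundary rules.
inline bool isCheckDirective(std::string_view Line, std::string_view Prefix) {
  assert(!Prefix.empty() && "a check prefix must be non-empty");
  static constexpr std::string_view Suffixes[] = {
      "", "-NEXT", "-SAME", "-NOT", "-DAG", "-LABEL", "-EMPTY"};
  for (std::string_view Suffix : Suffixes) {
    // Build "<Prefix><Suffix>:" and look for it anywhere in the line.
    std::string Needle(Prefix);
    Needle.append(Suffix.data(), Suffix.size());
    Needle.push_back(':');
    if (Line.find(std::string_view(Needle)) != std::string_view::npos)
      return true;
  }
  return false;
}
```

Under this model a `CHEC-NEXT:` line is never reported as a directive for the `CHECK` prefix, which is why the test passed without checking those lines.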
[llvm-branch-commits] [clang-tools-extra] 60bf979 - [clang-tidy] modernize-use-std-print, format: Fix checks with Abseil functions (#142312)
Author: Mike Crowe Date: 2025-07-24T22:40:41+03:00 New Revision: 60bf97983df3efeb17f6db19b811b68fa74df9aa URL: https://github.com/llvm/llvm-project/commit/60bf97983df3efeb17f6db19b811b68fa74df9aa DIFF: https://github.com/llvm/llvm-project/commit/60bf97983df3efeb17f6db19b811b68fa74df9aa.diff LOG: [clang-tidy] modernize-use-std-print,format: Fix checks with Abseil functions (#142312) These checks previously failed with absl::StrFormat and absl::PrintF etc. with: Unable to use 'std::format' instead of 'StrFormat' because first argument is not a narrow string literal [modernize-use-std-format] because FormatStringConverter was rejecting the format string if it had already converted into a different type. Fix the tests so that they check this case properly by accepting string_view rather than const char * and fix the check so that these tests pass. Update the existing tests that checked for the error message that can no longer happen. Fixes: https://github.com/llvm/llvm-project/issues/129484 Added: Modified: clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp clang-tools-extra/docs/ReleaseNotes.rst clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print-absl.cpp clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print-custom.cpp Removed: diff --git a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp index 7f4ccca84faa5..e1c1bee97f6d4 100644 --- a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp +++ b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp @@ -207,13 +207,9 @@ FormatStringConverter::FormatStringConverter( ArgsOffset(FormatArgOffset + 1), LangOpts(LO) { assert(ArgsOffset <= NumArgs); FormatExpr = llvm::dyn_cast( - Args[FormatArgOffset]->IgnoreImplicitAsWritten()); + 
Args[FormatArgOffset]->IgnoreUnlessSpelledInSource()); - if (!FormatExpr || !FormatExpr->isOrdinary()) { -// Function must have a narrow string literal as its first argument. -conversionNotPossible("first argument is not a narrow string literal"); -return; - } + assert(FormatExpr && FormatExpr->isOrdinary()); if (const std::optional MaybeMacroName = formatStringContainsUnreplaceableMacro(Call, FormatExpr, SM, PP); diff --git a/clang-tools-extra/docs/ReleaseNotes.rst b/clang-tools-extra/docs/ReleaseNotes.rst index bde4ddec50ff3..cc77a422b97a6 100644 --- a/clang-tools-extra/docs/ReleaseNotes.rst +++ b/clang-tools-extra/docs/ReleaseNotes.rst @@ -124,6 +124,16 @@ Changes in existing checks - Improved :doc:`misc-header-include-cycle ` check performance. +- Improved :doc:`modernize-use-std-format + ` check to correctly match + when the format string is converted to a different type by an implicit + constructor call. + +- Improved :doc:`modernize-use-std-print + ` check to correctly match + when the format string is converted to a different type by an implicit + constructor call. + - Improved :doc:`portability-template-virtual-member-function ` check to avoid false positives on pure virtual member functions.
diff --git a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp index 7da0bb02ad766..0f3458e61856a 100644 --- a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp +++ b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp @@ -2,7 +2,7 @@ // RUN: -std=c++20 %s modernize-use-std-format %t -- \ // RUN: -config="{CheckOptions: { \ // RUN: modernize-use-std-format.StrictMode: true, \ -// RUN: modernize-use-std-format.StrFormatLikeFunctions: '::strprintf; mynamespace::strprintf2; bad_format_type_strprintf', \ +// RUN: modernize-use-std-format.StrFormatLikeFunctions: '::strprintf; mynamespace::strprintf2; any_format_type_strprintf', \ // RUN: modernize-use-std-format.ReplacementFormatFunction: 'fmt::format', \ // RUN: modernize-use-std-format.FormatHeader: '' \ // RUN:}}" \ @@ -10,7 +10,7 @@ // RUN: %check_clang_tidy -check-suffixes=,NOTSTRICT\ // RUN: -std=c++20 %s modernize-use-std-format %t -- \ // RUN: -config="{CheckOptions: { \ -// RUN: modernize-use-std-format.StrFormatLikeFunctions: '::strprintf; mynamespace::strprintf2; bad_format_type_strprintf', \ +// RUN: modernize-use-std-format.S
[llvm-branch-commits] [clang-tools-extra] [clang-doc] generate comments for functions (PR #150468)
https://github.com/evelez7 closed https://github.com/llvm/llvm-project/pull/150468
[llvm-branch-commits] [compiler-rt] [llvm] Write out raw profile bytes in little endian. (PR #150375)
@@ -23,7 +20,16 @@ using ::llvm::memprof::encodeHistogramCount;
 namespace {
 template <class T> char *WriteBytes(const T &Pod, char *Buffer) {
-  *(T *)Buffer = Pod;
+  static_assert(is_trivially_copyable<T>::value, "T must be POD");
+  const uint8_t *Src = reinterpret_cast<const uint8_t *>(&Pod);
+  for (size_t I = 0; I < sizeof(T); ++I) {
+    Buffer[I] = Src[I];
+  }
+#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  for (size_t i = 0; i < sizeof(T) / 2; ++i) {
+    std::swap(buffer[i], buffer[sizeof(T) - 1 - i]);

teresajohnson wrote: alternatively, copy in from Src above in the current direction if little endian, and in reverse order if big endian (rather than copy and swap in the BE case)?

https://github.com/llvm/llvm-project/pull/150375
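The alternative raised in this review comment (copy forward on a little-endian host, copy in reverse on a big-endian host, with no separate swap pass) could look roughly like the sketch below. It is an illustration only, not the code in the PR: the names `writeBytesLE` and `hostIsLittleEndian` are hypothetical, endianness is detected at run time rather than with the `__BYTE_ORDER__` macros used in the patch, and reversing every byte is only meaningful when `T` is a single scalar rather than a struct with several fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: true when the host stores the least significant byte
// of an integer at the lowest address.
inline bool hostIsLittleEndian() {
  const std::uint16_t Probe = 1;
  unsigned char First = 0;
  std::memcpy(&First, &Probe, 1);
  return First == 1;
}

// Writes Pod to Buffer in little-endian byte order in a single pass and
// returns the position just past the written bytes.
template <class T> char *writeBytesLE(const T &Pod, char *Buffer) {
  static_assert(std::is_trivially_copyable<T>::value,
                "T must be trivially copyable");
  assert(Buffer != nullptr);
  unsigned char Src[sizeof(T)];
  std::memcpy(Src, &Pod, sizeof(T));
  const bool IsLE = hostIsLittleEndian();
  for (std::size_t I = 0; I < sizeof(T); ++I) {
    // Forward copy on little-endian hosts, reversed copy on big-endian hosts.
    Buffer[I] = static_cast<char>(IsLE ? Src[I] : Src[sizeof(T) - 1 - I]);
  }
  return Buffer + sizeof(T);
}
```

Either way the output buffer holds the least significant byte first, so a reader can decode the raw profile without knowing the endianness of the machine that wrote it.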
[llvm-branch-commits] [compiler-rt] [llvm] Write out raw profile bytes in little endian. (PR #150375)
@@ -23,7 +20,16 @@ using ::llvm::memprof::encodeHistogramCount;
 namespace {
 template <class T> char *WriteBytes(const T &Pod, char *Buffer) {
-  *(T *)Buffer = Pod;
+  static_assert(is_trivially_copyable<T>::value, "T must be POD");
+  const uint8_t *Src = reinterpret_cast<const uint8_t *>(&Pod);
+  for (size_t I = 0; I < sizeof(T); ++I) {
+    Buffer[I] = Src[I];
+  }
+#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  for (size_t i = 0; i < sizeof(T) / 2; ++i) {
+    std::swap(buffer[i], buffer[sizeof(T) - 1 - i]);

teresajohnson wrote: buffer should be Buffer?

https://github.com/llvm/llvm-project/pull/150375
[llvm-branch-commits] [clang-tools-extra] [clang-doc] add param comments to comment template (PR #150470)
https://github.com/evelez7 closed https://github.com/llvm/llvm-project/pull/150470
[llvm-branch-commits] [clang-tools-extra] [clang-doc] add param comments to comment template (PR #150470)
https://github.com/evelez7 reopened https://github.com/llvm/llvm-project/pull/150470
[llvm-branch-commits] [clang-tools-extra] [clang-doc] generate comments for functions (PR #150468)
https://github.com/evelez7 closed https://github.com/llvm/llvm-project/pull/150468
[llvm-branch-commits] [clang-tools-extra] [clang-doc] add param comments to comment template (PR #150470)
https://github.com/evelez7 closed https://github.com/llvm/llvm-project/pull/150470
[llvm-branch-commits] [clang-tools-extra] [clang-doc] Precommit param comment test changes (PR #150469)
https://github.com/evelez7 closed https://github.com/llvm/llvm-project/pull/150469
[llvm-branch-commits] [clang-tools-extra] [clang-doc] generate comments for functions (PR #150468)
https://github.com/evelez7 reopened https://github.com/llvm/llvm-project/pull/150468
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
https://github.com/adam-smnk approved this pull request. Looks good, great change 👍 https://github.com/llvm/llvm-project/pull/149624
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
@@ -150,9 +150,15 @@ def Linalg_PackOp : Linalg_RelayoutOp<"pack", [
     `padding_value` specifies a padding value at the boundary on non-perfectly
     divisible dimensions. Padding is optional:
-    - If absent, it is UB if the tile does not perfectly divide the dimension.
+    - If absent, it is assumed that for all inner tiles,
+      `shape(source)[inner_dims_pos[i]] % inner_tiles[i] == 0`, i.e. all inner
+      tiles divide perfectly the corresponding outer dimension in the result
+      tensor.
     - If present, it will pad along high dimensions (high-padding) to make the
-      tile complete.
+      tile complete. Note that it is not allowed to have artificial padding that
+      is not strictly required by linalg.pack (i.e., padding past what is needed
+      to complete the last tile along each packed dimension). It is UB if extra
+      padding is requested.

adam-smnk wrote:

> Shouldn't that be verification error?

It's not possible to enforce that with dynamic source. `UB` is more of "catch all" here and allows `linalg::lowerPack` to remain as is.

> restore UB for the previous point

It could remain there to reinforce the message. But no strong preference here.

https://github.com/llvm/llvm-project/pull/149624
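For static shapes, the padding rule discussed in this thread can be sketched as a per-dimension check. This is an illustration under stated assumptions, not MLIR's verifier: `PackDimStatus`, `ceilDiv`, and `classifyPackedDim` are hypothetical names, all sizes are assumed to be known positive integers, and the dynamic-shape case mentioned in the comment (where the condition cannot be enforced statically) is out of scope.

```cpp
#include <cassert>
#include <cstdint>

// Outcome of checking one packed dimension.
enum class PackDimStatus { Ok, NeedsPaddingValue, ArtificialPadding, TooSmall };

// Number of tiles needed to cover Dim elements with tiles of size Tile.
constexpr std::int64_t ceilDiv(std::int64_t Dim, std::int64_t Tile) {
  return (Dim + Tile - 1) / Tile;
}

// SrcDim: shape(source)[inner_dims_pos[i]]; Tile: inner_tiles[i];
// OuterDim: the matching outer dimension of the packed result.
inline PackDimStatus classifyPackedDim(std::int64_t SrcDim, std::int64_t Tile,
                                       std::int64_t OuterDim,
                                       bool HasPaddingValue) {
  assert(SrcDim > 0 && Tile > 0 && OuterDim > 0);
  const std::int64_t Required = ceilDiv(SrcDim, Tile);
  if (OuterDim > Required)
    return PackDimStatus::ArtificialPadding; // more tiles than needed
  if (OuterDim < Required)
    return PackDimStatus::TooSmall; // result cannot hold the source
  if (SrcDim % Tile != 0 && !HasPaddingValue)
    return PackDimStatus::NeedsPaddingValue; // last tile is incomplete
  return PackDimStatus::Ok;
}
```

For example, a source dimension of 15 with tile size 8 needs exactly 2 tiles: a result with 2 tiles and a padding value is fine, while a result with 3 tiles would be the "artificial padding" case that the new wording disallows.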
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
@@ -4717,6 +4697,12 @@ static LogicalResult commonVerifierPackAndUnPackOp(OpTy packOrUnPack) {
     return op->emitError("mismatch in inner tile sizes specified and shaped of "
                          "tiled dimension in the packed type");
   }
+  if (failed(verifyCompatibleShape(expectedPackedType.getShape(),
+                                   packedType.getShape()))) {
+    return op->emitError("expected ")
+           << expectedPackedType << " for the unpacked domain value, got "

adam-smnk wrote: nit: I think it should be `packed domain` - result for `pack`, input for `unpack`

https://github.com/llvm/llvm-project/pull/149624
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
https://github.com/adam-smnk edited https://github.com/llvm/llvm-project/pull/149624
[llvm-branch-commits] [CI] Test All Projects On Workflow Changes (PR #150250)
https://github.com/boomanaiden154 updated https://github.com/llvm/llvm-project/pull/150250
[llvm-branch-commits] [CI] Run All Tests When Changing third-party (PR #150251)
https://github.com/boomanaiden154 updated https://github.com/llvm/llvm-project/pull/150251
[llvm-branch-commits] [flang] release/21.x: [flang][OpenMP] Avoid analyzing assumed-size array bases (#150324) (PR #150411)
https://github.com/tblah approved this pull request. https://github.com/llvm/llvm-project/pull/150411
[llvm-branch-commits] [llvm] [CodeGen] Prevent register coalescer rematerialization based on target (PR #148430)
https://github.com/tomershafir closed https://github.com/llvm/llvm-project/pull/148430
[llvm-branch-commits] [llvm] [CodeGen] Add 2 subtarget hooks canLowerToZeroCycleReg[Move|Zeroing] (PR #148428)
https://github.com/tomershafir closed https://github.com/llvm/llvm-project/pull/148428
[llvm-branch-commits] [llvm] [CodeGen] Add 2 subtarget hooks canLowerToZeroCycleReg[Move|Zeroing] (PR #148428)
tomershafir wrote: Retreating back to a single commit patch for all of the changes, as the stacked PR is hard to operate. https://github.com/llvm/llvm-project/pull/148428
[llvm-branch-commits] [CodeGen] Add target hook shouldReMaterializeTrivialRegDef (PR #148429)
tomershafir wrote: Retreating back to a single commit patch for all of the changes, as the stacked PR is hard to operate. https://github.com/llvm/llvm-project/pull/148429
[llvm-branch-commits] [CodeGen] Add target hook shouldReMaterializeTrivialRegDef (PR #148429)
https://github.com/tomershafir closed https://github.com/llvm/llvm-project/pull/148429
[llvm-branch-commits] [llvm] [CodeGen] Prevent register coalescer rematerialization based on target (PR #148430)
tomershafir wrote: Retreating back to a single commit patch for all of the changes, as the stacked PR is hard to operate. https://github.com/llvm/llvm-project/pull/148430
[llvm-branch-commits] [llvm] AMDGPU: Handle rewriting non-tied MFMA to AGPR form (PR #149027)
arsenm wrote: ping https://github.com/llvm/llvm-project/pull/149027
[llvm-branch-commits] [llvm] [AMDGPU] wip: MIR pretty printing for S_WAITCNT_FENCE_soft (PR #150391)
llvmbot wrote: @llvm/pr-subscribers-backend-amdgpu Author: Sameer Sahasrabuddhe (ssahasra) Changes --- Patch is 34.95 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150391.diff 7 Files Affected: - (modified) llvm/lib/CodeGen/MIRParser/MIParser.cpp (+10-15) - (modified) llvm/lib/Target/AMDGPU/AMDGPUMIRFormatter.cpp (+161) - (modified) llvm/lib/Target/AMDGPU/SIDefines.h (+6-2) - (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/memory-legalizer-atomic-fence.ll (+36-36) - (added) llvm/test/CodeGen/AMDGPU/fence-parameters.mir (+29) - (modified) llvm/test/CodeGen/AMDGPU/insert-waitcnts-fence-soft.mir (+9-9) - (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-local.mir (+12-12) ``diff diff --git a/llvm/lib/CodeGen/MIRParser/MIParser.cpp b/llvm/lib/CodeGen/MIRParser/MIParser.cpp index 3a364d5ff0d20..c8ad286a87a35 100644 --- a/llvm/lib/CodeGen/MIRParser/MIParser.cpp +++ b/llvm/lib/CodeGen/MIRParser/MIParser.cpp @@ -1850,28 +1850,25 @@ bool MIParser::parseImmediateOperand(MachineOperand &Dest) { return false; } +// The target mnemonic is an expression of the form: +// +// Dot(IntegerLiteral|Identifier|Dot)+ +// +// We could be stricter like not terminating in a dot, but that's note important +// where this is being used. bool MIParser::parseTargetImmMnemonic(const unsigned OpCode, const unsigned OpIdx, MachineOperand &Dest, const MIRFormatter &MF) { assert(Token.is(MIToken::dot)); auto Loc = Token.location(); // record start position - size_t Len = 1; // for "." - lex(); - - // Handle the case that mnemonic starts with number. 
- if (Token.is(MIToken::IntegerLiteral)) { + size_t Len = 0; + while (Token.is(MIToken::IntegerLiteral) || Token.is(MIToken::dot) || + Token.is(MIToken::Identifier)) { Len += Token.range().size(); lex(); } - - StringRef Src; - if (Token.is(MIToken::comma)) -Src = StringRef(Loc, Len); - else { -assert(Token.is(MIToken::Identifier)); -Src = StringRef(Loc, Len + Token.stringValue().size()); - } + StringRef Src(Loc, Len); int64_t Val; if (MF.parseImmMnemonic(OpCode, OpIdx, Src, Val, [this](StringRef::iterator Loc, const Twine &Msg) @@ -1879,8 +1876,6 @@ bool MIParser::parseTargetImmMnemonic(const unsigned OpCode, return true; Dest = MachineOperand::CreateImm(Val); - if (!Token.is(MIToken::comma)) -lex(); return false; } diff --git a/llvm/lib/Target/AMDGPU/AMDGPUMIRFormatter.cpp b/llvm/lib/Target/AMDGPU/AMDGPUMIRFormatter.cpp index 75e3d8c426e73..f318d6ffc1bae 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUMIRFormatter.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUMIRFormatter.cpp @@ -12,10 +12,135 @@ //===--===// #include "AMDGPUMIRFormatter.h" +#include "SIDefines.h" #include "SIMachineFunctionInfo.h" using namespace llvm; +bool parseAtomicOrdering(StringRef Src, unsigned &Order) { + Src.consume_front("."); + for (unsigned I = 0; I <= (unsigned)AtomicOrdering::LAST; ++I) { +if (Src == toIRString((AtomicOrdering)I)) { + Order = I; + return true; +} + } + Order = ~0u; + return false; +} + +static const char *fmtScope(unsigned Scope) { + static const char *Names[] = {"none", "singlethread", "wavefront", +"workgroup", "agent","system"}; + return Names[Scope]; +} + +bool parseAtomicScope(StringRef Src, unsigned &Scope) { + Src.consume_front("."); + for (unsigned I = 0; + I != (unsigned)AMDGPU::SIAtomicScope::NUM_SI_ATOMIC_SCOPES; ++I) { +if (Src == fmtScope(I)) { + Scope = I; + return true; +} + } + Scope = ~0u; + return false; +} + +static const char *fmtAddrSpace(unsigned Space) { + static const char *Names[] = {"none","global", "lds", +"scratch", "gds","other"}; + return 
Names[Space]; +} + +bool parseOneAddrSpace(StringRef Src, unsigned &AddrSpace) { + if (Src == "none") { +AddrSpace = (unsigned)AMDGPU::SIAtomicAddrSpace::NONE; +return true; + } + if (Src == "flat") { +AddrSpace = (unsigned)AMDGPU::SIAtomicAddrSpace::FLAT; +return true; + } + if (Src == "atomic") { +AddrSpace = (unsigned)AMDGPU::SIAtomicAddrSpace::ATOMIC; +return true; + } + if (Src == "all") { +AddrSpace = (unsigned)AMDGPU::SIAtomicAddrSpace::ALL; +return true; + } + for (unsigned I = 1, A = 1; A <= (unsigned)AMDGPU::SIAtomicAddrSpace::LAST; + A <<= 1, ++I) { +if (Src == fmtAddrSpace(I)) { + AddrSpace = A; + return true; +} + } + AddrSpace = ~0u; + return false; +} + +bool parseAddrSpace(StringRef Src, unsigned &AddrSpace) { + Src = Src.trim(); + Src.consume_front("."); + while (!Src.empty()) { +auto [First, Rest] = Src.split('.')
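The comment added to `MIParser::parseTargetImmMnemonic` describes the accepted shape as `Dot(IntegerLiteral|Identifier|Dot)+`. A minimal character-level sketch of that idea follows. It is an approximation, not the MIR lexer: the real code works on `MIToken`s, and the helper name `mnemonicLength` and the "letters, digits, underscore" definition of an identifier are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns how many leading characters of S form a
// dot-led mnemonic such as ".seq_cst.agent.global", or 0 if S does not
// start with a dot. Dots, digits, letters and underscores are consumed
// greedily, which mirrors the "keep going while the token is a dot,
// integer or identifier" loop in the patch.
std::size_t mnemonicLength(const std::string &S) {
  if (S.empty() || S[0] != '.')
    return 0;
  std::size_t I = 0;
  while (I < S.size() &&
         (S[I] == '.' || S[I] == '_' ||
          std::isalnum(static_cast<unsigned char>(S[I]))))
    ++I;
  assert(I <= S.size());
  return I;
}
```

As in the patch, nothing stops the mnemonic from ending in a dot; the sketch leaves that to whoever interprets the resulting substring.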
[llvm-branch-commits] [llvm] [AMDGPU] wip: MIR pretty printing for S_WAITCNT_FENCE_soft (PR #150391)
https://github.com/ssahasra edited https://github.com/llvm/llvm-project/pull/150391
[llvm-branch-commits] [llvm] [DTLTO] Add LLVM release note for LLVM 21 (PR #150171)
https://github.com/tru approved this pull request. https://github.com/llvm/llvm-project/pull/150171
[llvm-branch-commits] [flang] [Flang] Fix a crash when equivalence and namelist statements are used (PR #150292)
tru wrote: Who can review this? https://github.com/llvm/llvm-project/pull/150292
[llvm-branch-commits] [clang] [clang][deps] Add a release note for fixing crashes in `clang-scan-deps`. (#149857) (PR #150329)
https://github.com/tru approved this pull request. https://github.com/llvm/llvm-project/pull/150329
[llvm-branch-commits] [clang] [llvm] release/21.x: [KeyInstr] Fix verifier check (#149043) (PR #149053)
tru wrote:
> As for the pre-merge on this one: the abi-compare bot thing seems cool, though I don't think the reported failure is for this patch, I've not touched any function signatures here

Yeah, it wasn't correct until we made the RC1 release. If this branch is rebased it will pass. But I think it's fine to merge this at this point. https://github.com/llvm/llvm-project/pull/149053
[llvm-branch-commits] [lld] release/21.x: [lld] Add thunks for hexagon (#111217) (PR #149723)
https://github.com/tru updated https://github.com/llvm/llvm-project/pull/149723 >From 760616dcfde320a2653eab10c5c6a377d9c986c8 Mon Sep 17 00:00:00 2001 From: Brian Cain Date: Sun, 20 Jul 2025 11:46:31 -0500 Subject: [PATCH] [lld] Add thunks for hexagon (#111217) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Without thunks, programs will encounter link errors complaining that the branch target is out of range. Thunks will extend the range of branch targets, which is a critical need for large programs. Thunks provide this flexibility at a cost of some modest code size increase. When configured with the maximal feature set, the hexagon port of the linux kernel would often encounter these limitations when linking with `lld`. The relocations which will be extended by thunks are: * R_HEX_B22_PCREL, R_HEX_{G,L}D_PLT_B22_PCREL, R_HEX_PLT_B22_PCREL relocations have a range of ± 8MiB on the baseline * R_HEX_B15_PCREL: ±65,532 bytes * R_HEX_B13_PCREL: ±16,380 bytes * R_HEX_B9_PCREL: ±1,020 bytes Fixes #149689 Co-authored-by: Alexey Karyakin - Co-authored-by: Alexey Karyakin (cherry picked from commit b42f96bc057fd9e31572069b241ba130c21144e5) --- lld/ELF/Arch/Hexagon.cpp | 47 + lld/ELF/Relocations.cpp | 53 +++--- lld/ELF/Thunks.cpp| 72 - lld/test/ELF/hexagon-jump-error.s | 32 -- lld/test/ELF/hexagon-thunk-range-b22rel.s | 115 lld/test/ELF/hexagon-thunk-range-gdplt.s | 95 + lld/test/ELF/hexagon-thunk-range-plt.s| 75 + lld/test/ELF/hexagon-thunks-packets.s | 122 ++ lld/test/ELF/hexagon-thunks.s | 53 ++ 9 files changed, 618 insertions(+), 46 deletions(-) delete mode 100644 lld/test/ELF/hexagon-jump-error.s create mode 100644 lld/test/ELF/hexagon-thunk-range-b22rel.s create mode 100644 lld/test/ELF/hexagon-thunk-range-gdplt.s create mode 100644 lld/test/ELF/hexagon-thunk-range-plt.s create mode 100644 lld/test/ELF/hexagon-thunks-packets.s create mode 100644 lld/test/ELF/hexagon-thunks.s diff --git a/lld/ELF/Arch/Hexagon.cpp 
b/lld/ELF/Arch/Hexagon.cpp index 479131a24dcfc..9b33e78731c97 100644 --- a/lld/ELF/Arch/Hexagon.cpp +++ b/lld/ELF/Arch/Hexagon.cpp @@ -11,6 +11,7 @@ #include "Symbols.h" #include "SyntheticSections.h" #include "Target.h" +#include "Thunks.h" #include "lld/Common/ErrorHandler.h" #include "llvm/ADT/SmallVector.h" #include "llvm/BinaryFormat/ELF.h" @@ -36,6 +37,10 @@ class Hexagon final : public TargetInfo { const uint8_t *loc) const override; RelType getDynRel(RelType type) const override; int64_t getImplicitAddend(const uint8_t *buf, RelType type) const override; + bool needsThunk(RelExpr expr, RelType type, const InputFile *file, + uint64_t branchAddr, const Symbol &s, + int64_t a) const override; + bool inBranchRange(RelType type, uint64_t src, uint64_t dst) const override; void relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const override; void writePltHeader(uint8_t *buf) const override; @@ -63,6 +68,8 @@ Hexagon::Hexagon(Ctx &ctx) : TargetInfo(ctx) { tlsGotRel = R_HEX_TPREL_32; tlsModuleIndexRel = R_HEX_DTPMOD_32; tlsOffsetRel = R_HEX_DTPREL_32; + + needsThunks = true; } uint32_t Hexagon::calcEFlags() const { @@ -258,6 +265,46 @@ static uint32_t findMaskR16(Ctx &ctx, uint32_t insn) { static void or32le(uint8_t *p, int32_t v) { write32le(p, read32le(p) | v); } +bool Hexagon::inBranchRange(RelType type, uint64_t src, uint64_t dst) const { + int64_t offset = dst - src; + switch (type) { + case llvm::ELF::R_HEX_B22_PCREL: + case llvm::ELF::R_HEX_PLT_B22_PCREL: + case llvm::ELF::R_HEX_GD_PLT_B22_PCREL: + case llvm::ELF::R_HEX_LD_PLT_B22_PCREL: +return llvm::isInt<22>(offset >> 2); + case llvm::ELF::R_HEX_B15_PCREL: +return llvm::isInt<15>(offset >> 2); +break; + case llvm::ELF::R_HEX_B13_PCREL: +return llvm::isInt<13>(offset >> 2); +break; + case llvm::ELF::R_HEX_B9_PCREL: +return llvm::isInt<9>(offset >> 2); + default: +return true; + } + llvm_unreachable("unsupported relocation"); +} + +bool Hexagon::needsThunk(RelExpr expr, RelType type, const 
InputFile *file, + uint64_t branchAddr, const Symbol &s, + int64_t a) const { + // Only check branch range for supported branch relocation types + switch (type) { + case R_HEX_B22_PCREL: + case R_HEX_PLT_B22_PCREL: + case R_HEX_GD_PLT_B22_PCREL: + case R_HEX_LD_PLT_B22_PCREL: + case R_HEX_B15_PCREL: + case R_HEX_B13_PCREL: + case R_HEX_B9_PCREL: +return !ctx.target->inBranchRange(type, branchAddr, s.getVA(ctx, a)); + default: +return false; + } +} + void Hexagon::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
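The ranges quoted in the commit message follow from the `isInt<N>(offset >> 2)` checks in `Hexagon::inBranchRange`. The sketch below reproduces that arithmetic in a standalone form. `fitsSigned` is a stand-in for `llvm::isInt<N>`, and passing the field width as a plain integer instead of a relocation type is a simplification for illustration, not how lld is structured.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for llvm::isInt<N>: true if X fits in a signed field of Bits bits.
bool fitsSigned(unsigned Bits, int64_t X) {
  assert(Bits >= 1 && Bits <= 63);
  const int64_t Lo = -(int64_t{1} << (Bits - 1));
  const int64_t Hi = (int64_t{1} << (Bits - 1)) - 1;
  return X >= Lo && X <= Hi;
}

// Hypothetical simplification: FieldBits is 22, 15, 13 or 9 for the
// R_HEX_B*_PCREL families. The byte offset is scaled down by 4 before the
// range check, as in the patch. The shift of a negative value relies on
// arithmetic right shift (guaranteed since C++20, usual behaviour before).
bool inBranchRange(unsigned FieldBits, uint64_t Src, uint64_t Dst) {
  const int64_t Offset = static_cast<int64_t>(Dst - Src);
  return fitsSigned(FieldBits, Offset >> 2);
}
```

With a 15-bit field the largest forward word offset is 2^14 - 1 = 16383, i.e. 65,532 bytes, which matches the figure given for `R_HEX_B15_PCREL`; the 13-bit and 9-bit cases give 16,380 and 1,020 bytes the same way.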
[llvm-branch-commits] [llvm] Propagate Constants for Wave Reduction Intrinsics (PR #150395)
easyonaadit wrote:
> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

* **#150395** 👈
* **#150170**
* **#150169**
* `main`

This stack of pull requests is managed by Graphite. https://github.com/llvm/llvm-project/pull/150395
[llvm-branch-commits] [compiler-rt] release/21.x: [compiler-rt][Mips] Fix stat size check on mips64 musl (#143301) (PR #149683)
github-actions[bot] wrote: @brad0 (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/149683
[llvm-branch-commits] [llvm] release/21.x: [LV] Vectorize maxnum/minnum w/o fast-math flags. (#148239) (PR #149736)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/149736
[llvm-branch-commits] [llvm] release/21.x: [LV] Vectorize maxnum/minnum w/o fast-math flags. (#148239) (PR #149736)
tru wrote: Closed in favor of #150193 https://github.com/llvm/llvm-project/pull/149736
[llvm-branch-commits] [llvm] release/21.x: [LoongArch] Strengthen stack size estimation for LSX/LASX extension (#146455) (PR #149777)
https://github.com/tru updated https://github.com/llvm/llvm-project/pull/149777 >From 6dde08705669b8579694aee5b5c6acbb5bdbb492 Mon Sep 17 00:00:00 2001 From: tangaac Date: Fri, 18 Jul 2025 16:12:11 +0800 Subject: [PATCH] [LoongArch] Strengthen stack size estimation for LSX/LASX extension (#146455) This patch adds an emergency spill slot when ran out of registers. PR #139201 introduces `vstelm` instructions with only 8-bit imm offset, it causes no spill slot to store the spill registers. (cherry picked from commit 64a0478e08829ec6bcae2b05e154aa58c2c46ac0) --- .../LoongArch/LoongArchFrameLowering.cpp | 7 +- .../CodeGen/LoongArch/calling-conv-common.ll | 48 +-- .../CodeGen/LoongArch/calling-conv-half.ll| 16 +- .../LoongArch/can-not-realign-stack.ll| 44 +-- .../CodeGen/LoongArch/emergency-spill-slot.ll | 4 +- llvm/test/CodeGen/LoongArch/frame.ll | 107 ++- .../CodeGen/LoongArch/intrinsic-memcpy.ll | 8 +- llvm/test/CodeGen/LoongArch/lasx/fpowi.ll | 88 +++--- .../lasx/ir-instruction/extractelement.ll | 120 .../ir-instruction/insert-extract-element.ll | 40 +-- .../insert-extract-pair-elements.ll | 40 +-- .../lasx/ir-instruction/insertelement.ll | 132 llvm/test/CodeGen/LoongArch/llvm.sincos.ll| 150 - llvm/test/CodeGen/LoongArch/lsx/pr146455.ll | 287 ++ ...realignment-with-variable-sized-objects.ll | 24 +- .../CodeGen/LoongArch/stack-realignment.ll| 80 ++--- .../LoongArch/unaligned-memcpy-inline.ll | 14 +- llvm/test/CodeGen/LoongArch/vararg.ll | 70 ++--- 18 files changed, 823 insertions(+), 456 deletions(-) create mode 100644 llvm/test/CodeGen/LoongArch/lsx/pr146455.ll diff --git a/llvm/lib/Target/LoongArch/LoongArchFrameLowering.cpp b/llvm/lib/Target/LoongArch/LoongArchFrameLowering.cpp index ac5e7f3891c72..1493bf4cba695 100644 --- a/llvm/lib/Target/LoongArch/LoongArchFrameLowering.cpp +++ b/llvm/lib/Target/LoongArch/LoongArchFrameLowering.cpp @@ -158,7 +158,12 @@ void LoongArchFrameLowering::processFunctionBeforeFrameFinalized( // estimateStackSize has been observed to 
under-estimate the final stack // size, so give ourselves wiggle-room by checking for stack size // representable an 11-bit signed field rather than 12-bits. - if (!isInt<11>(MFI.estimateStackSize(MF))) + // For [x]vstelm.{b/h/w/d} memory instructions with 8 imm offset, 7-bit + // signed field is fine. + unsigned EstimateStackSize = MFI.estimateStackSize(MF); + if (!isInt<11>(EstimateStackSize) || + (MF.getSubtarget().hasExtLSX() && + !isInt<7>(EstimateStackSize))) ScavSlotsNum = std::max(ScavSlotsNum, 1u); // For CFR spill. diff --git a/llvm/test/CodeGen/LoongArch/calling-conv-common.ll b/llvm/test/CodeGen/LoongArch/calling-conv-common.ll index d07e2914c753a..f7653af1fa9ba 100644 --- a/llvm/test/CodeGen/LoongArch/calling-conv-common.ll +++ b/llvm/test/CodeGen/LoongArch/calling-conv-common.ll @@ -122,23 +122,23 @@ define i64 @callee_large_scalars(i256 %a, i256 %b) nounwind { define i64 @caller_large_scalars() nounwind { ; CHECK-LABEL: caller_large_scalars: ; CHECK: # %bb.0: -; CHECK-NEXT:addi.d $sp, $sp, -80 -; CHECK-NEXT:st.d $ra, $sp, 72 # 8-byte Folded Spill -; CHECK-NEXT:st.d $zero, $sp, 24 +; CHECK-NEXT:addi.d $sp, $sp, -96 +; CHECK-NEXT:st.d $ra, $sp, 88 # 8-byte Folded Spill +; CHECK-NEXT:st.d $zero, $sp, 40 ; CHECK-NEXT:vrepli.b $vr0, 0 -; CHECK-NEXT:vst $vr0, $sp, 8 +; CHECK-NEXT:vst $vr0, $sp, 24 ; CHECK-NEXT:ori $a0, $zero, 2 -; CHECK-NEXT:st.d $a0, $sp, 0 -; CHECK-NEXT:st.d $zero, $sp, 56 -; CHECK-NEXT:vst $vr0, $sp, 40 +; CHECK-NEXT:st.d $a0, $sp, 16 +; CHECK-NEXT:st.d $zero, $sp, 72 +; CHECK-NEXT:vst $vr0, $sp, 56 ; CHECK-NEXT:ori $a2, $zero, 1 -; CHECK-NEXT:addi.d $a0, $sp, 32 -; CHECK-NEXT:addi.d $a1, $sp, 0 -; CHECK-NEXT:st.d $a2, $sp, 32 +; CHECK-NEXT:addi.d $a0, $sp, 48 +; CHECK-NEXT:addi.d $a1, $sp, 16 +; CHECK-NEXT:st.d $a2, $sp, 48 ; CHECK-NEXT:pcaddu18i $ra, %call36(callee_large_scalars) ; CHECK-NEXT:jirl $ra, $ra, 0 -; CHECK-NEXT:ld.d $ra, $sp, 72 # 8-byte Folded Reload -; CHECK-NEXT:addi.d $sp, $sp, 80 +; CHECK-NEXT:ld.d $ra, $sp, 88 # 
8-byte Folded Reload +; CHECK-NEXT:addi.d $sp, $sp, 96 ; CHECK-NEXT:ret %1 = call i64 @callee_large_scalars(i256 1, i256 2) ret i64 %1 @@ -177,20 +177,20 @@ define i64 @callee_large_scalars_exhausted_regs(i64 %a, i64 %b, i64 %c, i64 %d, define i64 @caller_large_scalars_exhausted_regs() nounwind { ; CHECK-LABEL: caller_large_scalars_exhausted_regs: ; CHECK: # %bb.0: -; CHECK-NEXT:addi.d $sp, $sp, -96 -; CHECK-NEXT:st.d $ra, $sp, 88 # 8-byte Folded Spill -; CHECK-NEXT:addi.d $a0, $sp, 16 +; CHECK-NEXT:addi.d $sp, $sp, -112 +; CHECK-NEXT:st.d $ra, $sp, 104 # 8-byte Folded Spill +; CHECK-NEXT:addi.d $a0, $sp, 32 ; CHECK-NE
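The condition added to `processFunctionBeforeFrameFinalized` can be read in isolation as a predicate on the estimated stack size. A minimal sketch, assuming a plain integer estimate and a boolean for the LSX feature: the function name `needsScavengingSlot` is hypothetical, and `fitsSigned` stands in for `llvm::isInt<N>`.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for llvm::isInt<N>: true if X fits in a signed field of Bits bits.
bool fitsSigned(unsigned Bits, int64_t X) {
  assert(Bits >= 1 && Bits <= 63);
  const int64_t Lo = -(int64_t{1} << (Bits - 1));
  const int64_t Hi = (int64_t{1} << (Bits - 1)) - 1;
  return X >= Lo && X <= Hi;
}

// Hypothetical helper mirroring the patched condition: reserve an emergency
// spill slot when the estimate does not fit an 11-bit signed field, or, with
// LSX enabled, when it does not fit a 7-bit signed field (one bit narrower
// than the 8-bit [x]vstelm offset, leaving the same kind of wiggle-room).
bool needsScavengingSlot(uint64_t EstimatedStackSize, bool HasLSX) {
  const int64_t Size = static_cast<int64_t>(EstimatedStackSize);
  return !fitsSigned(11, Size) || (HasLSX && !fitsSigned(7, Size));
}
```

Under these assumptions any LSX function whose estimated frame exceeds 63 bytes gets a slot, which is consistent with the many test updates in the patch where frames grow by 16 bytes.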
[llvm-branch-commits] [llvm] release/21.x: [LoongArch] Strengthen stack size estimation for LSX/LASX extension (#146455) (PR #149777)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/149777
[llvm-branch-commits] [llvm] release/21.x: [LoongArch] Strengthen stack size estimation for LSX/LASX extension (#146455) (PR #149777)
github-actions[bot] wrote: @tangaac (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/149777
[llvm-branch-commits] [llvm] release/21.x: [AArch64, TTI] Disable RealUse check for vector insert/extract costs and Apple CPUs. (#146526) (PR #149815)
tru wrote: Can you update the PR description to match reality? I can merge this after that. https://github.com/llvm/llvm-project/pull/149815
[llvm-branch-commits] [llvm] [AMDGPU] Propagate Constants for Wave Reduction Intrinsics (PR #150395)
https://github.com/easyonaadit edited https://github.com/llvm/llvm-project/pull/150395
[llvm-branch-commits] [llvm] release/21.x: [MachinePipeliner] Fix incorrect dependency direction (#149436) (PR #149950)
tru wrote: Can you rebase and squash this PR so that I don't end up merging a merge commit? https://github.com/llvm/llvm-project/pull/149950
[llvm-branch-commits] [flang] release/21.x: [Flang] Fix ASSIGN statement (#149941) (PR #150228)
github-actions[bot] wrote: @ceseo (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/150228
[llvm-branch-commits] [libcxx] release/21.x: [libc++] Fix hash_multi{map, set}::insert (#149290) (PR #149435)
https://github.com/tru updated https://github.com/llvm/llvm-project/pull/149435 >From 4a4071dc71d87357ea27e81bf46078e03ca9630e Mon Sep 17 00:00:00 2001 From: Nikolas Klauser Date: Thu, 17 Jul 2025 23:23:04 +0200 Subject: [PATCH] [libc++] Fix hash_multi{map,set}::insert (#149290) (cherry picked from commit be3d614cc13f016b16634e18e10caed508d183d2) --- libcxx/include/ext/hash_map | 4 +-- libcxx/include/ext/hash_set | 4 +-- .../gnu/hash_multimap/insert.pass.cpp | 35 +++ .../gnu/hash_multiset/insert.pass.cpp | 35 +++ 4 files changed, 74 insertions(+), 4 deletions(-) create mode 100644 libcxx/test/extensions/gnu/hash_multimap/insert.pass.cpp create mode 100644 libcxx/test/extensions/gnu/hash_multiset/insert.pass.cpp diff --git a/libcxx/include/ext/hash_map b/libcxx/include/ext/hash_map index d6b92204f4376..46815eaffa8bd 100644 --- a/libcxx/include/ext/hash_map +++ b/libcxx/include/ext/hash_map @@ -744,7 +744,7 @@ public: _LIBCPP_HIDE_FROM_ABI const_iterator begin() const { return __table_.begin(); } _LIBCPP_HIDE_FROM_ABI const_iterator end() const { return __table_.end(); } - _LIBCPP_HIDE_FROM_ABI iterator insert(const value_type& __x) { return __table_.__emplace_unique(__x); } + _LIBCPP_HIDE_FROM_ABI iterator insert(const value_type& __x) { return __table_.__emplace_multi(__x); } _LIBCPP_HIDE_FROM_ABI iterator insert(const_iterator, const value_type& __x) { return insert(__x); } template _LIBCPP_HIDE_FROM_ABI void insert(_InputIterator __first, _InputIterator __last); @@ -831,7 +831,7 @@ template template inline void hash_multimap<_Key, _Tp, _Hash, _Pred, _Alloc>::insert(_InputIterator __first, _InputIterator __last) { for (; __first != __last; ++__first) -__table_.__emplace_unique(*__first); +__table_.__emplace_multi(*__first); } template diff --git a/libcxx/include/ext/hash_set b/libcxx/include/ext/hash_set index 7fd5df24ed3a8..62a7a0dbcffb9 100644 --- a/libcxx/include/ext/hash_set +++ b/libcxx/include/ext/hash_set @@ -458,7 +458,7 @@ public: _LIBCPP_HIDE_FROM_ABI 
const_iterator begin() const { return __table_.begin(); } _LIBCPP_HIDE_FROM_ABI const_iterator end() const { return __table_.end(); } - _LIBCPP_HIDE_FROM_ABI iterator insert(const value_type& __x) { return __table_.__emplace_unique(__x); } + _LIBCPP_HIDE_FROM_ABI iterator insert(const value_type& __x) { return __table_.__emplace_multi(__x); } _LIBCPP_HIDE_FROM_ABI iterator insert(const_iterator, const value_type& __x) { return insert(__x); } template _LIBCPP_HIDE_FROM_ABI void insert(_InputIterator __first, _InputIterator __last); @@ -543,7 +543,7 @@ template template inline void hash_multiset<_Value, _Hash, _Pred, _Alloc>::insert(_InputIterator __first, _InputIterator __last) { for (; __first != __last; ++__first) -__table_.__emplace_unique(*__first); +__table_.__emplace_multi(*__first); } template diff --git a/libcxx/test/extensions/gnu/hash_multimap/insert.pass.cpp b/libcxx/test/extensions/gnu/hash_multimap/insert.pass.cpp new file mode 100644 index 0..ea80359f1fea2 --- /dev/null +++ b/libcxx/test/extensions/gnu/hash_multimap/insert.pass.cpp @@ -0,0 +1,35 @@ +//===--===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +// ADDITIONAL_COMPILE_FLAGS: -Wno-deprecated + +// hash_multimap::insert + +#include +#include + +int main(int, char**) { + __gnu_cxx::hash_multimap map; + + map.insert(std::make_pair(1, 1)); + map.insert(std::make_pair(1, 1)); + + assert(map.size() == 2); + assert(map.equal_range(1).first == map.begin()); + assert(map.equal_range(1).second == map.end()); + + std::pair arr[] = {std::make_pair(1, 1), std::make_pair(1, 1)}; + + map.insert(arr, arr + 2); + + assert(map.size() == 4); + assert(map.equal_range(1).first == map.begin()); + assert(map.equal_range(1).second == map.end()); + + return 0; +} diff --git a/libcxx/test/extensions/gnu/hash_multiset/insert.pass.cpp b/libcxx/test/extensions/gnu/hash_multiset/insert.pass.cpp new file mode 100644 index 0..1a60cac158a40 --- /dev/null +++ b/libcxx/test/extensions/gnu/hash_multiset/insert.pass.cpp @@ -0,0 +1,35 @@ +//===--===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +// ADDITIONAL_COMPILE_FLAGS: -Wno-deprecated + +// hash_multimap
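The behavioural difference the patch fixes (unique-key insertion used where multi-key insertion was intended) can be illustrated with the standard unordered containers. This is only an analogy: `std::unordered_map` and `std::unordered_multimap` stand in for libc++'s internal `__emplace_unique` and `__emplace_multi`, and the function names below are made up for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>

// Unique-key semantics: the second insert of key 1 is rejected.
std::size_t sizeAfterTwoUniqueInserts() {
  std::unordered_map<int, int> Map;
  Map.insert(std::make_pair(1, 1));
  Map.insert(std::make_pair(1, 1));
  return Map.size();
}

// Multi-key semantics: both inserts are kept.
std::size_t sizeAfterTwoMultiInserts() {
  std::unordered_multimap<int, int> Map;
  Map.insert(std::make_pair(1, 1));
  Map.insert(std::make_pair(1, 1));
  return Map.size();
}

// The new tests in the patch expect the multi behaviour from
// hash_multimap and hash_multiset.
void checkDuplicateHandling() {
  assert(sizeAfterTwoUniqueInserts() == 1);
  assert(sizeAfterTwoMultiInserts() == 2);
}
```

Before the fix, `hash_multimap::insert` behaved like the first function, so an assertion such as `map.size() == 2` in the added test would have failed.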
[llvm-branch-commits] [clang] [clang][deps] Add a release note for fixing crashes in `clang-scan-deps`. (#149857) (PR #150329)
github-actions[bot] wrote: @vsapsai (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/150329
[llvm-branch-commits] [clang] [clang][deps] Add a release note for fixing crashes in `clang-scan-deps`. (#149857) (PR #150329)
https://github.com/tru closed https://github.com/llvm/llvm-project/pull/150329
[llvm-branch-commits] [flang] release/21.x: [flang][OpenMP] Restore reduction processor behavior broken by #145837 (#150178) (PR #150200)
github-actions[bot] wrote: @ergawy (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/150200
[llvm-branch-commits] [llvm] release/21.x: [X86] Fix misassemble due to not storing registers to state machine on RParen (#150252) (PR #150402)
llvmbot wrote: @llvm/pr-subscribers-mc Author: None (llvmbot) Changes Backport a073cbbb1aeaaeac01b12e818fe47e4c04080aac Requested by: @phoebewang --- Full diff: https://github.com/llvm/llvm-project/pull/150402.diff 2 Files Affected: - (modified) llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp (+25-3) - (added) llvm/test/MC/X86/intel-syntax-parentheses.s (+10) ``diff diff --git a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp index b642c1cfe383b..8213e512f45e1 100644 --- a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp +++ b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp @@ -1042,8 +1042,8 @@ class X86AsmParser : public MCTargetAsmParser { } PrevState = CurrState; } -void onRParen() { - PrevState = State; +bool onRParen(StringRef &ErrMsg) { + IntelExprState CurrState = State; switch (State) { default: State = IES_ERROR; @@ -1054,9 +1054,27 @@ class X86AsmParser : public MCTargetAsmParser { case IES_RBRAC: case IES_RPAREN: State = IES_RPAREN; +// In the case of a multiply, onRegister has already set IndexReg +// directly, with appropriate scale. +// Otherwise if we just saw a register it has only been stored in +// TmpReg, so we need to store it into the state machine. +if (CurrState == IES_REGISTER && PrevState != IES_MULTIPLY) { + // If we already have a BaseReg, then assume this is the IndexReg with + // no explicit scale. 
+ if (!BaseReg) { +BaseReg = TmpReg; + } else { +if (IndexReg) + return regsUseUpError(ErrMsg); +IndexReg = TmpReg; +Scale = 0; + } +} IC.pushOperator(IC_RPAREN); break; } + PrevState = CurrState; + return false; } bool onOffset(const MCExpr *Val, SMLoc OffsetLoc, StringRef ID, const InlineAsmIdentifierInfo &IDInfo, @@ -2172,7 +2190,11 @@ bool X86AsmParser::ParseIntelExpression(IntelExprStateMachine &SM, SMLoc &End) { } break; case AsmToken::LParen: SM.onLParen(); break; -case AsmToken::RParen: SM.onRParen(); break; +case AsmToken::RParen: + if (SM.onRParen(ErrMsg)) { +return Error(Tok.getLoc(), ErrMsg); + } + break; } if (SM.hadError()) return Error(Tok.getLoc(), "unknown token in expression"); diff --git a/llvm/test/MC/X86/intel-syntax-parentheses.s b/llvm/test/MC/X86/intel-syntax-parentheses.s new file mode 100644 index 0..ae53f64089070 --- /dev/null +++ b/llvm/test/MC/X86/intel-syntax-parentheses.s @@ -0,0 +1,10 @@ +// RUN: not llvm-mc -triple x86_64-unknown-unknown %s 2>&1 | FileCheck %s + +.intel_syntax + +// CHECK: error: invalid base+index expression +lea rdi, [(label + rsi) + rip] +// CHECK: leaq 1(%rax,%rdi), %rdi +lea rdi, [(rax + rdi) + 1] +label: +.quad 42 `` https://github.com/llvm/llvm-project/pull/150402 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/21.x: [X86] Fix misassemble due to not storing registers to state machine on RParen (#150252) (PR #150402)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/150402 Backport a073cbbb1aeaaeac01b12e818fe47e4c04080aac Requested by: @phoebewang >From b9b8c95fea2cfa8848cdbd2418db41bfafa8706d Mon Sep 17 00:00:00 2001 From: circuit10 Date: Thu, 24 Jul 2025 10:38:16 +0100 Subject: [PATCH] [X86] Fix misassemble due to not storing registers to state machine on RParen (#150252) This fixes #116883. The x86 parser saves any register it encounters to a TmpReg field in its state machine, then on encountering the next valid token immediately afterwards saves it to either BaseReg, or IndexReg if BaseReg was already filled. However, this saving logic was missing on the RParen token handler, causing the parser to "forget" the register immediately beforehand. This also would prevent later validation logic from detecting the addressing mode as invalid, leading to a silent misassembly rather than an error. (cherry picked from commit a073cbbb1aeaaeac01b12e818fe47e4c04080aac) --- .../lib/Target/X86/AsmParser/X86AsmParser.cpp | 28 +-- llvm/test/MC/X86/intel-syntax-parentheses.s | 10 +++ 2 files changed, 35 insertions(+), 3 deletions(-) create mode 100644 llvm/test/MC/X86/intel-syntax-parentheses.s diff --git a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp index b642c1cfe383b..8213e512f45e1 100644 --- a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp +++ b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp @@ -1042,8 +1042,8 @@ class X86AsmParser : public MCTargetAsmParser { } PrevState = CurrState; } -void onRParen() { - PrevState = State; +bool onRParen(StringRef &ErrMsg) { + IntelExprState CurrState = State; switch (State) { default: State = IES_ERROR; @@ -1054,9 +1054,27 @@ class X86AsmParser : public MCTargetAsmParser { case IES_RBRAC: case IES_RPAREN: State = IES_RPAREN; +// In the case of a multiply, onRegister has already set IndexReg +// directly, with appropriate scale. 
+// Otherwise if we just saw a register it has only been stored in +// TmpReg, so we need to store it into the state machine. +if (CurrState == IES_REGISTER && PrevState != IES_MULTIPLY) { + // If we already have a BaseReg, then assume this is the IndexReg with + // no explicit scale. + if (!BaseReg) { +BaseReg = TmpReg; + } else { +if (IndexReg) + return regsUseUpError(ErrMsg); +IndexReg = TmpReg; +Scale = 0; + } +} IC.pushOperator(IC_RPAREN); break; } + PrevState = CurrState; + return false; } bool onOffset(const MCExpr *Val, SMLoc OffsetLoc, StringRef ID, const InlineAsmIdentifierInfo &IDInfo, @@ -2172,7 +2190,11 @@ bool X86AsmParser::ParseIntelExpression(IntelExprStateMachine &SM, SMLoc &End) { } break; case AsmToken::LParen: SM.onLParen(); break; -case AsmToken::RParen: SM.onRParen(); break; +case AsmToken::RParen: + if (SM.onRParen(ErrMsg)) { +return Error(Tok.getLoc(), ErrMsg); + } + break; } if (SM.hadError()) return Error(Tok.getLoc(), "unknown token in expression"); diff --git a/llvm/test/MC/X86/intel-syntax-parentheses.s b/llvm/test/MC/X86/intel-syntax-parentheses.s new file mode 100644 index 0..ae53f64089070 --- /dev/null +++ b/llvm/test/MC/X86/intel-syntax-parentheses.s @@ -0,0 +1,10 @@ +// RUN: not llvm-mc -triple x86_64-unknown-unknown %s 2>&1 | FileCheck %s + +.intel_syntax + +// CHECK: error: invalid base+index expression +lea rdi, [(label + rsi) + rip] +// CHECK: leaq 1(%rax,%rdi), %rdi +lea rdi, [(rax + rdi) + 1] +label: +.quad 42 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
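The bookkeeping that the commit message above describes (a register is parked in TmpReg and only committed to BaseReg or IndexReg when the next token arrives) can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `ToyAddrState`, `commitPending`, and `onRParenPreFix` are hypothetical names invented here, not part of X86AsmParser, and the model ignores scale, multiplication, and the real IntelExprState transitions. It needs C++17 for `std::optional`.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical, heavily simplified model of the register bookkeeping in the
// commit message. None of these names exist in the real X86AsmParser.
struct ToyAddrState {
  std::optional<std::string> BaseReg;
  std::optional<std::string> IndexReg;
  std::optional<std::string> TmpReg; // register seen but not yet committed

  // A register token is only parked in TmpReg at first.
  void onRegister(const std::string &Reg) { TmpReg = Reg; }

  // Commit the pending register: BaseReg first, then IndexReg.
  // Returns true on error (both slots already used), following the
  // "true means failure" convention of the patched onRParen.
  bool commitPending() {
    if (!TmpReg)
      return false;
    if (!BaseReg)
      BaseReg = TmpReg;
    else if (!IndexReg)
      IndexReg = TmpReg;
    else
      return true; // no free slot left for another register
    TmpReg.reset();
    return false;
  }

  bool onPlus() { return commitPending(); }

  // Behaviour after the fix: ')' also commits the pending register.
  bool onRParen() { return commitPending(); }

  // Models only the observable effect of the old handler (the register is
  // "forgotten"), not its actual mechanism.
  bool onRParenPreFix() {
    TmpReg.reset();
    return false;
  }
};
```

Feeding the model the register tokens of `[(rax + rdi) + 1]` leaves `rax` in BaseReg and `rdi` in IndexReg with `onRParen`, whereas `onRParenPreFix` drops `rdi`, which corresponds to the silent misassembly the patch reports. A third register arriving before `)` makes `onRParen` return true, analogous to the new error path.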
[llvm-branch-commits] [llvm] release/21.x: [X86] Fix misassemble due to not storing registers to state machine on RParen (#150252) (PR #150402)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/150402 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/21.x: [X86] Fix misassemble due to not storing registers to state machine on RParen (#150252) (PR #150402)
llvmbot wrote: @phoebewang What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/150402 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] release/21.x: [KeyInstr] Inline asm atoms (#149076) (PR #150056)
https://github.com/jmorse approved this pull request. LGTM, and completes key-instr related things in llvm21. https://github.com/llvm/llvm-project/pull/150056 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
https://github.com/banach-space edited https://github.com/llvm/llvm-project/pull/149624 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
banach-space wrote: Should we also update: ``` - The following relationship for the tiled dimensions holds: `shape(result)[inner_dims_pos[i]] = shape(source)[inner_dims_pos[i]] / inner_tiles[i]`. ``` as ``` - The following relationship for the tiled dimensions holds: `shape(result)[inner_dims_pos[i]] = shape(source)[inner_dims_pos[i]] ⌈/⌉ inner_tiles[i]` (⌈/⌉ - CeilDiv). ``` https://github.com/llvm/llvm-project/pull/149624 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
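To make the difference between the two wordings concrete, here is a minimal sketch comparing plain integer division with ceiling division for one tiled dimension. The helper names `floorDivTiles` and `ceilDivTiles` are made up for illustration and are not MLIR APIs; the sketch assumes a non-negative source dimension and a positive tile size.

```cpp
#include <cassert>
#include <cstdint>

// Plain integer division, matching the existing wording. It is exact only
// when the tile size divides the source dimension.
constexpr int64_t floorDivTiles(int64_t srcDim, int64_t tile) {
  return srcDim / tile;
}

// Ceiling division, matching the proposed wording.
// Assumes srcDim >= 0 and tile > 0.
constexpr int64_t ceilDivTiles(int64_t srcDim, int64_t tile) {
  return (srcDim + tile - 1) / tile;
}

// Evenly divisible case: both formulas agree.
static_assert(floorDivTiles(16, 8) == 2 && ceilDivTiles(16, 8) == 2,
              "16 elements in tiles of 8 give 2 tiles either way");

// Non-divisible case: a partial last tile is counted only by CeilDiv.
static_assert(floorDivTiles(17, 8) == 2 && ceilDivTiles(17, 8) == 3,
              "17 elements in tiles of 8 need a third, partial tile");
```

The two formulas differ only when the source dimension is not a multiple of the tile size, which is the case where the result needs a partially filled (padded) last tile.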
[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)
https://github.com/hanhanW closed https://github.com/llvm/llvm-project/pull/149624 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Require CFG in BAT mode (PR #150488)
https://github.com/aaupov updated https://github.com/llvm/llvm-project/pull/150488 >From faf7d914093c87804e9dbca349b1a2bca0aefd18 Mon Sep 17 00:00:00 2001 From: Amir Ayupov Date: Thu, 24 Jul 2025 13:56:18 -0700 Subject: [PATCH] updated test Created using spr 1.3.4 --- bolt/test/X86/unclaimed-jt-entries.s | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/bolt/test/X86/unclaimed-jt-entries.s b/bolt/test/X86/unclaimed-jt-entries.s index 1102e4ae413e2..b5c5abfbedebc 100644 --- a/bolt/test/X86/unclaimed-jt-entries.s +++ b/bolt/test/X86/unclaimed-jt-entries.s @@ -18,6 +18,16 @@ # RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown %s -o %t.o # RUN: %clang %cflags -no-pie %t.o -o %t.exe -Wl,-q + +## Check that non-simple function profile is emitted in perf2bolt mode +# RUN: link_fdata %s %t.exe %t.pa PREAGG +# RUN: llvm-strip -N L5 -N L5_ret %t.exe +# RUN: perf2bolt %t.exe -p %t.pa --pa -o %t.fdata -strict=0 -print-profile \ +# RUN: -print-only=main | FileCheck %s --check-prefix=CHECK-P2B +# CHECK-P2B: PERF2BOLT: traces mismatching disassembled function contents: 0 +# CHECK-P2B: Binary Function "main" +# CHECK-P2B: IsSimple : 0 + # RUN: llvm-bolt %t.exe -v=1 -o %t.out 2>&1 | FileCheck %s # CHECK: BOLT-WARNING: unclaimed data to code reference (possibly an unrecognized jump table entry) to .Ltmp[[#]] in main @@ -33,8 +43,10 @@ .size main, .Lend-main main: jmp *L4-24(,%rdi,8) -.L5: +# PREAGG: T #main# #L5# #L5_ret# 1 +L5: movl $4, %eax +L5_ret: ret .L9: movl $2, %eax @@ -58,7 +70,7 @@ L4: .quad .L3 .quad .L6 .quad .L3 - .quad .L5 + .quad L5 .quad .L3 .quad .L3 .quad .L3 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [BOLT] Require CFG in BAT mode (PR #150488)
https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/150488 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] 32c9e86 - Revert "[flang][flang-driver][mlir][OpenMP] atomic control support (#143441)"
Author: Kiran Chandramohan Date: 2025-07-24T20:33:43+01:00 New Revision: 32c9e86d027efc84ba696a38ef626ae04d306ec0 URL: https://github.com/llvm/llvm-project/commit/32c9e86d027efc84ba696a38ef626ae04d306ec0 DIFF: https://github.com/llvm/llvm-project/commit/32c9e86d027efc84ba696a38ef626ae04d306ec0.diff LOG: Revert "[flang][flang-driver][mlir][OpenMP] atomic control support (#143441)" This reverts commit f44346dc1f6252716cfc62bb0687e3932a93089f. Added: Modified: clang/include/clang/Driver/Options.td flang/include/flang/Frontend/TargetOptions.h flang/include/flang/Optimizer/Dialect/Support/FIRContext.h flang/lib/Frontend/CompilerInvocation.cpp flang/lib/Lower/Bridge.cpp flang/lib/Lower/OpenMP/Atomic.cpp flang/lib/Optimizer/Dialect/Support/FIRContext.cpp mlir/include/mlir/Dialect/OpenMP/OpenMPAttrDefs.td mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td mlir/test/Dialect/OpenMP/ops.mlir Removed: flang/test/Lower/OpenMP/atomic-control-options.f90 diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index fa248381583cd..916400efdb449 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -2320,21 +2320,21 @@ def fsymbol_partition_EQ : Joined<["-"], "fsymbol-partition=">, Group, defm atomic_remote_memory : BoolFOption<"atomic-remote-memory", LangOpts<"AtomicRemoteMemory">, DefaultFalse, - PosFlag, - NegFlag, - BothFlags<[], [ClangOption, FlangOption], " atomic operations on remote memory">>; + PosFlag, + NegFlag, + BothFlags<[], [ClangOption], " atomic operations on remote memory">>; defm atomic_fine_grained_memory : BoolFOption<"atomic-fine-grained-memory", LangOpts<"AtomicFineGrainedMemory">, DefaultFalse, - PosFlag, - NegFlag, - BothFlags<[], [ClangOption, FlangOption], " atomic operations on fine-grained memory">>; + PosFlag, + NegFlag, + BothFlags<[], [ClangOption], " atomic operations on fine-grained memory">>; defm atomic_ignore_denormal_mode : BoolFOption<"atomic-ignore-denormal-mode", 
LangOpts<"AtomicIgnoreDenormalMode">, DefaultFalse, - PosFlag, - NegFlag, - BothFlags<[], [ClangOption, FlangOption], " atomic operations to ignore denormal mode">>; + PosFlag, + NegFlag, + BothFlags<[], [ClangOption], " atomic operations to ignore denormal mode">>; defm memory_profile : OptInCC1FFlag<"memory-profile", "Enable", "Disable", " heap memory profiling">; def fmemory_profile_EQ : Joined<["-"], "fmemory-profile=">, @@ -5360,9 +5360,9 @@ defm amdgpu_precise_memory_op " precise memory mode (AMDGPU only)">; def munsafe_fp_atomics : Flag<["-"], "munsafe-fp-atomics">, - Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, Alias; + Visibility<[ClangOption, CC1Option]>, Alias; def mno_unsafe_fp_atomics : Flag<["-"], "mno-unsafe-fp-atomics">, - Visibility<[ClangOption, FlangOption]>, Alias; + Visibility<[ClangOption]>, Alias; def faltivec : Flag<["-"], "faltivec">, Group; def fno_altivec : Flag<["-"], "fno-altivec">, Group; diff --git a/flang/include/flang/Frontend/TargetOptions.h b/flang/include/flang/Frontend/TargetOptions.h index f6e5634d5a995..002d8d158abd4 100644 --- a/flang/include/flang/Frontend/TargetOptions.h +++ b/flang/include/flang/Frontend/TargetOptions.h @@ -53,11 +53,6 @@ class TargetOptions { /// Print verbose assembly bool asmVerbose = false; - - /// Atomic control options - bool atomicIgnoreDenormalMode = false; - bool atomicRemoteMemory = false; - bool atomicFineGrainedMemory = false; }; } // end namespace Fortran::frontend diff --git a/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h b/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h index c0c0b744206cd..2df14f83c11e1 100644 --- a/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h +++ b/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h @@ -58,25 +58,6 @@ void setTargetCPU(mlir::ModuleOp mod, llvm::StringRef cpu); /// Get the target CPU string from the Module or return a null reference. 
llvm::StringRef getTargetCPU(mlir::ModuleOp mod); -/// Sets whether Denormal Mode can be ignored or not for lowering of floating -/// point atomic operations. -void setAtomicIgnoreDenormalMode(mlir::ModuleOp mod, bool value); -/// Gets whether Denormal Mode can be ignored or not for lowering of floating -/// point atomic operations. -bool getAtomicIgnoreDenormalMode(mlir::ModuleOp mod); -/// Sets whether fine grained memory can be used or not for lowering of atomic -/// operations. -void setAtomicFineGrainedMemory(mlir::ModuleOp mod, bool value); -/// Gets whether fine grained memory can be used or not for lowering of atomic -/// operations. -bool getAtomicFineGrainedMemory(mlir::ModuleOp mod); -/// Sets whether remote memory can be used or not for lowering of atomic -/// operations. -void setAtomicRemoteMe
[llvm-branch-commits] [llvm] Fix FileCheck prefix in the histogram test. (PR #150506)
https://github.com/teresajohnson approved this pull request. https://github.com/llvm/llvm-project/pull/150506 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/21.x: [RISCV] Pass sign-extended value to isInt check in expandMul (#150211) (PR #150556)
https://github.com/lenary approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/150556 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Handle rewriting non-tied MFMA to AGPR form (PR #149027)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/149027 >From bcdb0d78fe8c227e7b2c9b539db496950332f66b Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 11 Jul 2025 12:57:13 +0900 Subject: [PATCH] AMDGPU: Handle rewriting non-tied MFMA to AGPR form If src2 and dst aren't the same register, to fold a copy to AGPR into the instruction we also need to reassign src2 to an available AGPR. All the other uses of src2 also need to be compatible with the AGPR replacement in order to avoid inserting other copies somewhere else. Perform this transform, after verifying all other uses are compatible with AGPR, and have an available AGPR available at all points (which effectively means rewriting a full chain of mfmas and load/store at once). --- .../AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp | 275 ++ ...class-vgpr-mfma-to-av-with-load-source.mir | 51 ++-- 2 files changed, 237 insertions(+), 89 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp b/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp index 8569aa7127dc3..dd87b196a24ef 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp @@ -14,12 +14,7 @@ /// MFMA opcode. /// /// TODO: -/// - Handle non-tied dst+src2 cases. We need to try to find a copy from an -///AGPR from src2, or reassign src2 to an available AGPR (which should work -///in the common case of a load). -/// -/// - Handle multiple MFMA uses of the same register. e.g. 
chained MFMAs that -///can be rewritten as a set +/// - Handle SplitKit partial copy bundles, and not just full copy instructions /// /// - Update LiveIntervals incrementally instead of recomputing from scratch /// @@ -49,13 +44,18 @@ class AMDGPURewriteAGPRCopyMFMAImpl { VirtRegMap &VRM; LiveRegMatrix &LRM; LiveIntervals &LIS; + const RegisterClassInfo &RegClassInfo; + + bool attemptReassignmentsToAGPR(SmallSetVector &InterferingRegs, + MCPhysReg PrefPhysReg) const; public: AMDGPURewriteAGPRCopyMFMAImpl(MachineFunction &MF, VirtRegMap &VRM, -LiveRegMatrix &LRM, LiveIntervals &LIS) +LiveRegMatrix &LRM, LiveIntervals &LIS, +const RegisterClassInfo &RegClassInfo) : ST(MF.getSubtarget()), TII(*ST.getInstrInfo()), TRI(*ST.getRegisterInfo()), MRI(MF.getRegInfo()), VRM(VRM), LRM(LRM), -LIS(LIS) {} +LIS(LIS), RegClassInfo(RegClassInfo) {} bool isRewriteCandidate(const MachineInstr &MI) const { return TII.isMAI(MI) && AMDGPU::getMFMASrcCVDstAGPROp(MI.getOpcode()) != -1; @@ -64,10 +64,10 @@ class AMDGPURewriteAGPRCopyMFMAImpl { /// Compute the register class constraints based on the uses of \p Reg, /// excluding uses from \p ExceptMI. This should be nearly identical to /// MachineRegisterInfo::recomputeRegClass. - const TargetRegisterClass * - recomputeRegClassExceptRewritable(Register Reg, -const TargetRegisterClass *OldRC, -const TargetRegisterClass *NewRC) const; + const TargetRegisterClass *recomputeRegClassExceptRewritable( + Register Reg, const TargetRegisterClass *OldRC, + const TargetRegisterClass *NewRC, + SmallVectorImpl &RewriteCandidates) const; bool run(MachineFunction &MF) const; }; @@ -75,7 +75,8 @@ class AMDGPURewriteAGPRCopyMFMAImpl { const TargetRegisterClass * AMDGPURewriteAGPRCopyMFMAImpl::recomputeRegClassExceptRewritable( Register Reg, const TargetRegisterClass *OldRC, -const TargetRegisterClass *NewRC) const { +const TargetRegisterClass *NewRC, +SmallVectorImpl &RewriteCandidates) const { // Accumulate constraints from all uses. 
for (MachineOperand &MO : MRI.reg_nodbg_operands(Reg)) { @@ -86,8 +87,11 @@ AMDGPURewriteAGPRCopyMFMAImpl::recomputeRegClassExceptRewritable( // effects of rewrite candidates. It just so happens that we can use either // AGPR or VGPR in src0/src1, so don't bother checking the constraint // effects of the individual operands. -if (isRewriteCandidate(*MI)) +if (isRewriteCandidate(*MI)) { + if (!is_contained(RewriteCandidates, MI)) +RewriteCandidates.push_back(MI); continue; +} unsigned OpNo = &MO - &MI->getOperand(0); NewRC = MI->getRegClassConstraintEffect(OpNo, NewRC, &TII, &TRI); @@ -98,6 +102,58 @@ AMDGPURewriteAGPRCopyMFMAImpl::recomputeRegClassExceptRewritable( return NewRC; } +/// Attempt to reassign the registers in \p InterferingRegs to be AGPRs, with a +/// preference to use \p PhysReg first. Returns false if the reassignments +/// cannot be trivially performed. +bool AMDGPURewriteAGPRCopyMFMAImpl::attemptReassignmentsToAGPR( +SmallSetVector &InterferingRegs, MCPhysReg PrefPhysReg) const { + // FIXME: The ordering may matter here
[llvm-branch-commits] [llvm] AMDGPU: Add a few missing mfma rewrite tests (PR #149026)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/149026 >From 15d9c6ac5705ebceb5c3a8656b2392caf8da6b13 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 16 Jul 2025 13:06:08 +0900 Subject: [PATCH] AMDGPU: Add a few missing mfma rewrite tests Test other splitting situations that appear in greedy. This includes ensuring we have a case that hits a local split and instruction split (most of the tests hit the region split path). Also test a few cases where the final result isn't fully used, resulting in partial copy bundles instead of a simple full copy. Test physreg and virtreg agpr interference with a reassignment candidate. --- ...class-vgpr-mfma-to-agpr-negative-tests.mir | 524 ++ ...class-vgpr-mfma-to-av-with-load-source.mir | 404 +- 2 files changed, 791 insertions(+), 137 deletions(-) diff --git a/llvm/test/CodeGen/AMDGPU/inflate-reg-class-vgpr-mfma-to-agpr-negative-tests.mir b/llvm/test/CodeGen/AMDGPU/inflate-reg-class-vgpr-mfma-to-agpr-negative-tests.mir index 3e005df59914e..b4716a293284a 100644 --- a/llvm/test/CodeGen/AMDGPU/inflate-reg-class-vgpr-mfma-to-agpr-negative-tests.mir +++ b/llvm/test/CodeGen/AMDGPU/inflate-reg-class-vgpr-mfma-to-agpr-negative-tests.mir @@ -20,6 +20,10 @@ ret void } + define amdgpu_kernel void @inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_physreg_src2() #0 { +ret void + } + define amdgpu_kernel void @inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_src2_different_subreg() #0 { ret void } @@ -28,7 +32,24 @@ ret void } + define amdgpu_kernel void @inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_chain_no_agprs_first() #1 { +ret void + } + + define amdgpu_kernel void @inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_chain_no_agprs_second() #1 { +ret void + } + + define amdgpu_kernel void @inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_chain_no_agprs_first_physreg() #1 { +ret void + } + + define amdgpu_kernel void 
@inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_chain_no_agprs_second_physreg() #1 { +ret void + } + attributes #0 = { "amdgpu-wave-limiter"="true" "amdgpu-waves-per-eu"="8,8" } + attributes #1 = { "amdgpu-wave-limiter"="true" "amdgpu-waves-per-eu"="10,10" } ... # Inflate pattern, except the defining instruction isn't an MFMA. @@ -407,6 +428,89 @@ body: | ... +# Non-mac variant, src2 is a physical register +--- +name: inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_physreg_src2 +tracksRegLiveness: true +machineFunctionInfo: + isEntryFunction: true + stackPtrOffsetReg: '$sgpr32' + occupancy: 10 + sgprForEXECCopy: '$sgpr100_sgpr101' +body: | + ; CHECK-LABEL: name: inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_physreg_src2 + ; CHECK: bb.0: + ; CHECK-NEXT: successors: %bb.1(0x8000) + ; CHECK-NEXT: {{ $}} + ; CHECK-NEXT: S_NOP 0, implicit-def $agpr0 + ; CHECK-NEXT: renamable $sgpr0 = S_MOV_B32 0 + ; CHECK-NEXT: renamable $vgpr8 = V_MOV_B32_e32 0, implicit $exec + ; CHECK-NEXT: renamable $sgpr1 = COPY renamable $sgpr0 + ; CHECK-NEXT: renamable $vgpr0_vgpr1 = COPY killed renamable $sgpr0_sgpr1 + ; CHECK-NEXT: renamable $vcc = S_AND_B64 $exec, -1, implicit-def dead $scc + ; CHECK-NEXT: dead renamable $vgpr9 = COPY renamable $vgpr8 + ; CHECK-NEXT: {{ $}} + ; CHECK-NEXT: bb.1: + ; CHECK-NEXT: successors: %bb.1(0x4000), %bb.2(0x4000) + ; CHECK-NEXT: liveins: $vcc, $vgpr0_vgpr1 + ; CHECK-NEXT: {{ $}} + ; CHECK-NEXT: early-clobber renamable $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 = V_MFMA_F32_32X32X8F16_vgprcd_e64 $vgpr0_vgpr1, $vgpr0_vgpr1, undef $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15, 0, 0, 0, implicit $mode, implicit $exec + ; CHECK-NEXT: S_CBRANCH_VCCNZ %bb.1, implicit $vcc + ; CHECK-NEXT: S_BRANCH %bb.2 + ; CHECK-NEXT: {{ $}} + ; CHECK-NEXT: bb.2: + ; CHECK-NEXT: liveins: 
$vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17:0x + ; CHECK-NEXT: {{ $}} + ; CHECK-NEXT: renamable $agpr0_agpr1_agpr2_agpr3_agpr4_agpr5_agpr6_agpr7_agpr8_agpr9_agpr10_agpr11_agpr12_agpr13_agpr14_agpr15 = COPY killed renamable $vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17 + ; CHECK-NEXT: S_NOP 0, implicit-def $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 + ; CHECK-NEXT: S_NOP 0, implicit-def $vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15 + ; CHECK-NEXT: S_NOP 0, implicit-def $vgpr16_vgpr17_vgpr18_vgpr19_vgpr20_vgpr21_vgpr22_vgpr23 + ; CHECK-NEXT: S_NOP 0, implicit-def $vgpr24_vgpr25_vgpr26_vgpr27_vgpr28_vgpr29_vgpr30_vgpr31 + ; CHECK-NEXT: S_NOP 0, implicit-def $vgp
[llvm-branch-commits] [clang] release/21.x: [clang-format] Add AfterNot to SpaceBeforeParensOptions (#150367) (PR #150457)
https://github.com/HazardyKnusperkeks approved this pull request. https://github.com/llvm/llvm-project/pull/150457 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Fix FileCheck prefix in the histogram test. (PR #150506)
https://github.com/snehasish edited https://github.com/llvm/llvm-project/pull/150506
[llvm-branch-commits] [llvm] Fix FileCheck prefix in the histogram test. (PR #150506)
https://github.com/snehasish ready_for_review https://github.com/llvm/llvm-project/pull/150506
[llvm-branch-commits] [llvm] Fix FileCheck prefix in the histogram test. (PR #150506)
snehasish wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#150506** 👈 (View in Graphite)
* **#150375**
* **#147854**
* `main`

This stack of pull requests is managed by Graphite. Learn more about stacking.

https://github.com/llvm/llvm-project/pull/150506
[llvm-branch-commits] [clang] release/21.x: [clang-format] Fix a bug in `DerivePointerAlignment: true` (#150387) (PR #150458)
https://github.com/HazardyKnusperkeks approved this pull request. https://github.com/llvm/llvm-project/pull/150458
[llvm-branch-commits] [llvm] [LoopPeel] Fix branch weights' effect on block frequencies (PR #128785)
https://github.com/jdenny-ornl edited https://github.com/llvm/llvm-project/pull/128785
[llvm-branch-commits] [llvm] [LoopPeel] Fix branch weights' effect on block frequencies (PR #128785)
@@ -7866,6 +7866,17 @@
 The attributes in this metadata is added to all followup loops of the loop distribution pass. See :ref:`Transformation Metadata ` for details.
+'``llvm.loop.estimated_trip_count``' Metadata
+
+
+This metadata records the loop's estimated trip count. If it is not present, a
+loop's estimated trip count should be computed from any ``branch_weights``
+metadata attached to the latch block's branch instruction.
+
+Thus, this metadata frees loop transformations to compute latch branch weights
+solely for the purpose of maintaining accurate block frequencies instead of
+requiring the branch weights to always serve both roles.

jdenny-ornl wrote:

This is now part of PR #148758.

https://github.com/llvm/llvm-project/pull/128785
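The fallback rule in the quoted LangRef text can be sketched as below. This is a simplified model, not LLVM's implementation: `estimateFromBranchWeights` and `loopEstimatedTripCount` are hypothetical names, the metadata is modeled as a plain optional integer, and rounding the weight ratio to the nearest integer is an assumption that this thread does not confirm.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not an LLVM API): derive a trip count estimate from the
// two weights on the latch branch. BackedgeWeight is the weight of the edge
// back to the loop header, ExitWeight the weight of the edge leaving the loop.
std::optional<uint64_t> estimateFromBranchWeights(uint64_t BackedgeWeight,
                                                  uint64_t ExitWeight) {
  // With no recorded exits, the ratio is undefined, so give no estimate.
  if (ExitWeight == 0)
    return std::nullopt;
  // Average number of backedges taken per loop exit, rounded to nearest
  // (assumed rounding convention).
  uint64_t BackedgesTaken = (BackedgeWeight + ExitWeight / 2) / ExitWeight;
  // The body executes once more than the backedge is taken.
  return BackedgesTaken + 1;
}

// Hypothetical helper (not an LLVM API): prefer the explicit
// llvm.loop.estimated_trip_count value when present, otherwise fall back to
// the latch branch weights, as the quoted documentation describes.
std::optional<uint64_t>
loopEstimatedTripCount(std::optional<uint64_t> MetadataValue,
                       uint64_t BackedgeWeight, uint64_t ExitWeight) {
  if (MetadataValue)
    return MetadataValue;
  return estimateFromBranchWeights(BackedgeWeight, ExitWeight);
}
```

Under this model, a transformation can rewrite the latch weights to keep block frequencies accurate without disturbing the estimate, because the explicit value takes precedence whenever it is present.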
[llvm-branch-commits] [llvm] [LoopPeel] Fix branch weights' effect on block frequencies (PR #128785)
@@ -7866,6 +7866,17 @@
 The attributes in this metadata is added to all followup loops of the loop distribution pass. See :ref:`Transformation Metadata ` for details.
+'``llvm.loop.estimated_trip_count``' Metadata

jdenny-ornl wrote:

This is now part of PR #148758.

https://github.com/llvm/llvm-project/pull/128785
[llvm-branch-commits] [llvm] [LoopPeel] Fix branch weights' effect on block frequencies (PR #128785)
@@ -7866,6 +7866,17 @@
 The attributes in this metadata is added to all followup loops of the loop distribution pass. See :ref:`Transformation Metadata ` for details.
+'``llvm.loop.estimated_trip_count``' Metadata
+
+

jdenny-ornl wrote:

This is now part of PR #148758.

https://github.com/llvm/llvm-project/pull/128785
[llvm-branch-commits] [llvm] [LoopPeel] Fix branch weights' effect on block frequencies (PR #128785)
@@ -850,27 +852,35 @@ llvm::getLoopEstimatedTripCount(Loop *L,
           getEstimatedTripCount(LatchBranch, L, ExitWeight)) {
     if (EstimatedLoopInvocationWeight)
       *EstimatedLoopInvocationWeight = ExitWeight;
+    if (auto EstimatedTripCount =

jdenny-ornl wrote:

I have made PR #148758 the base for this PR, which is now much simpler.

https://github.com/llvm/llvm-project/pull/128785
[llvm-branch-commits] [clang] [clang] callee_type metadata for indirect calls (PR #117036)
@@ -2869,9 +2870,23 @@ static void setLinkageForGV(llvm::GlobalValue *GV, const NamedDecl *ND) {
     GV->setLinkage(llvm::GlobalValue::ExternalWeakLinkage);
 }
+static bool hasExistingGeneralizedTypeMD(llvm::Function *F) {
+  llvm::MDNode *MD = F->getMetadata(llvm::LLVMContext::MD_type);
+  if (!MD)
+    return false;
+  return MD->hasGeneralizedMDString();
+}
+
 void CodeGenModule::createFunctionTypeMetadataForIcall(const FunctionDecl *FD,
                                                        llvm::Function *F) {
-  // Only if we are checking indirect calls.
+  if (CodeGenOpts.CallGraphSection && !hasExistingGeneralizedTypeMD(F) &&
+      (!F->hasLocalLinkage() ||
+       F->getFunction().hasAddressTaken(nullptr, /*IgnoreCallbackUses=*/true,
+                                        /*IgnoreAssumeLikeCalls=*/true,
+                                        /*IgnoreLLVMUsed=*/false)))
+    F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));

ilovepi wrote:

I think this still needs to be addressed...

https://github.com/llvm/llvm-project/pull/117036
[llvm-branch-commits] [llvm] [llvm][AsmPrinter] Emit call graph section (PR #87576)
@@ -1,40 +1,43 @@
 ;; Test if temporary labels are generated for each indirect callsite with a callee_type metadata.
-;; Test if the .callgraph section contains the numerical callee type id for each of the temporary
-;; labels generated.
+;; Test if the .callgraph section contains the MD5 hash of callee type ids generated from
+;; generalized type id strings.
 ; RUN: llc -mtriple=x86_64-unknown-linux --call-graph-section -o - < %s | FileCheck %s
 ; CHECK: ball:
-; CHECK-NEXT: .Lfunc_begin0:
+; CHECK-NEXT: [[LABEL_FUNC:\.Lfunc_begin[0-9]+]]:
 define ptr @ball() {
 entry:
   %fp_foo_val = load ptr, ptr null, align 8
-  ; CHECK: .Ltmp0:
+  ; CHECK: [[LABEL_TMP0:\.Ltmp[0-9]+]]:

ilovepi wrote:

Do you care that it's `.Ltmp`? I'd assume you just want to match anything after `.L`...

https://github.com/llvm/llvm-project/pull/87576
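The reviewer's question comes down to how strict the label pattern should be. The sketch below contrasts the two options using `std::regex` in POSIX extended mode as a stand-in for FileCheck's regex syntax. `matchesTmpLabel` and `matchesAnyLocalLabel` are hypothetical helpers, and the looser pattern `\.L[A-Za-z0-9_]+` is only one assumed reading of "anything after `.L`", not a pattern proposed in the thread.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: strict form, as written in the patch. The label must
// be a ".Ltmp<N>" temporary.
bool matchesTmpLabel(const std::string &Label) {
  static const std::regex Strict(R"(\.Ltmp[0-9]+)", std::regex::extended);
  return std::regex_match(Label, Strict);
}

// Hypothetical helper: looser form, accepting any assembler-local label that
// starts with ".L" followed by letters, digits, or underscores.
bool matchesAnyLocalLabel(const std::string &Label) {
  static const std::regex Loose(R"(\.L[A-Za-z0-9_]+)", std::regex::extended);
  return std::regex_match(Label, Loose);
}
```

The strict form pins the test to the current label naming scheme, while the loose form would keep passing if the temporary label prefix changed; which is preferable depends on whether the `.Ltmp` prefix is itself part of what the test is meant to check.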