[llvm-branch-commits] [llvm] Reland "RegisterCoalescer: Add implicit-def of super register when coalescing SUBREG_TO_REG" (PR #134408)

2025-07-24 Thread Sander de Smalen via llvm-branch-commits

sdesmalen-arm wrote:

Gentle ping @arsenm and @qcolombet 

I know that @arsenm is in favour of moving away from `SUBREG_TO_REG` entirely, 
but at the moment it is still used in many places by multiple targets and this 
PR fixes a genuine bug that is exposed with sub-reg liveness tracking.

https://github.com/llvm/llvm-project/pull/134408
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/21.x: [AArch64, TTI] Disable RealUse check for vector insert/extract costs and Apple CPUs. (#146526) (PR #149815)

2025-07-24 Thread David Green via llvm-branch-commits

https://github.com/davemgreen approved this pull request.

Thanks. The commit message could now do with an adjustment. Otherwise LGTM.

https://github.com/llvm/llvm-project/pull/149815


[llvm-branch-commits] [llvm] [AMDGPU][NPM] Add isRequired to passes missing it (PR #148115)

2025-07-24 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated 
https://github.com/llvm/llvm-project/pull/148115

From fe653178dc8c6cfd0929d5ca5dc7c16e224c696a Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Thu, 10 Jul 2025 18:53:39 +0530
Subject: [PATCH] [AMDGPU][NPM] Add isRequired to passes missing it

---
 llvm/include/llvm/Transforms/Scalar/StructurizeCFG.h   | 1 +
 llvm/include/llvm/Transforms/Utils/LoopSimplify.h  | 1 +
 llvm/lib/Target/AMDGPU/AMDGPU.h| 3 +++
 llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h| 1 +
 llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.h | 1 +
 llvm/lib/Target/AMDGPU/GCNNSAReassign.h| 1 +
 llvm/lib/Target/AMDGPU/GCNPreRALongBranchReg.h | 1 +
 llvm/lib/Target/AMDGPU/GCNRewritePartialRegUses.h  | 1 +
 llvm/lib/Target/AMDGPU/SIFixSGPRCopies.h   | 1 +
 llvm/lib/Target/AMDGPU/SIFixVGPRCopies.h   | 1 +
 llvm/lib/Target/AMDGPU/SILowerControlFlow.h| 1 +
 llvm/lib/Target/AMDGPU/SILowerSGPRSpills.h | 1 +
 llvm/lib/Target/AMDGPU/SILowerWWMCopies.h  | 1 +
 llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.h  | 1 +
 llvm/lib/Target/AMDGPU/SIWholeQuadMode.h   | 1 +
 llvm/test/Feature/optnone-opt.ll   | 1 -
 16 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/llvm/include/llvm/Transforms/Scalar/StructurizeCFG.h b/llvm/include/llvm/Transforms/Scalar/StructurizeCFG.h
index f68067d935458..f50511c9c0972 100644
--- a/llvm/include/llvm/Transforms/Scalar/StructurizeCFG.h
+++ b/llvm/include/llvm/Transforms/Scalar/StructurizeCFG.h
@@ -23,6 +23,7 @@ struct StructurizeCFGPass : PassInfoMixin<StructurizeCFGPass> {
  function_ref<StringRef(StringRef)> MapClassName2PassName);
 
   PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
+  static bool isRequired() { return true; }
 };
 } // namespace llvm
 
diff --git a/llvm/include/llvm/Transforms/Utils/LoopSimplify.h b/llvm/include/llvm/Transforms/Utils/LoopSimplify.h
index 8f3fa1f2b18ef..d179002fd6a27 100644
--- a/llvm/include/llvm/Transforms/Utils/LoopSimplify.h
+++ b/llvm/include/llvm/Transforms/Utils/LoopSimplify.h
@@ -54,6 +54,7 @@ class ScalarEvolution;
 class LoopSimplifyPass : public PassInfoMixin<LoopSimplifyPass> {
 public:
   LLVM_ABI PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
+  static bool isRequired() { return true; }
 };
 
 /// Simplify each loop in a loop nest recursively.
diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h
index 007b481f84960..10507aab9132e 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.h
@@ -90,6 +90,7 @@ class SILowerI1CopiesPass : public PassInfoMixin<SILowerI1CopiesPass> {
   SILowerI1CopiesPass() = default;
   PreservedAnalyses run(MachineFunction &MF,
 MachineFunctionAnalysisManager &MFAM);
+  static bool isRequired() { return true; }
 };
 
 void initializeAMDGPUDAGToDAGISelLegacyPass(PassRegistry &);
@@ -368,6 +369,7 @@ class SIModeRegisterPass : public PassInfoMixin<SIModeRegisterPass> {
 public:
   SIModeRegisterPass() {}
   PreservedAnalyses run(MachineFunction &F, MachineFunctionAnalysisManager &AM);
+  static bool isRequired() { return true; }
 };
 
 class SIMemoryLegalizerPass : public PassInfoMixin<SIMemoryLegalizerPass> {
@@ -480,6 +482,7 @@ class SIAnnotateControlFlowPass
 public:
   SIAnnotateControlFlowPass(const AMDGPUTargetMachine &TM) : TM(TM) {}
   PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
+  static bool isRequired() { return true; }
 };
 
 void initializeSIAnnotateControlFlowLegacyPass(PassRegistry &);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h
index 6123d75d7b616..38fde6ee2f4a5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.h
@@ -304,6 +304,7 @@ class AMDGPUISelDAGToDAGPass : public SelectionDAGISelPass {
 
   PreservedAnalyses run(MachineFunction &MF,
 MachineFunctionAnalysisManager &MFAM);
+  static bool isRequired() { return true; }
 };
 
 class AMDGPUDAGToDAGISelLegacy : public SelectionDAGISelLegacy {
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.h b/llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.h
index 2fd98a2ee1a93..d6fb0e53e1169 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUUnifyDivergentExitNodes.h
@@ -29,6 +29,7 @@ class AMDGPUUnifyDivergentExitNodesPass
 : public PassInfoMixin<AMDGPUUnifyDivergentExitNodesPass> {
 public:
   PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
+  static bool isRequired() { return true; }
 };
 
 } // end namespace llvm
diff --git a/llvm/lib/Target/AMDGPU/GCNNSAReassign.h b/llvm/lib/Target/AMDGPU/GCNNSAReassign.h
index 97a72e7ddbb24..4f2abe0dd0086 100644
--- a/llvm/lib/Target/AMDGPU/GCNNSAReassign.h
+++ b/llvm/lib/Target/AMDGPU/GCNNSAReassign.h
@@ -16,6 +16,7 @@ class GCNNSAReassignPass : public PassInfoMixin<GCNNSAReassignPass> {
 public:
   PreservedAnalyses run(MachineFunctio
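
For readers unfamiliar with the hook this patch adds: under the new pass manager, a pass that does not report itself as required may be skipped, for example on `optnone` functions. The following is only a minimal, self-contained sketch of that idea and not LLVM's actual implementation. `ToyPassRunner`, `OptionalPass`, `RequiredPass`, `passIsRequired`, and `runPipeline` are hypothetical names, and the detection idiom is an assumption about how such a static hook can be queried.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Detects whether a pass type provides a static isRequired() hook.
template <typename T, typename = void>
struct HasIsRequired : std::false_type {};
template <typename T>
struct HasIsRequired<T, std::void_t<decltype(T::isRequired())>>
    : std::true_type {};

// Passes without the hook are treated as optional (skippable).
template <typename PassT> bool passIsRequired() {
  if constexpr (HasIsRequired<PassT>::value)
    return PassT::isRequired();
  else
    return false;
}

// Hypothetical pass that does not declare the hook.
struct OptionalPass {
  static const char *name() { return "optional"; }
};

// Hypothetical pass that declares itself required, as the patch does.
struct RequiredPass {
  static const char *name() { return "required"; }
  static bool isRequired() { return true; }
};

// Hypothetical runner: records which passes would run on one function.
struct ToyPassRunner {
  bool FunctionIsOptNone = false;
  std::vector<std::string> Ran;

  template <typename PassT> void run() {
    // Skippable passes are dropped for optnone functions.
    if (FunctionIsOptNone && !passIsRequired<PassT>())
      return;
    Ran.push_back(PassT::name());
  }
};

// Runs both toy passes and returns the names of those that executed.
inline std::vector<std::string> runPipeline(bool OptNone) {
  ToyPassRunner R;
  R.FunctionIsOptNone = OptNone;
  R.run<OptionalPass>();
  R.run<RequiredPass>();
  assert(!R.Ran.empty() && "the required pass always runs");
  return R.Ran;
}
```

In this sketch, `runPipeline(true)` keeps only the pass that defines `isRequired()`, which is the behaviour the patch relies on for passes that must run for correct code generation.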

[llvm-branch-commits] [clang] [llvm] [DirectX] Add Range Overlap validation to `DXILPostOptimizationValidation.cpp` (PR #148919)

2025-07-24 Thread via llvm-branch-commits

https://github.com/joaosaffran updated 
https://github.com/llvm/llvm-project/pull/148919

From 831dc1cab2662151e0c4a95883f6fb73afc595d4 Mon Sep 17 00:00:00 2001
From: joaosaffran 
Date: Tue, 15 Jul 2025 01:59:47 +
Subject: [PATCH 1/6] adding validation

---
 .../DXILPostOptimizationValidation.cpp| 152 --
 ...signature-validation-fail-overlap-range.ll |  16 ++
 2 files changed, 153 insertions(+), 15 deletions(-)
 create mode 100644 
llvm/test/CodeGen/DirectX/rootsignature-validation-fail-overlap-range.ll

diff --git a/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp 
b/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp
index a09c5ac353fed..e42d2bef62ba7 100644
--- a/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp
+++ b/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp
@@ -13,6 +13,7 @@
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/Analysis/DXILMetadataAnalysis.h"
 #include "llvm/Analysis/DXILResource.h"
+#include "llvm/Frontend/HLSL/RootSignatureValidations.h"
 #include "llvm/IR/DiagnosticInfo.h"
 #include "llvm/IR/Instructions.h"
 #include "llvm/IR/IntrinsicsDirectX.h"
@@ -209,6 +210,123 @@ getRootSignature(RootSignatureBindingInfo &RSBI,
   return RootSigDesc;
 }
 
+static void
+reportOverlappingRegisters(Module &M,
+   llvm::hlsl::rootsig::OverlappingRanges Overlap) {
+  const llvm::hlsl::rootsig::RangeInfo *Info = Overlap.A;
+  const llvm::hlsl::rootsig::RangeInfo *OInfo = Overlap.B;
+  SmallString<128> Message;
+  raw_svector_ostream OS(Message);
+  auto ResourceClassToString =
+  [](llvm::dxil::ResourceClass Class) -> const char * {
+switch (Class) {
+
+case ResourceClass::SRV:
+  return "SRV";
+case ResourceClass::UAV:
+  return "UAV";
+case ResourceClass::CBuffer:
+  return "CBuffer";
+case ResourceClass::Sampler:
+  return "Sampler";
+  break;
+}
+  };
+  OS << "register " << ResourceClassToString(Info->Class)
+ << " (space=" << Info->Space << ", register=" << Info->LowerBound << ")"
+ << " is overlapping with"
+ << " register " << ResourceClassToString(OInfo->Class)
+ << " (space=" << OInfo->Space << ", register=" << OInfo->LowerBound << ")"
+ << ", verify your root signature definition";
+
+  M.getContext().diagnose(DiagnosticInfoGeneric(Message));
+}
+
+static bool reportOverlappingRanges(Module &M,
+const mcdxbc::RootSignatureDesc &RSD) {
+  using namespace llvm::hlsl::rootsig;
+
+  llvm::SmallVector<RangeInfo> Infos;
+  // Helper to map DescriptorRangeType to ResourceClass
+  auto RangeToResourceClass = [](uint32_t RangeType) -> ResourceClass {
+using namespace dxbc;
+switch (static_cast<DescriptorRangeType>(RangeType)) {
+case DescriptorRangeType::SRV:
+  return ResourceClass::SRV;
+case DescriptorRangeType::UAV:
+  return ResourceClass::UAV;
+case DescriptorRangeType::CBV:
+  return ResourceClass::CBuffer;
+case DescriptorRangeType::Sampler:
+  return ResourceClass::Sampler;
+}
+  };
+
+  // Helper to map RootParameterType to ResourceClass
+  auto ParameterToResourceClass = [](uint32_t Type) -> ResourceClass {
+using namespace dxbc;
+switch (static_cast<RootParameterType>(Type)) {
+case RootParameterType::SRV:
+  return ResourceClass::SRV;
+case RootParameterType::UAV:
+  return ResourceClass::UAV;
+case RootParameterType::CBV:
+  return ResourceClass::CBuffer;
+default:
+  llvm_unreachable("Unknown RootParameterType");
+}
+  };
+
+  for (size_t I = 0; I < RSD.ParametersContainer.size(); I++) {
+const auto &[Type, Loc] =
+RSD.ParametersContainer.getTypeAndLocForParameter(I);
+const auto &Header = RSD.ParametersContainer.getHeader(I);
+switch (Type) {
+case llvm::to_underlying(dxbc::RootParameterType::SRV):
+case llvm::to_underlying(dxbc::RootParameterType::UAV):
+case llvm::to_underlying(dxbc::RootParameterType::CBV): {
+  dxbc::RTS0::v2::RootDescriptor Desc =
+  RSD.ParametersContainer.getRootDescriptor(Loc);
+
+  RangeInfo Info;
+  Info.Space = Desc.RegisterSpace;
+  Info.LowerBound = Desc.ShaderRegister;
+  Info.UpperBound = Info.LowerBound;
+  Info.Class = ParameterToResourceClass(Type);
+  Info.Visibility = (dxbc::ShaderVisibility)Header.ShaderVisibility;
+
+  Infos.push_back(Info);
+  break;
+}
+case llvm::to_underlying(dxbc::RootParameterType::DescriptorTable): {
+  const mcdxbc::DescriptorTable &Table =
+  RSD.ParametersContainer.getDescriptorTable(Loc);
+
+  for (const dxbc::RTS0::v2::DescriptorRange &Range : Table.Ranges) {
+RangeInfo Info;
+Info.Space = Range.RegisterSpace;
+Info.LowerBound = Range.BaseShaderRegister;
+Info.UpperBound = Info.LowerBound + ((Range.NumDescriptors == ~0U)
+ ? Range.NumDescriptors
+ : Range.NumD

[llvm-branch-commits] [clang] [llvm] [DirectX] Add Range Overlap validation to `DXILPostOptimizationValidation.cpp` (PR #148919)

2025-07-24 Thread via llvm-branch-commits


@@ -0,0 +1,15 @@
+; RUN: not opt -S -passes='dxil-post-optimization-validation' 
-mtriple=dxil-pc-shadermodel6.6-compute %s 2>&1 | FileCheck %s
+; CHECK: error: register CBuffer (space=0, register=0) is overlapping with 
register CBuffer (space=0, register=2), verify your root signature definition
+
+define void @CSMain() "hlsl.shader"="compute" {
+entry:
+  ret void
+}
+
+; RootConstants(num32BitConstants=4, b2), DescriptorTable(CBV(b0, 
numDescriptors=3))

joaosaffran wrote:

I've updated the test: I've added one for descriptor tables and one for root 
descriptors. Hopefully that makes it clearer.
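
The overlap that the quoted test expects (`b2` against `CBV(b0, numDescriptors=3)`) can be illustrated with a small inclusive-interval check. This is a simplified sketch, not the code from the PR: `ToyRange`, `makeRange`, and `overlaps` are hypothetical names, the upper bound is assumed to be `Base + NumDescriptors - 1`, and resource class, shader visibility, and unbounded ranges are deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for a register range; bounds are inclusive.
struct ToyRange {
  uint32_t Space;
  uint32_t LowerBound;
  uint32_t UpperBound;
};

// Assumption: N descriptors starting at Base cover [Base, Base + N - 1].
inline ToyRange makeRange(uint32_t Space, uint32_t Base,
                          uint32_t NumDescriptors) {
  assert(NumDescriptors > 0 && "a range must cover at least one register");
  return ToyRange{Space, Base, Base + NumDescriptors - 1};
}

// Two inclusive ranges in the same space overlap when each one starts
// no later than the other one ends.
inline bool overlaps(const ToyRange &A, const ToyRange &B) {
  return A.Space == B.Space && A.LowerBound <= B.UpperBound &&
         B.LowerBound <= A.UpperBound;
}
```

With these assumptions, `makeRange(0, 2, 1)` (register 2) and `makeRange(0, 0, 3)` (registers 0 through 2) overlap, which matches the diagnostic in the CHECK line.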

https://github.com/llvm/llvm-project/pull/148919


[llvm-branch-commits] [clang] [llvm] [DirectX] Add Range Overlap validation to `DXILPostOptimizationValidation.cpp` (PR #148919)

2025-07-24 Thread via llvm-branch-commits

https://github.com/joaosaffran updated 
https://github.com/llvm/llvm-project/pull/148919

From 831dc1cab2662151e0c4a95883f6fb73afc595d4 Mon Sep 17 00:00:00 2001
From: joaosaffran 
Date: Tue, 15 Jul 2025 01:59:47 +
Subject: [PATCH 1/7] adding validation

---
 .../DXILPostOptimizationValidation.cpp| 152 --
 ...signature-validation-fail-overlap-range.ll |  16 ++
 2 files changed, 153 insertions(+), 15 deletions(-)
 create mode 100644 
llvm/test/CodeGen/DirectX/rootsignature-validation-fail-overlap-range.ll

diff --git a/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp 
b/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp
index a09c5ac353fed..e42d2bef62ba7 100644
--- a/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp
+++ b/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp
@@ -13,6 +13,7 @@
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/Analysis/DXILMetadataAnalysis.h"
 #include "llvm/Analysis/DXILResource.h"
+#include "llvm/Frontend/HLSL/RootSignatureValidations.h"
 #include "llvm/IR/DiagnosticInfo.h"
 #include "llvm/IR/Instructions.h"
 #include "llvm/IR/IntrinsicsDirectX.h"
@@ -209,6 +210,123 @@ getRootSignature(RootSignatureBindingInfo &RSBI,
   return RootSigDesc;
 }
 
+static void
+reportOverlappingRegisters(Module &M,
+   llvm::hlsl::rootsig::OverlappingRanges Overlap) {
+  const llvm::hlsl::rootsig::RangeInfo *Info = Overlap.A;
+  const llvm::hlsl::rootsig::RangeInfo *OInfo = Overlap.B;
+  SmallString<128> Message;
+  raw_svector_ostream OS(Message);
+  auto ResourceClassToString =
+  [](llvm::dxil::ResourceClass Class) -> const char * {
+switch (Class) {
+
+case ResourceClass::SRV:
+  return "SRV";
+case ResourceClass::UAV:
+  return "UAV";
+case ResourceClass::CBuffer:
+  return "CBuffer";
+case ResourceClass::Sampler:
+  return "Sampler";
+  break;
+}
+  };
+  OS << "register " << ResourceClassToString(Info->Class)
+ << " (space=" << Info->Space << ", register=" << Info->LowerBound << ")"
+ << " is overlapping with"
+ << " register " << ResourceClassToString(OInfo->Class)
+ << " (space=" << OInfo->Space << ", register=" << OInfo->LowerBound << ")"
+ << ", verify your root signature definition";
+
+  M.getContext().diagnose(DiagnosticInfoGeneric(Message));
+}
+
+static bool reportOverlappingRanges(Module &M,
+const mcdxbc::RootSignatureDesc &RSD) {
+  using namespace llvm::hlsl::rootsig;
+
+  llvm::SmallVector<RangeInfo> Infos;
+  // Helper to map DescriptorRangeType to ResourceClass
+  auto RangeToResourceClass = [](uint32_t RangeType) -> ResourceClass {
+using namespace dxbc;
+switch (static_cast<DescriptorRangeType>(RangeType)) {
+case DescriptorRangeType::SRV:
+  return ResourceClass::SRV;
+case DescriptorRangeType::UAV:
+  return ResourceClass::UAV;
+case DescriptorRangeType::CBV:
+  return ResourceClass::CBuffer;
+case DescriptorRangeType::Sampler:
+  return ResourceClass::Sampler;
+}
+  };
+
+  // Helper to map RootParameterType to ResourceClass
+  auto ParameterToResourceClass = [](uint32_t Type) -> ResourceClass {
+using namespace dxbc;
+switch (static_cast<RootParameterType>(Type)) {
+case RootParameterType::SRV:
+  return ResourceClass::SRV;
+case RootParameterType::UAV:
+  return ResourceClass::UAV;
+case RootParameterType::CBV:
+  return ResourceClass::CBuffer;
+default:
+  llvm_unreachable("Unknown RootParameterType");
+}
+  };
+
+  for (size_t I = 0; I < RSD.ParametersContainer.size(); I++) {
+const auto &[Type, Loc] =
+RSD.ParametersContainer.getTypeAndLocForParameter(I);
+const auto &Header = RSD.ParametersContainer.getHeader(I);
+switch (Type) {
+case llvm::to_underlying(dxbc::RootParameterType::SRV):
+case llvm::to_underlying(dxbc::RootParameterType::UAV):
+case llvm::to_underlying(dxbc::RootParameterType::CBV): {
+  dxbc::RTS0::v2::RootDescriptor Desc =
+  RSD.ParametersContainer.getRootDescriptor(Loc);
+
+  RangeInfo Info;
+  Info.Space = Desc.RegisterSpace;
+  Info.LowerBound = Desc.ShaderRegister;
+  Info.UpperBound = Info.LowerBound;
+  Info.Class = ParameterToResourceClass(Type);
+  Info.Visibility = (dxbc::ShaderVisibility)Header.ShaderVisibility;
+
+  Infos.push_back(Info);
+  break;
+}
+case llvm::to_underlying(dxbc::RootParameterType::DescriptorTable): {
+  const mcdxbc::DescriptorTable &Table =
+  RSD.ParametersContainer.getDescriptorTable(Loc);
+
+  for (const dxbc::RTS0::v2::DescriptorRange &Range : Table.Ranges) {
+RangeInfo Info;
+Info.Space = Range.RegisterSpace;
+Info.LowerBound = Range.BaseShaderRegister;
+Info.UpperBound = Info.LowerBound + ((Range.NumDescriptors == ~0U)
+ ? Range.NumDescriptors
+ : Range.NumD

[llvm-branch-commits] [llvm] ecd793c - [AMDGPU] Add v_fma_mix_f32_f16 as an alias of v_fma_mix_f32 on gfx1250 (#150502)

2025-07-24 Thread via llvm-branch-commits

Author: Changpeng Fang
Date: 2025-07-24T12:42:30-07:00
New Revision: ecd793cbb1888507850b806699e97fc978d15dd7

URL: 
https://github.com/llvm/llvm-project/commit/ecd793cbb1888507850b806699e97fc978d15dd7
DIFF: 
https://github.com/llvm/llvm-project/commit/ecd793cbb1888507850b806699e97fc978d15dd7.diff

LOG: [AMDGPU] Add v_fma_mix_f32_f16 as an alias of v_fma_mix_f32 on gfx1250 
(#150502)

Co-authored-by: Jay Foad 

Added: 
llvm/test/MC/AMDGPU/gfx1250_asm_vop3p_alias.s

Modified: 
llvm/lib/Target/AMDGPU/VOP3PInstructions.td

Removed: 




diff --git a/llvm/lib/Target/AMDGPU/VOP3PInstructions.td b/llvm/lib/Target/AMDGPU/VOP3PInstructions.td
index 7017da9dc3521..c812dc9850514 100644
--- a/llvm/lib/Target/AMDGPU/VOP3PInstructions.td
+++ b/llvm/lib/Target/AMDGPU/VOP3PInstructions.td
@@ -2277,6 +2277,9 @@ defm V_FMA_MIX_F32_BF16 : VOP3P_Realtriple;
 defm V_FMA_MIXLO_BF16   : VOP3P_Realtriple;
 defm V_FMA_MIXHI_BF16   : VOP3P_Realtriple;
 
+let AssemblerPredicate = isGFX1250Plus in
+def : AMDGPUMnemonicAlias<"v_fma_mix_f32_f16",  "v_fma_mix_f32">;
+
 defm V_PK_MINIMUM_F16 : VOP3P_Real_gfx12<0x1d>;
 defm V_PK_MAXIMUM_F16 : VOP3P_Real_gfx12<0x1e>;
 

diff --git a/llvm/test/MC/AMDGPU/gfx1250_asm_vop3p_alias.s b/llvm/test/MC/AMDGPU/gfx1250_asm_vop3p_alias.s
new file mode 100644
index 0..8d5c11482f909
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/gfx1250_asm_vop3p_alias.s
@@ -0,0 +1,5 @@
+// NOTE: Assertions have been autogenerated by utils/update_mc_test_checks.py 
UTC_ARGS: --version 5
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1250 -show-encoding < %s | FileCheck 
--check-prefix=GFX1250 %s
+
+v_fma_mix_f32_f16 v5, v1, v2, s3
+// GFX1250: v_fma_mix_f32 v5, v1, v2, s3; encoding: 
[0x05,0x00,0x20,0xcc,0x01,0x05,0x0e,0x00]





[llvm-branch-commits] [clang-tools-extra] d69ea93 - Merge branch 'main' into revert-143441-atomic-control-frontend

2025-07-24 Thread via llvm-branch-commits

Author: Kiran Chandramohan
Date: 2025-07-24T20:43:47+01:00
New Revision: d69ea933c6f243a17d37609d4ac29712dd0b20c6

URL: 
https://github.com/llvm/llvm-project/commit/d69ea933c6f243a17d37609d4ac29712dd0b20c6
DIFF: 
https://github.com/llvm/llvm-project/commit/d69ea933c6f243a17d37609d4ac29712dd0b20c6.diff

LOG: Merge branch 'main' into revert-143441-atomic-control-frontend

Added: 
llvm/test/MC/AMDGPU/gfx1250_asm_vop3p_alias.s

Modified: 
clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
clang-tools-extra/docs/ReleaseNotes.rst

clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp
clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp
clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print-absl.cpp

clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print-custom.cpp
llvm/lib/Target/AMDGPU/VOP3PInstructions.td

Removed: 




diff --git a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
index 7f4ccca84faa5..e1c1bee97f6d4 100644
--- a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
+++ b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
@@ -207,13 +207,9 @@ FormatStringConverter::FormatStringConverter(
   ArgsOffset(FormatArgOffset + 1), LangOpts(LO) {
   assert(ArgsOffset <= NumArgs);
   FormatExpr = llvm::dyn_cast(
-  Args[FormatArgOffset]->IgnoreImplicitAsWritten());
+  Args[FormatArgOffset]->IgnoreUnlessSpelledInSource());
 
-  if (!FormatExpr || !FormatExpr->isOrdinary()) {
-// Function must have a narrow string literal as its first argument.
-conversionNotPossible("first argument is not a narrow string literal");
-return;
-  }
+  assert(FormatExpr && FormatExpr->isOrdinary());
 
   if (const std::optional MaybeMacroName =
   formatStringContainsUnreplaceableMacro(Call, FormatExpr, SM, PP);

diff --git a/clang-tools-extra/docs/ReleaseNotes.rst b/clang-tools-extra/docs/ReleaseNotes.rst
index bde4ddec50ff3..cc77a422b97a6 100644
--- a/clang-tools-extra/docs/ReleaseNotes.rst
+++ b/clang-tools-extra/docs/ReleaseNotes.rst
@@ -124,6 +124,16 @@ Changes in existing checks
 - Improved :doc:`misc-header-include-cycle
   ` check performance.
 
+- Improved :doc:`modernize-use-std-format
+  ` check to correctly match
+  when the format string is converted to a different type by an implicit
+  constructor call.
+
+- Improved :doc:`modernize-use-std-print
+  ` check to correctly match
+  when the format string is converted to a different type by an implicit
+  constructor call.
+
 - Improved :doc:`portability-template-virtual-member-function
   ` check to
   avoid false positives on pure virtual member functions.

diff --git a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp
index 7da0bb02ad766..0f3458e61856a 100644
--- a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp
+++ b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp
@@ -2,7 +2,7 @@
 // RUN:   -std=c++20 %s modernize-use-std-format %t --  \
 // RUN:   -config="{CheckOptions: { \
 // RUN:  modernize-use-std-format.StrictMode: true, \
-// RUN:  modernize-use-std-format.StrFormatLikeFunctions: 
'::strprintf; mynamespace::strprintf2; bad_format_type_strprintf', \
+// RUN:  modernize-use-std-format.StrFormatLikeFunctions: 
'::strprintf; mynamespace::strprintf2; any_format_type_strprintf', \
 // RUN:  modernize-use-std-format.ReplacementFormatFunction: 
'fmt::format', \
 // RUN:  modernize-use-std-format.FormatHeader: '' \
 // RUN:}}"  \
@@ -10,7 +10,7 @@
 // RUN: %check_clang_tidy -check-suffixes=,NOTSTRICT\
 // RUN:   -std=c++20 %s modernize-use-std-format %t --  \
 // RUN:   -config="{CheckOptions: { \
-// RUN:  modernize-use-std-format.StrFormatLikeFunctions: 
'::strprintf; mynamespace::strprintf2; bad_format_type_strprintf', \
+// RUN:  modernize-use-std-format.StrFormatLikeFunctions: 
'::strprintf; mynamespace::strprintf2; any_format_type_strprintf', \
 // RUN:  modernize-use-std-format.ReplacementFormatFunction: 
'fmt::format', \
 // RUN:  modernize-use-std-format.FormatHeader: '' \
 // RUN:}}"  \
@@ -56,12 +56,17 @@ std::string A(const std::string &in)
 struct S {
   S(...);
 };
-std::string bad_format_type_strprintf(const S &, ...);
+std::string any_format_type_strprintf(const S &, ...);
 
-std::string unsupported

[llvm-branch-commits] [llvm] Fix FileCheck prefix in the histogram test. (PR #150506)

2025-07-24 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-pgo

Author: Snehasish Kumar (snehasish)


Changes

The test is fine, though it seems the checks weren't being enforced because of 
the typo.
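
To illustrate why a misspelled prefix goes unnoticed: a FileCheck-style tool only acts on lines that contain a recognized prefix, so a line beginning with `CHEC-NEXT` is never treated as a directive and is silently ignored. The sketch below is a deliberately simplified model of that matching, not FileCheck's real parser; `isDirective` and `countDirectives` are hypothetical helpers, and only the plain and `-NEXT` forms are modelled.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true if Line contains Prefix immediately followed by ':' or by
// "-NEXT:". Other directive suffixes are intentionally not modelled.
inline bool isDirective(const std::string &Line, const std::string &Prefix) {
  std::size_t Pos = Line.find(Prefix);
  while (Pos != std::string::npos) {
    std::size_t End = Pos + Prefix.size();
    if (Line.compare(End, 1, ":") == 0 || Line.compare(End, 6, "-NEXT:") == 0)
      return true;
    Pos = Line.find(Prefix, Pos + 1);
  }
  return false;
}

// Counts the lines of Text that would be treated as directives for Prefix.
inline int countDirectives(const std::string &Text, const std::string &Prefix) {
  assert(!Prefix.empty() && "prefix must not be empty");
  std::istringstream In(Text);
  std::string Line;
  int Count = 0;
  while (std::getline(In, Line))
    if (isDirective(Line, Prefix))
      ++Count;
  return Count;
}
```

In this model, a file whose lines start with `CHECK:`, `CHEC-NEXT:`, and `CHECK-NEXT:` yields two directives for the prefix `CHECK`; the misspelled line contributes nothing and so can never fail.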

---
Full diff: https://github.com/llvm/llvm-project/pull/150506.diff


1 Files Affected:

- (modified) llvm/test/tools/llvm-profdata/memprof-padding-histogram.test 
(+76-76) 


``diff
diff --git a/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test 
b/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test
index 79521f3aceb6d..2d0346e7cb259 100644
--- a/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test
+++ b/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test
@@ -21,79 +21,79 @@ CHECK-NEXT: Offset: 0x{{[[:xdigit:]]+}}
 CHECK-NEXT:   -
 
 CHECK:   Records:
-CHEC-NEXTFunctionGUID: {{[0-9]+}}
-CHEC-NEXTAllocSites:
-CHEC-NEXT-
-CHEC-NEXT  Callstack:
-CHEC-NEXT  -
-CHEC-NEXTFunction: {{[0-9]+}}
-CHEC-NEXTSymbolName: main
-CHEC-NEXTLineOffset: 3
-CHEC-NEXTColumn: 10
-CHEC-NEXTInline: 0
-CHEC-NEXT  MemInfoBlock:
-CHEC-NEXTAllocCount: 1
-CHEC-NEXTTotalAccessCount: 5
-CHEC-NEXTMinAccessCount: 5
-CHEC-NEXTMaxAccessCount: 5
-CHEC-NEXTTotalSize: 24
-CHEC-NEXTMinSize: 24
-CHEC-NEXTMaxSize: 24
-CHEC-NEXTAllocTimestamp: {{[0-9]+}}
-CHEC-NEXTDeallocTimestamp: {{[0-9]+}}
-CHEC-NEXTTotalLifetime: 0
-CHEC-NEXTMinLifetime: 0
-CHEC-NEXTMaxLifetime: 0
-CHEC-NEXTAllocCpuId: 11
-CHEC-NEXTDeallocCpuId: 11
-CHEC-NEXTNumMigratedCpu: 0
-CHEC-NEXTNumLifetimeOverlaps: 0
-CHEC-NEXTNumSameAllocCpu: 0
-CHEC-NEXTNumSameDeallocCpu: 0
-CHEC-NEXTDataTypeId: 0
-CHEC-NEXTTotalAccessDensity: 20
-CHEC-NEXTMinAccessDensity: 20
-CHEC-NEXTMaxAccessDensity: 20
-CHEC-NEXTTotalLifetimeAccessDensity: 2
-CHEC-NEXTMinLifetimeAccessDensity: 2
-CHEC-NEXTMaxLifetimeAccessDensity: 2
-CHEC-NEXTAccessHistogramSize: 3
-CHEC-NEXTAccessHistogram: {{[0-9]+}}
-CHEC-NEXTAccessHistogramValues: -2 -1 -2
-CHEC-NEXT-
-CHEC-NEXT  Callstack:
-CHEC-NEXT  -
-CHEC-NEXTFunction: {{[0-9]+}}
-CHEC-NEXTSymbolName: main
-CHEC-NEXTLineOffset: 10
-CHEC-NEXTColumn: 10
-CHEC-NEXTInline: 0
-CHEC-NEXT  MemInfoBlock:
-CHEC-NEXTAllocCount: 1
-CHEC-NEXTTotalAccessCount: 4
-CHEC-NEXTMinAccessCount: 4
-CHEC-NEXTMaxAccessCount: 4
-CHEC-NEXTTotalSize: 48
-CHEC-NEXTMinSize: 48
-CHEC-NEXTMaxSize: 48
-CHEC-NEXTAllocTimestamp: {{[0-9]+}}
-CHEC-NEXTDeallocTimestamp: {{[0-9]+}}
-CHEC-NEXTTotalLifetime: 0
-CHEC-NEXTMinLifetime: 0
-CHEC-NEXTMaxLifetime: 0
-CHEC-NEXTAllocCpuId: 11
-CHEC-NEXTDeallocCpuId: 11
-CHEC-NEXTNumMigratedCpu: 0
-CHEC-NEXTNumLifetimeOverlaps: 0
-CHEC-NEXTNumSameAllocCpu: 0
-CHEC-NEXTNumSameDeallocCpu: 0
-CHEC-NEXTDataTypeId: 0
-CHEC-NEXTTotalAccessDensity: 8
-CHEC-NEXTMinAccessDensity: 8
-CHEC-NEXTMaxAccessDensity: 8
-CHEC-NEXTTotalLifetimeAccessDensity: 8000
-CHEC-NEXTMinLifetimeAccessDensity: 8000
-CHEC-NEXTMaxLifetimeAccessDensity: 8000
-CHEC-NEXTAccessHistogramSize: 6
-CHEC-NEXTAccessHistogram: {{[0-9]+}}
-CHEC-NEXTAccessHistogramValues: -2 -0 -0 -0 -1 -1
+CHECK-NEXTFunctionGUID: {{[0-9]+}}
+CHECK-NEXTAllocSites:
+CHECK-NEXT-
+CHECK-NEXT  Callstack:
+CHECK-NEXT  -
+CHECK-NEXTFunction: {{[0-9]+}}
+CHECK-NEXTSymbolName: main
+CHECK-NEXTLineOffset: 3
+CHECK-NEXTColumn: 10
+CHECK-NEXTInline: 0
+CHECK-NEXT  MemInfoBlock:
+CHECK-NEXTAllocCount: 1
+CHECK-NEXTTotalAccessCount: 5
+CHECK-NEXTMinAccessCount: 5
+CHECK-NEXTMaxAccessCount: 5
+CHECK-NEXTTotalSize: 24
+CHECK-NEXTMinSize: 24
+CHECK-NEXTMaxSize: 24
+CHECK-NEXTAllocTimestamp: {{[0-9]+}}
+CHECK-NEXTDeallocTimestamp: {{[0-9]+}}
+CHECK-NEXTTotalLifetime: 0
+CHECK-NEXTMinLifetime: 0
+CHECK-NEXTMaxLifetime: 0
+CHECK-NEXTAllocCpuId: 11
+CHECK-NEXTDeallocCpuId: 11
+CHECK-NEXTNumMigratedCpu: 0
+CHECK-NEXTNumLifetimeOverlaps: 0
+CHECK-NEXTNumSameAllocCpu: 0
+CHECK-NEXTNumSameDeallocCpu: 0
+CHECK-NEXTDataTypeId: 0
+CHECK-NEXTTotalAccessDensity: 20
+CHECK-NEXTMinAccessDensity: 20
+CHECK-NEXTMaxAccessDensity: 20
+CHECK-NEXTTotalLifetimeAccessDensity: 2
+CHECK-NEXTMinLifetimeAccessDensity: 2
+CHECK-NEXTMaxLifetimeAccessDensity: 2
+CHECK-NEXTAccessHistogramSize: 3
+CHECK-NEXTAccessHistogram: {{[0-9]+}}
+CHECK-NEXTAccessHistogramValues: -2 -1 -2
+CHECK-NE

[llvm-branch-commits] [llvm] [BOLT] Require CFG in BAT mode (PR #150488)

2025-07-24 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/150488
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopPeel] Fix branch weights' effect on block frequencies (PR #128785)

2025-07-24 Thread Joel E. Denny via llvm-branch-commits

https://github.com/jdenny-ornl edited 
https://github.com/llvm/llvm-project/pull/128785


[llvm-branch-commits] [clang-tools-extra] [clang-doc] generate comments for functions (PR #150468)

2025-07-24 Thread Erick Velez via llvm-branch-commits

https://github.com/evelez7 updated 
https://github.com/llvm/llvm-project/pull/150468

>From b388252f5857e5004cfd26ab05037f13df66657b Mon Sep 17 00:00:00 2001
From: Erick Velez 
Date: Fri, 18 Jul 2025 13:03:07 -0700
Subject: [PATCH] [clang-doc] generate comments for functions

Change the function partial to enable comments to be generated for
functions. This only enables the brief comments in the basic project.
---
 .../assets/function-template.mustache |   4 +-
 .../clang-doc/basic-project.mustache.test | 302 +-
 2 files changed, 153 insertions(+), 153 deletions(-)

diff --git a/clang-tools-extra/clang-doc/assets/function-template.mustache 
b/clang-tools-extra/clang-doc/assets/function-template.mustache
index 6683afa03ea43..2510a4de2cd68 100644
--- a/clang-tools-extra/clang-doc/assets/function-template.mustache
+++ b/clang-tools-extra/clang-doc/assets/function-template.mustache
@@ -14,10 +14,10 @@
 
 
 {{! Function Comments }}
-{{#FunctionComments}}
+{{#Description}}
 
 {{>Comments}}
 
-{{/FunctionComments}}
+{{/Description}}
 
 
diff --git a/clang-tools-extra/test/clang-doc/basic-project.mustache.test 
b/clang-tools-extra/test/clang-doc/basic-project.mustache.test
index 7cc32b9d8f08a..4cf8bad32fd9d 100644
--- a/clang-tools-extra/test/clang-doc/basic-project.mustache.test
+++ b/clang-tools-extra/test/clang-doc/basic-project.mustache.test
@@ -83,17 +83,17 @@ HTML-SHAPE: 
 HTML-SHAPE: double area ()
 HTML-SHAPE: 
 HTML-SHAPE: 
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT: Calculates the area of the 
shape.
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE: Calculates the area of the 
shape.
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
 HTML-SHAPE: 
 HTML-SHAPE: 
 HTML-SHAPE: 
@@ -103,17 +103,17 @@ HTML-SHAPE: 
 HTML-SHAPE: double perimeter ()
 HTML-SHAPE: 
 HTML-SHAPE: 
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT: Calculates the perimeter of the 
shape.
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE: Calculates the perimeter of the 
shape.
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
 HTML-SHAPE: 
 HTML-SHAPE: 
 HTML-SHAPE: 
@@ -123,14 +123,14 @@ HTML-SHAPE: 
 HTML-SHAPE: void ~Shape ()
 HTML-SHAPE: 
 HTML-SHAPE: 
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT: Virtual destructor.
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
-HTML-SHAPE-NOT:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE: Virtual destructor.
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
+HTML-SHAPE:
 HTML-SHAPE: 
 HTML-SHAPE: 
 HTML-SHAPE: 
@@ -250,17 +250,17 @@ HTML-CALC: 
 HTML-CALC: int add (int a, int b)
 HTML-CALC: 
 HTML-CALC: 
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT: Adds two integers.
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT: 

[llvm-branch-commits] [clang-tools-extra] [clang-doc] Precommit param comment test changes (PR #150469)

2025-07-24 Thread Erick Velez via llvm-branch-commits

https://github.com/evelez7 updated 
https://github.com/llvm/llvm-project/pull/150469

>From 6f213799caf42bb3ba0c00822cef55a2e2948cb4 Mon Sep 17 00:00:00 2001
From: Erick Velez 
Date: Tue, 22 Jul 2025 21:49:57 -0700
Subject: [PATCH] [clang-doc] Precommit param comment test changes

---
 .../clang-doc/basic-project.mustache.test | 92 ++-
 1 file changed, 90 insertions(+), 2 deletions(-)

diff --git a/clang-tools-extra/test/clang-doc/basic-project.mustache.test 
b/clang-tools-extra/test/clang-doc/basic-project.mustache.test
index 4cf8bad32fd9d..807ba1319e393 100644
--- a/clang-tools-extra/test/clang-doc/basic-project.mustache.test
+++ b/clang-tools-extra/test/clang-doc/basic-project.mustache.test
@@ -259,7 +259,24 @@ HTML-CALC:
 HTML-CALC:
 HTML-CALC:
 HTML-CALC:
-HTML-CALC:
+HTML-CALC-NOT:
+HTML-CALC-NOT:Parameters
+HTML-CALC-NOT:
+HTML-CALC-NOT:a  
+HTML-CALC-NOT: First integer.
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:b  
+HTML-CALC-NOT: Second integer.
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
 HTML-CALC:
 HTML-CALC: 
 HTML-CALC: 
@@ -299,7 +316,24 @@ HTML-CALC:
 HTML-CALC:
 HTML-CALC:
 HTML-CALC:
-HTML-CALC:
+HTML-CALC-NOT:
+HTML-CALC-NOT:Parameters
+HTML-CALC-NOT:
+HTML-CALC-NOT:a  
+HTML-CALC-NOT: First integer.
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:b  
+HTML-CALC-NOT: Second integer.
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
 HTML-CALC:
 HTML-CALC: 
 HTML-CALC: 
@@ -319,6 +353,23 @@ HTML-CALC:
 HTML-CALC:
 HTML-CALC:
 HTML-CALC:
+HTML-CALC-NOT:
+HTML-CALC-NOT:Parameters
+HTML-CALC-NOT:
+HTML-CALC-NOT:a  
+HTML-CALC-NOT: First integer.
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:b  
+HTML-CALC-NOT: Second integer.
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
 HTML-CALC:
 HTML-CALC:
 HTML-CALC: 
@@ -339,6 +390,23 @@ HTML-CALC:
 HTML-CALC:
 HTML-CALC:
 HTML-CALC:
+HTML-CALC-NOT:
+HTML-CALC-NOT:Parameters
+HTML-CALC-NOT:
+HTML-CALC-NOT:a  
+HTML-CALC-NOT: First integer.
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:b  
+HTML-CALC-NOT: Second integer.
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
+HTML-CALC-NOT:
 HTML-CALC:
 HTML-CALC:
 HTML-CALC: 
@@ -438,6 +506,20 @@ HTML-RECTANGLE:
 HTML-RECTANGLE:
 HTML-RECTANGLE:
 HTML-RECTANGLE:
+HTML-RECTANGLE-NOT:
+HTML-RECTANGLE-NOT:Parameters
+HTML-RECTANGLE-NOT:
+HTML-RECTANGLE-NOT:width  
+HTML-RECTANGLE-NOT: Width of the rectan

[llvm-branch-commits] [clang-tools-extra] [clang-doc] add param comments to comment template (PR #150470)

2025-07-24 Thread Erick Velez via llvm-branch-commits

https://github.com/evelez7 updated 
https://github.com/llvm/llvm-project/pull/150470

>From 98172493abfb2c93caefe2424dd17b93d32c17a0 Mon Sep 17 00:00:00 2001
From: Erick Velez 
Date: Tue, 22 Jul 2025 21:15:36 -0700
Subject: [PATCH] [clang-doc] add param comments to comment template

---
 clang-tools-extra/clang-doc/JSONGenerator.cpp |   6 +-
 .../assets/comment-template.mustache  |   8 +
 .../clang-doc/basic-project.mustache.test | 180 +-
 3 files changed, 102 insertions(+), 92 deletions(-)

diff --git a/clang-tools-extra/clang-doc/JSONGenerator.cpp 
b/clang-tools-extra/clang-doc/JSONGenerator.cpp
index 92a4117c4e534..5fc28406ee870 100644
--- a/clang-tools-extra/clang-doc/JSONGenerator.cpp
+++ b/clang-tools-extra/clang-doc/JSONGenerator.cpp
@@ -147,8 +147,10 @@ static Object serializeComment(const CommentInfo &I, 
Object &Description) {
 Child.insert({"ParamName", I.ParamName});
 Child.insert({"Direction", I.Direction});
 Child.insert({"Explicit", I.Explicit});
-Child.insert({"Children", ChildArr});
-Obj.insert({commentKindToString(I.Kind), ChildVal});
+auto TextCommentsArray = extractTextComments(CARef.front().getAsObject());
+Child.insert({"Children", TextCommentsArray});
+if (I.Kind == CommentKind::CK_ParamCommandComment)
+  insertComment(Description, ChildVal, "ParamComments");
 return Obj;
   }
 
diff --git a/clang-tools-extra/clang-doc/assets/comment-template.mustache 
b/clang-tools-extra/clang-doc/assets/comment-template.mustache
index f2edb1b2eb9ac..d55a53194ee5c 100644
--- a/clang-tools-extra/clang-doc/assets/comment-template.mustache
+++ b/clang-tools-extra/clang-doc/assets/comment-template.mustache
@@ -24,6 +24,14 @@
 {{>Comments}}
 {{/Children}}
 {{/ParagraphComment}}
+{{#HasParamComments}}
+Parameters
+{{#ParamComments}}
+
+{{ParamName}} {{#Explicit}}{{Direction}}{{/Explicit}} 
{{#Children}}{{>Comments}}{{/Children}}
+ 
+{{/ParamComments}}
+{{/HasParamComments}}
 {{#BlockCommandComment}}
 
 
diff --git a/clang-tools-extra/test/clang-doc/basic-project.mustache.test 
b/clang-tools-extra/test/clang-doc/basic-project.mustache.test
index 807ba1319e393..b55e0abe2cdef 100644
--- a/clang-tools-extra/test/clang-doc/basic-project.mustache.test
+++ b/clang-tools-extra/test/clang-doc/basic-project.mustache.test
@@ -259,24 +259,24 @@ HTML-CALC:
 HTML-CALC:
 HTML-CALC:
 HTML-CALC:
-HTML-CALC-NOT:
-HTML-CALC-NOT:Parameters
-HTML-CALC-NOT:
-HTML-CALC-NOT:a  
-HTML-CALC-NOT: First integer.
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:b  
-HTML-CALC-NOT: Second integer.
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
+HTML-CALC:
+HTML-CALC:Parameters
+HTML-CALC:
+HTML-CALC:a  
+HTML-CALC: First integer.
+HTML-CALC:
+HTML-CALC:
+HTML-CALC:
+HTML-CALC:
+HTML-CALC:
+HTML-CALC:
+HTML-CALC:b  
+HTML-CALC: Second integer.
+HTML-CALC:
+HTML-CALC:
+HTML-CALC:
+HTML-CALC:
+HTML-CALC:
 HTML-CALC:
 HTML-CALC: 
 HTML-CALC: 
@@ -316,24 +316,24 @@ HTML-CALC:
 HTML-CALC:
 HTML-CALC:
 HTML-CALC:
-HTML-CALC-NOT:
-HTML-CALC-NOT:Parameters
-HTML-CALC-NOT:
-HTML-CALC-NOT:a  
-HTML-CALC-NOT: First integer.
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:b  
-HTML-CALC-NOT: Second integer.
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
-HTML-CALC-NOT:
+HTML-CALC:
+HTML-CALC:Parameters
+HTML-CALC:
+HTML-CALC:

[llvm-branch-commits] [llvm] release/21.x: [X86] getTargetConstantBitsFromNode - early-out if the element bitsize doesn't align with the source bitsize (#150184) (PR #150478)

2025-07-24 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

> @phoebewang What do you think about merging this PR to the release branch?

LGTM.

https://github.com/llvm/llvm-project/pull/150478


[llvm-branch-commits] [llvm] release/21.x: [X86] getTargetConstantBitsFromNode - early-out if the element bitsize doesn't align with the source bitsize (#150184) (PR #150478)

2025-07-24 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang approved this pull request.


https://github.com/llvm/llvm-project/pull/150478


[llvm-branch-commits] [llvm] release/21.x: [X86] Fix misassemble due to not storing registers to state machine on RParen (#150252) (PR #150402)

2025-07-24 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang approved this pull request.


https://github.com/llvm/llvm-project/pull/150402


[llvm-branch-commits] [llvm] Fix FileCheck prefix in the histogram test. (PR #150506)

2025-07-24 Thread Snehasish Kumar via llvm-branch-commits

https://github.com/snehasish created 
https://github.com/llvm/llvm-project/pull/150506

None

>From f57f3845aa1a6f03a605096e57e5345ebf3131b5 Mon Sep 17 00:00:00 2001
From: Snehasish Kumar 
Date: Thu, 24 Jul 2025 06:25:00 +
Subject: [PATCH] Fix FileCheck prefix in the histogram test.

---
 .../memprof-padding-histogram.test| 152 +-
 1 file changed, 76 insertions(+), 76 deletions(-)

diff --git a/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test 
b/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test
index 79521f3aceb6d..2d0346e7cb259 100644
--- a/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test
+++ b/llvm/test/tools/llvm-profdata/memprof-padding-histogram.test
@@ -21,79 +21,79 @@ CHECK-NEXT: Offset: 0x{{[[:xdigit:]]+}}
 CHECK-NEXT:   -
 
 CHECK:   Records:
-CHEC-NEXTFunctionGUID: {{[0-9]+}}
-CHEC-NEXTAllocSites:
-CHEC-NEXT-
-CHEC-NEXT  Callstack:
-CHEC-NEXT  -
-CHEC-NEXTFunction: {{[0-9]+}}
-CHEC-NEXTSymbolName: main
-CHEC-NEXTLineOffset: 3
-CHEC-NEXTColumn: 10
-CHEC-NEXTInline: 0
-CHEC-NEXT  MemInfoBlock:
-CHEC-NEXTAllocCount: 1
-CHEC-NEXTTotalAccessCount: 5
-CHEC-NEXTMinAccessCount: 5
-CHEC-NEXTMaxAccessCount: 5
-CHEC-NEXTTotalSize: 24
-CHEC-NEXTMinSize: 24
-CHEC-NEXTMaxSize: 24
-CHEC-NEXTAllocTimestamp: {{[0-9]+}}
-CHEC-NEXTDeallocTimestamp: {{[0-9]+}}
-CHEC-NEXTTotalLifetime: 0
-CHEC-NEXTMinLifetime: 0
-CHEC-NEXTMaxLifetime: 0
-CHEC-NEXTAllocCpuId: 11
-CHEC-NEXTDeallocCpuId: 11
-CHEC-NEXTNumMigratedCpu: 0
-CHEC-NEXTNumLifetimeOverlaps: 0
-CHEC-NEXTNumSameAllocCpu: 0
-CHEC-NEXTNumSameDeallocCpu: 0
-CHEC-NEXTDataTypeId: 0
-CHEC-NEXTTotalAccessDensity: 20
-CHEC-NEXTMinAccessDensity: 20
-CHEC-NEXTMaxAccessDensity: 20
-CHEC-NEXTTotalLifetimeAccessDensity: 2
-CHEC-NEXTMinLifetimeAccessDensity: 2
-CHEC-NEXTMaxLifetimeAccessDensity: 2
-CHEC-NEXTAccessHistogramSize: 3
-CHEC-NEXTAccessHistogram: {{[0-9]+}}
-CHEC-NEXTAccessHistogramValues: -2 -1 -2
-CHEC-NEXT-
-CHEC-NEXT  Callstack:
-CHEC-NEXT  -
-CHEC-NEXTFunction: {{[0-9]+}}
-CHEC-NEXTSymbolName: main
-CHEC-NEXTLineOffset: 10
-CHEC-NEXTColumn: 10
-CHEC-NEXTInline: 0
-CHEC-NEXT  MemInfoBlock:
-CHEC-NEXTAllocCount: 1
-CHEC-NEXTTotalAccessCount: 4
-CHEC-NEXTMinAccessCount: 4
-CHEC-NEXTMaxAccessCount: 4
-CHEC-NEXTTotalSize: 48
-CHEC-NEXTMinSize: 48
-CHEC-NEXTMaxSize: 48
-CHEC-NEXTAllocTimestamp: {{[0-9]+}}
-CHEC-NEXTDeallocTimestamp: {{[0-9]+}}
-CHEC-NEXTTotalLifetime: 0
-CHEC-NEXTMinLifetime: 0
-CHEC-NEXTMaxLifetime: 0
-CHEC-NEXTAllocCpuId: 11
-CHEC-NEXTDeallocCpuId: 11
-CHEC-NEXTNumMigratedCpu: 0
-CHEC-NEXTNumLifetimeOverlaps: 0
-CHEC-NEXTNumSameAllocCpu: 0
-CHEC-NEXTNumSameDeallocCpu: 0
-CHEC-NEXTDataTypeId: 0
-CHEC-NEXTTotalAccessDensity: 8
-CHEC-NEXTMinAccessDensity: 8
-CHEC-NEXTMaxAccessDensity: 8
-CHEC-NEXTTotalLifetimeAccessDensity: 8000
-CHEC-NEXTMinLifetimeAccessDensity: 8000
-CHEC-NEXTMaxLifetimeAccessDensity: 8000
-CHEC-NEXTAccessHistogramSize: 6
-CHEC-NEXTAccessHistogram: {{[0-9]+}}
-CHEC-NEXTAccessHistogramValues: -2 -0 -0 -0 -1 -1
+CHECK-NEXTFunctionGUID: {{[0-9]+}}
+CHECK-NEXTAllocSites:
+CHECK-NEXT-
+CHECK-NEXT  Callstack:
+CHECK-NEXT  -
+CHECK-NEXTFunction: {{[0-9]+}}
+CHECK-NEXTSymbolName: main
+CHECK-NEXTLineOffset: 3
+CHECK-NEXTColumn: 10
+CHECK-NEXTInline: 0
+CHECK-NEXT  MemInfoBlock:
+CHECK-NEXTAllocCount: 1
+CHECK-NEXTTotalAccessCount: 5
+CHECK-NEXTMinAccessCount: 5
+CHECK-NEXTMaxAccessCount: 5
+CHECK-NEXTTotalSize: 24
+CHECK-NEXTMinSize: 24
+CHECK-NEXTMaxSize: 24
+CHECK-NEXTAllocTimestamp: {{[0-9]+}}
+CHECK-NEXTDeallocTimestamp: {{[0-9]+}}
+CHECK-NEXTTotalLifetime: 0
+CHECK-NEXTMinLifetime: 0
+CHECK-NEXTMaxLifetime: 0
+CHECK-NEXTAllocCpuId: 11
+CHECK-NEXTDeallocCpuId: 11
+CHECK-NEXTNumMigratedCpu: 0
+CHECK-NEXTNumLifetimeOverlaps: 0
+CHECK-NEXTNumSameAllocCpu: 0
+CHECK-NEXTNumSameDeallocCpu: 0
+CHECK-NEXTDataTypeId: 0
+CHECK-NEXTTotalAccessDensity: 20
+CHECK-NEXTMinAccessDensity: 20
+CHECK-NEXTMaxAccessDensity: 20
+CHECK-NEXTTotalLifetimeAccessDensity: 2
+CHECK-NEXTMinLifetimeAccessDensity: 2
+CHECK-NEXTMaxLifetimeAccessDensity: 2
+CHECK-NEXTAccessHistogramSize: 3
+CHECK-NEXTAccessHistogram: {{[0-9]+}}
+CHECK-NEXT  

[llvm-branch-commits] [clang-tools-extra] 60bf979 - [clang-tidy] modernize-use-std-print, format: Fix checks with Abseil functions (#142312)

2025-07-24 Thread via llvm-branch-commits

Author: Mike Crowe
Date: 2025-07-24T22:40:41+03:00
New Revision: 60bf97983df3efeb17f6db19b811b68fa74df9aa

URL: 
https://github.com/llvm/llvm-project/commit/60bf97983df3efeb17f6db19b811b68fa74df9aa
DIFF: 
https://github.com/llvm/llvm-project/commit/60bf97983df3efeb17f6db19b811b68fa74df9aa.diff

LOG: [clang-tidy] modernize-use-std-print,format: Fix checks with Abseil 
functions (#142312)

These checks previously failed with absl::StrFormat and absl::PrintF
etc. with:

 Unable to use 'std::format' instead of 'StrFormat' because first
 argument is not a narrow string literal [modernize-use-std-format]

because FormatStringConverter was rejecting the format string if it had
already converted into a different type. Fix the tests so that they
check this case properly by accepting string_view rather than const char
* and fix the check so that these tests pass. Update the existing tests
that checked for the error message that can no longer happen.

Fixes: https://github.com/llvm/llvm-project/issues/129484
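
As a rough illustration of the call shape described above (not code from the patch or its tests), the sketch below passes a narrow string literal to a function whose parameter is `std::string_view`, so the literal is converted by an implicit constructor call before the function sees it. The names `strprintf_like` and `greet` are hypothetical, and the function is non-variadic only to keep the sketch small.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for a StrFormat-like function. Its first parameter
// is std::string_view rather than const char *, so callers that pass a
// string literal go through an implicit converting constructor.
std::string strprintf_like(std::string_view Format) {
  return std::string(Format);
}

std::string greet() {
  // As spelled in the source, the first argument is still a narrow string
  // literal; only the implicit conversion to std::string_view sits between
  // the literal and the call.
  return strprintf_like("hello");
}
```

Per the commit message, this is the situation in which the format string "had already converted into a different type"; the diff below replaces `IgnoreImplicitAsWritten()` with `IgnoreUnlessSpelledInSource()` when locating the literal.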

Added: 


Modified: 
clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
clang-tools-extra/docs/ReleaseNotes.rst

clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp
clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format.cpp
clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print-absl.cpp

clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-print-custom.cpp

Removed: 




diff  --git a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp 
b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
index 7f4ccca84faa5..e1c1bee97f6d4 100644
--- a/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
+++ b/clang-tools-extra/clang-tidy/utils/FormatStringConverter.cpp
@@ -207,13 +207,9 @@ FormatStringConverter::FormatStringConverter(
   ArgsOffset(FormatArgOffset + 1), LangOpts(LO) {
   assert(ArgsOffset <= NumArgs);
   FormatExpr = llvm::dyn_cast<StringLiteral>(
-  Args[FormatArgOffset]->IgnoreImplicitAsWritten());
+  Args[FormatArgOffset]->IgnoreUnlessSpelledInSource());
 
-  if (!FormatExpr || !FormatExpr->isOrdinary()) {
-// Function must have a narrow string literal as its first argument.
-conversionNotPossible("first argument is not a narrow string literal");
-return;
-  }
+  assert(FormatExpr && FormatExpr->isOrdinary());
 
   if (const std::optional<StringRef> MaybeMacroName =
   formatStringContainsUnreplaceableMacro(Call, FormatExpr, SM, PP);

diff  --git a/clang-tools-extra/docs/ReleaseNotes.rst 
b/clang-tools-extra/docs/ReleaseNotes.rst
index bde4ddec50ff3..cc77a422b97a6 100644
--- a/clang-tools-extra/docs/ReleaseNotes.rst
+++ b/clang-tools-extra/docs/ReleaseNotes.rst
@@ -124,6 +124,16 @@ Changes in existing checks
 - Improved :doc:`misc-header-include-cycle
   ` check performance.
 
+- Improved :doc:`modernize-use-std-format
+  ` check to correctly match
+  when the format string is converted to a different type by an implicit
+  constructor call.
+
+- Improved :doc:`modernize-use-std-print
+  ` check to correctly match
+  when the format string is converted to a different type by an implicit
+  constructor call.
+
 - Improved :doc:`portability-template-virtual-member-function
   ` check to
   avoid false positives on pure virtual member functions.

diff  --git 
a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp
 
b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp
index 7da0bb02ad766..0f3458e61856a 100644
--- 
a/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp
+++ 
b/clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-format-custom.cpp
@@ -2,7 +2,7 @@
 // RUN:   -std=c++20 %s modernize-use-std-format %t --  \
 // RUN:   -config="{CheckOptions: { \
 // RUN:  modernize-use-std-format.StrictMode: true, \
-// RUN:  modernize-use-std-format.StrFormatLikeFunctions: 
'::strprintf; mynamespace::strprintf2; bad_format_type_strprintf', \
+// RUN:  modernize-use-std-format.StrFormatLikeFunctions: 
'::strprintf; mynamespace::strprintf2; any_format_type_strprintf', \
 // RUN:  modernize-use-std-format.ReplacementFormatFunction: 
'fmt::format', \
 // RUN:  modernize-use-std-format.FormatHeader: '' \
 // RUN:}}"  \
@@ -10,7 +10,7 @@
 // RUN: %check_clang_tidy -check-suffixes=,NOTSTRICT\
 // RUN:   -std=c++20 %s modernize-use-std-format %t --  \
 // RUN:   -config="{CheckOptions: { \
-// RUN:  modernize-use-std-format.StrFormatLikeFunctions: 
'::strprintf; mynamespace::strprintf2; bad_format_type_strprintf', \
+// RUN:  modernize-use-std-format.S

[llvm-branch-commits] [clang-tools-extra] [clang-doc] generate comments for functions (PR #150468)

2025-07-24 Thread Erick Velez via llvm-branch-commits

https://github.com/evelez7 closed 
https://github.com/llvm/llvm-project/pull/150468


[llvm-branch-commits] [compiler-rt] [llvm] Write out raw profile bytes in little endian. (PR #150375)

2025-07-24 Thread Teresa Johnson via llvm-branch-commits


@@ -23,7 +20,16 @@ using ::llvm::memprof::encodeHistogramCount;
 
 namespace {
 template <class T> char *WriteBytes(const T &Pod, char *Buffer) {
-  *(T *)Buffer = Pod;
+  static_assert(is_trivially_copyable<T>::value, "T must be POD");
+  const uint8_t *Src = reinterpret_cast<const uint8_t *>(&Pod);
+  for (size_t I = 0; I < sizeof(T); ++I) {
+    Buffer[I] = Src[I];
+  }
+#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  for (size_t i = 0; i < sizeof(T) / 2; ++i) {
+    std::swap(buffer[i], buffer[sizeof(T) - 1 - i]);

teresajohnson wrote:

alternatively, copy in from Src above in the current direction if little 
endian, and in reverse order if big endian (rather than copy and swap in the BE 
case)?
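
A minimal sketch of that alternative, assuming `T` is a scalar integer type and that the function returns the advanced buffer pointer (the return statement is not visible in the quoted hunk). The name `WriteBytesLE` is hypothetical and this is not the code under review:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical variant: emit Pod's bytes in little-endian order in a single
// pass, choosing the copy direction from the host byte order instead of
// copying first and swapping afterwards.
template <class T> char *WriteBytesLE(const T &Pod, char *Buffer) {
  static_assert(std::is_trivially_copyable<T>::value,
                "T must be trivially copyable");
  const unsigned char *Src = reinterpret_cast<const unsigned char *>(&Pod);
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  // Big-endian host: read the source backwards so the least significant
  // byte is written first.
  for (size_t I = 0; I < sizeof(T); ++I)
    Buffer[I] = static_cast<char>(Src[sizeof(T) - 1 - I]);
#else
  // Little-endian host: the in-memory order is already the output order.
  for (size_t I = 0; I < sizeof(T); ++I)
    Buffer[I] = static_cast<char>(Src[I]);
#endif
  return Buffer + sizeof(T);
}
```

Reversing all bytes is only meaningful for single scalar values; a multi-field struct would need each field written separately.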

https://github.com/llvm/llvm-project/pull/150375


[llvm-branch-commits] [compiler-rt] [llvm] Write out raw profile bytes in little endian. (PR #150375)

2025-07-24 Thread Teresa Johnson via llvm-branch-commits


@@ -23,7 +20,16 @@ using ::llvm::memprof::encodeHistogramCount;
 
 namespace {
 template <class T> char *WriteBytes(const T &Pod, char *Buffer) {
-  *(T *)Buffer = Pod;
+  static_assert(is_trivially_copyable<T>::value, "T must be POD");
+  const uint8_t *Src = reinterpret_cast<const uint8_t *>(&Pod);
+  for (size_t I = 0; I < sizeof(T); ++I) {
+    Buffer[I] = Src[I];
+  }
+#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  for (size_t i = 0; i < sizeof(T) / 2; ++i) {
+    std::swap(buffer[i], buffer[sizeof(T) - 1 - i]);

teresajohnson wrote:

buffer should be Buffer?

https://github.com/llvm/llvm-project/pull/150375


[llvm-branch-commits] [clang-tools-extra] [clang-doc] add param comments to comment template (PR #150470)

2025-07-24 Thread Erick Velez via llvm-branch-commits

https://github.com/evelez7 closed 
https://github.com/llvm/llvm-project/pull/150470


[llvm-branch-commits] [clang-tools-extra] [clang-doc] add param comments to comment template (PR #150470)

2025-07-24 Thread Erick Velez via llvm-branch-commits

https://github.com/evelez7 reopened 
https://github.com/llvm/llvm-project/pull/150470


[llvm-branch-commits] [clang-tools-extra] [clang-doc] generate comments for functions (PR #150468)

2025-07-24 Thread Erick Velez via llvm-branch-commits

https://github.com/evelez7 closed 
https://github.com/llvm/llvm-project/pull/150468


[llvm-branch-commits] [clang-tools-extra] [clang-doc] add param comments to comment template (PR #150470)

2025-07-24 Thread Erick Velez via llvm-branch-commits

https://github.com/evelez7 closed 
https://github.com/llvm/llvm-project/pull/150470


[llvm-branch-commits] [clang-tools-extra] [clang-doc] Precommit param comment test changes (PR #150469)

2025-07-24 Thread Erick Velez via llvm-branch-commits

https://github.com/evelez7 closed 
https://github.com/llvm/llvm-project/pull/150469


[llvm-branch-commits] [clang-tools-extra] [clang-doc] generate comments for functions (PR #150468)

2025-07-24 Thread Erick Velez via llvm-branch-commits

https://github.com/evelez7 reopened 
https://github.com/llvm/llvm-project/pull/150468


[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)

2025-07-24 Thread Adam Siemieniuk via llvm-branch-commits

https://github.com/adam-smnk approved this pull request.

Looks good, great change 👍

https://github.com/llvm/llvm-project/pull/149624


[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)

2025-07-24 Thread Adam Siemieniuk via llvm-branch-commits


@@ -150,9 +150,15 @@ def Linalg_PackOp : Linalg_RelayoutOp<"pack", [
 
 `padding_value` specifies a padding value at the boundary on non-perfectly
 divisible dimensions. Padding is optional:
-- If absent, it is UB if the tile does not perfectly divide the dimension.
+- If absent, it is assumed that for all inner tiles,
+  `shape(source)[inner_dims_pos[i]] % inner_tiles[i] == 0`, i.e. all inner
+  tiles divide perfectly the corresponding outer dimension in the result
+  tensor.
 - If present, it will pad along high dimensions (high-padding) to make the
-  tile complete.
+  tile complete. Note that it is not allowed to have artificial padding that
+  is not strictly required by linalg.pack (i.e., padding past what is needed
+  to complete the last tile along each packed dimension). It is UB if extra
+  padding is requested.

adam-smnk wrote:

> Shouldn't that be verification error?

It's not possible to enforce that with dynamic source.
`UB` is more of "catch all" here and allows `linalg::lowerPack` to remain as is.

> restore UB for the previous point

It could remain there to reinforce the message. But no strong preference here.
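
For static shapes, the rule quoted above can be sketched as plain arithmetic. The helpers below are hypothetical, are not MLIR API, and assume positive sizes; with a dynamic source the sizes are unknown at verification time, which is why the case stays UB rather than a verifier error.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; sizes are assumed positive.
constexpr int64_t ceilDiv(int64_t A, int64_t B) { return (A + B - 1) / B; }

// Number of tiles of size Tile needed to cover a source dimension.
constexpr int64_t requiredOuterDim(int64_t SrcDim, int64_t Tile) {
  return ceilDiv(SrcDim, Tile);
}

// True when the packed outer dimension requests more tiles than are needed
// to complete the last tile, i.e. the "artificial padding" case.
constexpr bool hasArtificialPadding(int64_t SrcDim, int64_t Tile,
                                    int64_t PackedOuterDim) {
  return PackedOuterDim > requiredOuterDim(SrcDim, Tile);
}

// Condition assumed when padding_value is absent.
constexpr bool dividesPerfectly(int64_t SrcDim, int64_t Tile) {
  return SrcDim % Tile == 0;
}
```

For example, a source dimension of 13 with an inner tile of 8 needs two outer tiles; a packed type with three would carry padding beyond what completes the last tile.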

https://github.com/llvm/llvm-project/pull/149624


[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)

2025-07-24 Thread Adam Siemieniuk via llvm-branch-commits


@@ -4717,6 +4697,12 @@ static LogicalResult commonVerifierPackAndUnPackOp(OpTy packOrUnPack) {
     return op->emitError("mismatch in inner tile sizes specified and shaped of "
                          "tiled dimension in the packed type");
   }
+  if (failed(verifyCompatibleShape(expectedPackedType.getShape(),
+                                   packedType.getShape()))) {
+return op->emitError("expected ")
+   << expectedPackedType << " for the unpacked domain value, got "

adam-smnk wrote:

nit: I think it should be `packed domain` - result for `pack`, input for 
`unpack`

https://github.com/llvm/llvm-project/pull/149624


[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)

2025-07-24 Thread Adam Siemieniuk via llvm-branch-commits

https://github.com/adam-smnk edited 
https://github.com/llvm/llvm-project/pull/149624


[llvm-branch-commits] [CI] Test All Projects On Workflow Changes (PR #150250)

2025-07-24 Thread Aiden Grossman via llvm-branch-commits

https://github.com/boomanaiden154 updated 
https://github.com/llvm/llvm-project/pull/150250


[llvm-branch-commits] [CI] Run All Tests When Changing third-party (PR #150251)

2025-07-24 Thread Aiden Grossman via llvm-branch-commits

https://github.com/boomanaiden154 updated 
https://github.com/llvm/llvm-project/pull/150251


[llvm-branch-commits] [flang] release/21.x: [flang][OpenMP] Avoid analyzing assumed-size array bases (#150324) (PR #150411)

2025-07-24 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah approved this pull request.


https://github.com/llvm/llvm-project/pull/150411
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [CodeGen] Prevent register coalescer rematerialization based on target (PR #148430)

2025-07-24 Thread Tomer Shafir via llvm-branch-commits

https://github.com/tomershafir closed 
https://github.com/llvm/llvm-project/pull/148430
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [CodeGen] Add 2 subtarget hooks canLowerToZeroCycleReg[Move|Zeroing] (PR #148428)

2025-07-24 Thread Tomer Shafir via llvm-branch-commits

https://github.com/tomershafir closed 
https://github.com/llvm/llvm-project/pull/148428
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [CodeGen] Add 2 subtarget hooks canLowerToZeroCycleReg[Move|Zeroing] (PR #148428)

2025-07-24 Thread Tomer Shafir via llvm-branch-commits

tomershafir wrote:

Reverting to a single-commit patch for all of the changes, as the stacked 
PRs are hard to manage.

https://github.com/llvm/llvm-project/pull/148428
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [CodeGen] Add target hook shouldReMaterializeTrivialRegDef (PR #148429)

2025-07-24 Thread Tomer Shafir via llvm-branch-commits

tomershafir wrote:

Reverting to a single-commit patch for all of the changes, as the stacked 
PRs are hard to manage.

https://github.com/llvm/llvm-project/pull/148429
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [CodeGen] Add target hook shouldReMaterializeTrivialRegDef (PR #148429)

2025-07-24 Thread Tomer Shafir via llvm-branch-commits

https://github.com/tomershafir closed 
https://github.com/llvm/llvm-project/pull/148429
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [CodeGen] Prevent register coalescer rematerialization based on target (PR #148430)

2025-07-24 Thread Tomer Shafir via llvm-branch-commits

tomershafir wrote:

Reverting to a single-commit patch for all of the changes, as the stacked 
PRs are hard to manage.

https://github.com/llvm/llvm-project/pull/148430
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] AMDGPU: Handle rewriting non-tied MFMA to AGPR form (PR #149027)

2025-07-24 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

ping 

https://github.com/llvm/llvm-project/pull/149027
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU] wip: MIR pretty printing for S_WAITCNT_FENCE_soft (PR #150391)

2025-07-24 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Sameer Sahasrabuddhe (ssahasra)


Changes



---

Patch is 34.95 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/150391.diff


7 Files Affected:

- (modified) llvm/lib/CodeGen/MIRParser/MIParser.cpp (+10-15) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUMIRFormatter.cpp (+161) 
- (modified) llvm/lib/Target/AMDGPU/SIDefines.h (+6-2) 
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/memory-legalizer-atomic-fence.ll (+36-36) 
- (added) llvm/test/CodeGen/AMDGPU/fence-parameters.mir (+29) 
- (modified) llvm/test/CodeGen/AMDGPU/insert-waitcnts-fence-soft.mir (+9-9) 
- (modified) llvm/test/CodeGen/AMDGPU/memory-legalizer-local.mir (+12-12) 


``diff
diff --git a/llvm/lib/CodeGen/MIRParser/MIParser.cpp 
b/llvm/lib/CodeGen/MIRParser/MIParser.cpp
index 3a364d5ff0d20..c8ad286a87a35 100644
--- a/llvm/lib/CodeGen/MIRParser/MIParser.cpp
+++ b/llvm/lib/CodeGen/MIRParser/MIParser.cpp
@@ -1850,28 +1850,25 @@ bool MIParser::parseImmediateOperand(MachineOperand &Dest) {
   return false;
 }
 
+// The target mnemonic is an expression of the form:
+//
+// Dot(IntegerLiteral|Identifier|Dot)+
+//
+// We could be stricter like not terminating in a dot, but that's not important
+// where this is being used.
 bool MIParser::parseTargetImmMnemonic(const unsigned OpCode,
   const unsigned OpIdx,
   MachineOperand &Dest,
   const MIRFormatter &MF) {
   assert(Token.is(MIToken::dot));
   auto Loc = Token.location(); // record start position
-  size_t Len = 1;  // for "."
-  lex();
-
-  // Handle the case that mnemonic starts with number.
-  if (Token.is(MIToken::IntegerLiteral)) {
+  size_t Len = 0;
+  while (Token.is(MIToken::IntegerLiteral) || Token.is(MIToken::dot) ||
+ Token.is(MIToken::Identifier)) {
 Len += Token.range().size();
 lex();
   }
-
-  StringRef Src;
-  if (Token.is(MIToken::comma))
-Src = StringRef(Loc, Len);
-  else {
-assert(Token.is(MIToken::Identifier));
-Src = StringRef(Loc, Len + Token.stringValue().size());
-  }
+  StringRef Src(Loc, Len);
   int64_t Val;
   if (MF.parseImmMnemonic(OpCode, OpIdx, Src, Val,
   [this](StringRef::iterator Loc, const Twine &Msg)
@@ -1879,8 +1876,6 @@ bool MIParser::parseTargetImmMnemonic(const unsigned 
OpCode,
 return true;
 
   Dest = MachineOperand::CreateImm(Val);
-  if (!Token.is(MIToken::comma))
-lex();
   return false;
 }
 
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUMIRFormatter.cpp b/llvm/lib/Target/AMDGPU/AMDGPUMIRFormatter.cpp
index 75e3d8c426e73..f318d6ffc1bae 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUMIRFormatter.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUMIRFormatter.cpp
@@ -12,10 +12,135 @@
 //===----------------------------------------------------------------------===//
 
 #include "AMDGPUMIRFormatter.h"
+#include "SIDefines.h"
 #include "SIMachineFunctionInfo.h"
 
 using namespace llvm;
 
+bool parseAtomicOrdering(StringRef Src, unsigned &Order) {
+  Src.consume_front(".");
+  for (unsigned I = 0; I <= (unsigned)AtomicOrdering::LAST; ++I) {
+if (Src == toIRString((AtomicOrdering)I)) {
+  Order = I;
+  return true;
+}
+  }
+  Order = ~0u;
+  return false;
+}
+
+static const char *fmtScope(unsigned Scope) {
+  static const char *Names[] = {"none",  "singlethread", "wavefront",
+"workgroup", "agent","system"};
+  return Names[Scope];
+}
+
+bool parseAtomicScope(StringRef Src, unsigned &Scope) {
+  Src.consume_front(".");
+  for (unsigned I = 0;
+   I != (unsigned)AMDGPU::SIAtomicScope::NUM_SI_ATOMIC_SCOPES; ++I) {
+if (Src == fmtScope(I)) {
+  Scope = I;
+  return true;
+}
+  }
+  Scope = ~0u;
+  return false;
+}
+
+static const char *fmtAddrSpace(unsigned Space) {
+  static const char *Names[] = {"none","global", "lds",
+"scratch", "gds","other"};
+  return Names[Space];
+}
+
+bool parseOneAddrSpace(StringRef Src, unsigned &AddrSpace) {
+  if (Src == "none") {
+AddrSpace = (unsigned)AMDGPU::SIAtomicAddrSpace::NONE;
+return true;
+  }
+  if (Src == "flat") {
+AddrSpace = (unsigned)AMDGPU::SIAtomicAddrSpace::FLAT;
+return true;
+  }
+  if (Src == "atomic") {
+AddrSpace = (unsigned)AMDGPU::SIAtomicAddrSpace::ATOMIC;
+return true;
+  }
+  if (Src == "all") {
+AddrSpace = (unsigned)AMDGPU::SIAtomicAddrSpace::ALL;
+return true;
+  }
+  for (unsigned I = 1, A = 1; A <= (unsigned)AMDGPU::SIAtomicAddrSpace::LAST;
+   A <<= 1, ++I) {
+if (Src == fmtAddrSpace(I)) {
+  AddrSpace = A;
+  return true;
+}
+  }
+  AddrSpace = ~0u;
+  return false;
+}
+
+bool parseAddrSpace(StringRef Src, unsigned &AddrSpace) {
+  Src = Src.trim();
+  Src.consume_front(".");
+  while (!Src.empty()) {
+auto [First, Rest] = Src.split('.')

[llvm-branch-commits] [llvm] [AMDGPU] wip: MIR pretty printing for S_WAITCNT_FENCE_soft (PR #150391)

2025-07-24 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-globalisel

Author: Sameer Sahasrabuddhe (ssahasra)


[llvm-branch-commits] [llvm] [AMDGPU] wip: MIR pretty printing for S_WAITCNT_FENCE_soft (PR #150391)

2025-07-24 Thread Sameer Sahasrabuddhe via llvm-branch-commits

https://github.com/ssahasra edited 
https://github.com/llvm/llvm-project/pull/150391
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DTLTO] Add LLVM release note for LLVM 21 (PR #150171)

2025-07-24 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru approved this pull request.


https://github.com/llvm/llvm-project/pull/150171
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang] Fix a crash when equivalence and namelist statements are used (PR #150292)

2025-07-24 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

Who can review this?

https://github.com/llvm/llvm-project/pull/150292
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [clang][deps] Add a release note for fixing crashes in `clang-scan-deps`. (#149857) (PR #150329)

2025-07-24 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru approved this pull request.


https://github.com/llvm/llvm-project/pull/150329
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] release/21.x: [KeyInstr] Fix verifier check (#149043) (PR #149053)

2025-07-24 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

> As for the pre-merge on this one: the abi-compare bot thing seems cool, 
> though I don't think the reported failure is for this patch, I've not touched 
> any function signatures here

Yeah it's not correct until we made the RC1 release. If this branch is rebased 
it will pass. But I think it's fine enough to merge this at this point.

https://github.com/llvm/llvm-project/pull/149053
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] release/21.x: [lld] Add thunks for hexagon (#111217) (PR #149723)

2025-07-24 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/149723

>From 760616dcfde320a2653eab10c5c6a377d9c986c8 Mon Sep 17 00:00:00 2001
From: Brian Cain 
Date: Sun, 20 Jul 2025 11:46:31 -0500
Subject: [PATCH] [lld] Add thunks for hexagon (#111217)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Without thunks, programs will encounter link errors complaining that the
branch target is out of range. Thunks will extend the range of branch
targets, which is a critical need for large programs. Thunks provide
this flexibility at a cost of some modest code size increase.

When configured with the maximal feature set, the hexagon port of the
linux kernel would often encounter these limitations when linking with
`lld`.

The relocations which will be extended by thunks are:

* R_HEX_B22_PCREL, R_HEX_{G,L}D_PLT_B22_PCREL, R_HEX_PLT_B22_PCREL: ±8 MiB on
the baseline
* R_HEX_B15_PCREL: ±65,532 bytes
* R_HEX_B13_PCREL: ±16,380 bytes
* R_HEX_B9_PCREL: ±1,020 bytes
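
The ranges listed above follow from the width of each branch immediate, which
holds a word-scaled (offset / 4) signed value. A minimal sketch of that
arithmetic, assuming nothing beyond the field widths named by the relocations;
`fitsSigned` and `branchReaches` are hypothetical helpers written for
illustration (the patch itself uses `llvm::isInt`), not lld API:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::isInt<N>: true if X fits in an N-bit
// two's-complement signed field.
template <unsigned N> inline bool fitsSigned(int64_t X) {
  static_assert(N > 0 && N < 64, "field width out of range");
  const int64_t Limit = int64_t(1) << (N - 1);
  return X >= -Limit && X < Limit;
}

// Hypothetical helper: can a branch whose immediate is N bits wide reach Dst
// from Src? The byte offset is shifted right by 2 because the field encodes
// the offset in 4-byte words.
template <unsigned N> inline bool branchReaches(uint64_t Src, uint64_t Dst) {
  const int64_t Offset = static_cast<int64_t>(Dst - Src);
  return fitsSigned<N>(Offset >> 2);
}
```

Under these assumptions `branchReaches<15>(0, 65532)` holds while
`branchReaches<15>(0, 65536)` does not, which matches the 65,532-byte forward
limit quoted for R_HEX_B15_PCREL; a target outside the range is the case in
which a thunk would be needed.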

Fixes #149689

Co-authored-by: Alexey Karyakin 

-

Co-authored-by: Alexey Karyakin 
(cherry picked from commit b42f96bc057fd9e31572069b241ba130c21144e5)
---
 lld/ELF/Arch/Hexagon.cpp  |  47 +
 lld/ELF/Relocations.cpp   |  53 +++---
 lld/ELF/Thunks.cpp|  72 -
 lld/test/ELF/hexagon-jump-error.s |  32 --
 lld/test/ELF/hexagon-thunk-range-b22rel.s | 115 
 lld/test/ELF/hexagon-thunk-range-gdplt.s  |  95 +
 lld/test/ELF/hexagon-thunk-range-plt.s|  75 +
 lld/test/ELF/hexagon-thunks-packets.s | 122 ++
 lld/test/ELF/hexagon-thunks.s |  53 ++
 9 files changed, 618 insertions(+), 46 deletions(-)
 delete mode 100644 lld/test/ELF/hexagon-jump-error.s
 create mode 100644 lld/test/ELF/hexagon-thunk-range-b22rel.s
 create mode 100644 lld/test/ELF/hexagon-thunk-range-gdplt.s
 create mode 100644 lld/test/ELF/hexagon-thunk-range-plt.s
 create mode 100644 lld/test/ELF/hexagon-thunks-packets.s
 create mode 100644 lld/test/ELF/hexagon-thunks.s

diff --git a/lld/ELF/Arch/Hexagon.cpp b/lld/ELF/Arch/Hexagon.cpp
index 479131a24dcfc..9b33e78731c97 100644
--- a/lld/ELF/Arch/Hexagon.cpp
+++ b/lld/ELF/Arch/Hexagon.cpp
@@ -11,6 +11,7 @@
 #include "Symbols.h"
 #include "SyntheticSections.h"
 #include "Target.h"
+#include "Thunks.h"
 #include "lld/Common/ErrorHandler.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/BinaryFormat/ELF.h"
@@ -36,6 +37,10 @@ class Hexagon final : public TargetInfo {
  const uint8_t *loc) const override;
   RelType getDynRel(RelType type) const override;
   int64_t getImplicitAddend(const uint8_t *buf, RelType type) const override;
+  bool needsThunk(RelExpr expr, RelType type, const InputFile *file,
+  uint64_t branchAddr, const Symbol &s,
+  int64_t a) const override;
+  bool inBranchRange(RelType type, uint64_t src, uint64_t dst) const override;
   void relocate(uint8_t *loc, const Relocation &rel,
 uint64_t val) const override;
   void writePltHeader(uint8_t *buf) const override;
@@ -63,6 +68,8 @@ Hexagon::Hexagon(Ctx &ctx) : TargetInfo(ctx) {
   tlsGotRel = R_HEX_TPREL_32;
   tlsModuleIndexRel = R_HEX_DTPMOD_32;
   tlsOffsetRel = R_HEX_DTPREL_32;
+
+  needsThunks = true;
 }
 
 uint32_t Hexagon::calcEFlags() const {
@@ -258,6 +265,46 @@ static uint32_t findMaskR16(Ctx &ctx, uint32_t insn) {
 
 static void or32le(uint8_t *p, int32_t v) { write32le(p, read32le(p) | v); }
 
+bool Hexagon::inBranchRange(RelType type, uint64_t src, uint64_t dst) const {
+  int64_t offset = dst - src;
+  switch (type) {
+  case llvm::ELF::R_HEX_B22_PCREL:
+  case llvm::ELF::R_HEX_PLT_B22_PCREL:
+  case llvm::ELF::R_HEX_GD_PLT_B22_PCREL:
+  case llvm::ELF::R_HEX_LD_PLT_B22_PCREL:
+return llvm::isInt<22>(offset >> 2);
+  case llvm::ELF::R_HEX_B15_PCREL:
+return llvm::isInt<15>(offset >> 2);
+break;
+  case llvm::ELF::R_HEX_B13_PCREL:
+return llvm::isInt<13>(offset >> 2);
+break;
+  case llvm::ELF::R_HEX_B9_PCREL:
+return llvm::isInt<9>(offset >> 2);
+  default:
+return true;
+  }
+  llvm_unreachable("unsupported relocation");
+}
+
+bool Hexagon::needsThunk(RelExpr expr, RelType type, const InputFile *file,
+ uint64_t branchAddr, const Symbol &s,
+ int64_t a) const {
+  // Only check branch range for supported branch relocation types
+  switch (type) {
+  case R_HEX_B22_PCREL:
+  case R_HEX_PLT_B22_PCREL:
+  case R_HEX_GD_PLT_B22_PCREL:
+  case R_HEX_LD_PLT_B22_PCREL:
+  case R_HEX_B15_PCREL:
+  case R_HEX_B13_PCREL:
+  case R_HEX_B9_PCREL:
+return !ctx.target->inBranchRange(type, branchAddr, s.getVA(ctx, a));
+  default:
+return false;
+  }
+}
+
 void Hexagon::relocate(uint8_t *loc, const Relocation &rel,
uint64_t val) const {
  

[llvm-branch-commits] [llvm] Propagate Constants for Wave Reduction Intrinsics (PR #150395)

2025-07-24 Thread via llvm-branch-commits

easyonaadit wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack on 
> Graphite:
> https://app.graphite.dev/github/pr/llvm/llvm-project/150395?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#150395** 👈 (View in Graphite: 
https://app.graphite.dev/github/pr/llvm/llvm-project/150395?utm_source=stack-comment-view-in-graphite)
* **#150170** 
(https://app.graphite.dev/github/pr/llvm/llvm-project/150170?utm_source=stack-comment-icon)
* **#150169** 
(https://app.graphite.dev/github/pr/llvm/llvm-project/150169?utm_source=stack-comment-icon)
* `main`

This stack of pull requests is managed by Graphite 
(https://graphite.dev?utm-source=stack-comment). Learn more about stacking: 
https://stacking.dev/?utm_source=stack-comment


https://github.com/llvm/llvm-project/pull/150395
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] release/21.x: [compiler-rt][Mips] Fix stat size check on mips64 musl (#143301) (PR #149683)

2025-07-24 Thread via llvm-branch-commits

github-actions[bot] wrote:

@brad0 (or anyone else): if you would like to add a note about this fix to the 
release notes (completely optional), please reply to this comment with a one- 
or two-sentence description of the fix. When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/149683
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/21.x: [LV] Vectorize maxnum/minnum w/o fast-math flags. (#148239) (PR #149736)

2025-07-24 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/149736
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/21.x: [LV] Vectorize maxnum/minnum w/o fast-math flags. (#148239) (PR #149736)

2025-07-24 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

Closed in favor of #150193 

https://github.com/llvm/llvm-project/pull/149736
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/21.x: [LoongArch] Strengthen stack size estimation for LSX/LASX extension (#146455) (PR #149777)

2025-07-24 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/149777

>From 6dde08705669b8579694aee5b5c6acbb5bdbb492 Mon Sep 17 00:00:00 2001
From: tangaac 
Date: Fri, 18 Jul 2025 16:12:11 +0800
Subject: [PATCH] [LoongArch] Strengthen stack size estimation for LSX/LASX
 extension (#146455)

This patch adds an emergency spill slot for use when we run out of registers.
PR #139201 introduced `vstelm` instructions that take only an 8-bit immediate
offset, which left no spill slot in which to store the spilled registers.
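
The condition this patch adds can be sketched in isolation as follows. This is
a simplified illustration of the check shown in the diff below, assuming only
the 11-bit and 7-bit signed limits used there; `fitsSigned` and
`needsEmergencySpillSlot` are hypothetical names, not LLVM API:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::isInt<N>: true if X fits in an N-bit
// two's-complement signed field.
template <unsigned N> inline bool fitsSigned(int64_t X) {
  static_assert(N > 0 && N < 64, "field width out of range");
  const int64_t Limit = int64_t(1) << (N - 1);
  return X >= -Limit && X < Limit;
}

// Hypothetical helper mirroring the patched condition: reserve a scavenging
// slot when the estimated frame no longer fits an 11-bit signed field, or,
// with LSX enabled, when it no longer fits the 7-bit field that the patch
// treats as safe for [x]vstelm offsets.
inline bool needsEmergencySpillSlot(uint64_t EstimatedStackSize, bool HasLSX) {
  const int64_t Size = static_cast<int64_t>(EstimatedStackSize);
  if (!fitsSigned<11>(Size))
    return true;
  return HasLSX && !fitsSigned<7>(Size);
}
```

With these assumptions an 80-byte frame needs no slot without LSX but does
need one with LSX, since 80 exceeds the 7-bit signed maximum of 63.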

(cherry picked from commit 64a0478e08829ec6bcae2b05e154aa58c2c46ac0)
---
 .../LoongArch/LoongArchFrameLowering.cpp  |   7 +-
 .../CodeGen/LoongArch/calling-conv-common.ll  |  48 +--
 .../CodeGen/LoongArch/calling-conv-half.ll|  16 +-
 .../LoongArch/can-not-realign-stack.ll|  44 +--
 .../CodeGen/LoongArch/emergency-spill-slot.ll |   4 +-
 llvm/test/CodeGen/LoongArch/frame.ll  | 107 ++-
 .../CodeGen/LoongArch/intrinsic-memcpy.ll |   8 +-
 llvm/test/CodeGen/LoongArch/lasx/fpowi.ll |  88 +++---
 .../lasx/ir-instruction/extractelement.ll | 120 
 .../ir-instruction/insert-extract-element.ll  |  40 +--
 .../insert-extract-pair-elements.ll   |  40 +--
 .../lasx/ir-instruction/insertelement.ll  | 132 
 llvm/test/CodeGen/LoongArch/llvm.sincos.ll| 150 -
 llvm/test/CodeGen/LoongArch/lsx/pr146455.ll   | 287 ++
 ...realignment-with-variable-sized-objects.ll |  24 +-
 .../CodeGen/LoongArch/stack-realignment.ll|  80 ++---
 .../LoongArch/unaligned-memcpy-inline.ll  |  14 +-
 llvm/test/CodeGen/LoongArch/vararg.ll |  70 ++---
 18 files changed, 823 insertions(+), 456 deletions(-)
 create mode 100644 llvm/test/CodeGen/LoongArch/lsx/pr146455.ll

diff --git a/llvm/lib/Target/LoongArch/LoongArchFrameLowering.cpp b/llvm/lib/Target/LoongArch/LoongArchFrameLowering.cpp
index ac5e7f3891c72..1493bf4cba695 100644
--- a/llvm/lib/Target/LoongArch/LoongArchFrameLowering.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchFrameLowering.cpp
@@ -158,7 +158,12 @@ void LoongArchFrameLowering::processFunctionBeforeFrameFinalized(
   // estimateStackSize has been observed to under-estimate the final stack
   // size, so give ourselves wiggle-room by checking for stack size
   // representable an 11-bit signed field rather than 12-bits.
-  if (!isInt<11>(MFI.estimateStackSize(MF)))
+  // For [x]vstelm.{b/h/w/d} memory instructions with 8 imm offset, 7-bit
+  // signed field is fine.
+  unsigned EstimateStackSize = MFI.estimateStackSize(MF);
+  if (!isInt<11>(EstimateStackSize) ||
+      (MF.getSubtarget<LoongArchSubtarget>().hasExtLSX() &&
+       !isInt<7>(EstimateStackSize)))
     ScavSlotsNum = std::max(ScavSlotsNum, 1u);
 
   // For CFR spill.
diff --git a/llvm/test/CodeGen/LoongArch/calling-conv-common.ll b/llvm/test/CodeGen/LoongArch/calling-conv-common.ll
index d07e2914c753a..f7653af1fa9ba 100644
--- a/llvm/test/CodeGen/LoongArch/calling-conv-common.ll
+++ b/llvm/test/CodeGen/LoongArch/calling-conv-common.ll
@@ -122,23 +122,23 @@ define i64 @callee_large_scalars(i256 %a, i256 %b) nounwind {
 define i64 @caller_large_scalars() nounwind {
 ; CHECK-LABEL: caller_large_scalars:
 ; CHECK:   # %bb.0:
-; CHECK-NEXT:addi.d $sp, $sp, -80
-; CHECK-NEXT:st.d $ra, $sp, 72 # 8-byte Folded Spill
-; CHECK-NEXT:st.d $zero, $sp, 24
+; CHECK-NEXT:addi.d $sp, $sp, -96
+; CHECK-NEXT:st.d $ra, $sp, 88 # 8-byte Folded Spill
+; CHECK-NEXT:st.d $zero, $sp, 40
 ; CHECK-NEXT:vrepli.b $vr0, 0
-; CHECK-NEXT:vst $vr0, $sp, 8
+; CHECK-NEXT:vst $vr0, $sp, 24
 ; CHECK-NEXT:ori $a0, $zero, 2
-; CHECK-NEXT:st.d $a0, $sp, 0
-; CHECK-NEXT:st.d $zero, $sp, 56
-; CHECK-NEXT:vst $vr0, $sp, 40
+; CHECK-NEXT:st.d $a0, $sp, 16
+; CHECK-NEXT:st.d $zero, $sp, 72
+; CHECK-NEXT:vst $vr0, $sp, 56
 ; CHECK-NEXT:ori $a2, $zero, 1
-; CHECK-NEXT:addi.d $a0, $sp, 32
-; CHECK-NEXT:addi.d $a1, $sp, 0
-; CHECK-NEXT:st.d $a2, $sp, 32
+; CHECK-NEXT:addi.d $a0, $sp, 48
+; CHECK-NEXT:addi.d $a1, $sp, 16
+; CHECK-NEXT:st.d $a2, $sp, 48
 ; CHECK-NEXT:pcaddu18i $ra, %call36(callee_large_scalars)
 ; CHECK-NEXT:jirl $ra, $ra, 0
-; CHECK-NEXT:ld.d $ra, $sp, 72 # 8-byte Folded Reload
-; CHECK-NEXT:addi.d $sp, $sp, 80
+; CHECK-NEXT:ld.d $ra, $sp, 88 # 8-byte Folded Reload
+; CHECK-NEXT:addi.d $sp, $sp, 96
 ; CHECK-NEXT:ret
   %1 = call i64 @callee_large_scalars(i256 1, i256 2)
   ret i64 %1
@@ -177,20 +177,20 @@ define i64 @callee_large_scalars_exhausted_regs(i64 %a, i64 %b, i64 %c, i64 %d,
 define i64 @caller_large_scalars_exhausted_regs() nounwind {
 ; CHECK-LABEL: caller_large_scalars_exhausted_regs:
 ; CHECK:   # %bb.0:
-; CHECK-NEXT:addi.d $sp, $sp, -96
-; CHECK-NEXT:st.d $ra, $sp, 88 # 8-byte Folded Spill
-; CHECK-NEXT:addi.d $a0, $sp, 16
+; CHECK-NEXT:addi.d $sp, $sp, -112
+; CHECK-NEXT:st.d $ra, $sp, 104 # 8-byte Folded Spill
+; CHECK-NEXT:addi.d $a0, $sp, 32
 ; CHECK-NE

[llvm-branch-commits] [llvm] release/21.x: [LoongArch] Strengthen stack size estimation for LSX/LASX extension (#146455) (PR #149777)

2025-07-24 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/149777
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/21.x: [LoongArch] Strengthen stack size estimation for LSX/LASX extension (#146455) (PR #149777)

2025-07-24 Thread via llvm-branch-commits

github-actions[bot] wrote:

@tangaac (or anyone else): if you would like to add a note about this fix to 
the release notes (completely optional), please reply to this comment with a 
one- or two-sentence description of the fix. When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/149777
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/21.x: [AArch64, TTI] Disable RealUse check for vector insert/extract costs and Apple CPUs. (#146526) (PR #149815)

2025-07-24 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

Can you update the PR description to match reality? I can merge this after 
that.

https://github.com/llvm/llvm-project/pull/149815
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU] Propagate Constants for Wave Reduction Intrinsics (PR #150395)

2025-07-24 Thread via llvm-branch-commits

https://github.com/easyonaadit edited 
https://github.com/llvm/llvm-project/pull/150395
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/21.x: [MachinePipeliner] Fix incorrect dependency direction (#149436) (PR #149950)

2025-07-24 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

Can you rebase and squash this PR so that I don't merge a merge commit?

https://github.com/llvm/llvm-project/pull/149950
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] release/21.x: [Flang] Fix ASSIGN statement (#149941) (PR #150228)

2025-07-24 Thread via llvm-branch-commits

github-actions[bot] wrote:

@ceseo (or anyone else): if you would like to add a note about this fix to the 
release notes (completely optional), please reply to this comment with a one- 
or two-sentence description of the fix. When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/150228
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] release/21.x: [libc++] Fix hash_multi{map, set}::insert (#149290) (PR #149435)

2025-07-24 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/149435

>From 4a4071dc71d87357ea27e81bf46078e03ca9630e Mon Sep 17 00:00:00 2001
From: Nikolas Klauser 
Date: Thu, 17 Jul 2025 23:23:04 +0200
Subject: [PATCH] [libc++] Fix hash_multi{map,set}::insert (#149290)

(cherry picked from commit be3d614cc13f016b16634e18e10caed508d183d2)
---
 libcxx/include/ext/hash_map   |  4 +--
 libcxx/include/ext/hash_set   |  4 +--
 .../gnu/hash_multimap/insert.pass.cpp | 35 +++
 .../gnu/hash_multiset/insert.pass.cpp | 35 +++
 4 files changed, 74 insertions(+), 4 deletions(-)
 create mode 100644 libcxx/test/extensions/gnu/hash_multimap/insert.pass.cpp
 create mode 100644 libcxx/test/extensions/gnu/hash_multiset/insert.pass.cpp

diff --git a/libcxx/include/ext/hash_map b/libcxx/include/ext/hash_map
index d6b92204f4376..46815eaffa8bd 100644
--- a/libcxx/include/ext/hash_map
+++ b/libcxx/include/ext/hash_map
@@ -744,7 +744,7 @@ public:
   _LIBCPP_HIDE_FROM_ABI const_iterator begin() const { return __table_.begin(); }
   _LIBCPP_HIDE_FROM_ABI const_iterator end() const { return __table_.end(); }
 
-  _LIBCPP_HIDE_FROM_ABI iterator insert(const value_type& __x) { return __table_.__emplace_unique(__x); }
+  _LIBCPP_HIDE_FROM_ABI iterator insert(const value_type& __x) { return __table_.__emplace_multi(__x); }
   _LIBCPP_HIDE_FROM_ABI iterator insert(const_iterator, const value_type& __x) { return insert(__x); }
   template <class _InputIterator>
   _LIBCPP_HIDE_FROM_ABI void insert(_InputIterator __first, _InputIterator __last);
@@ -831,7 +831,7 @@ template 
 template <class _InputIterator>
 inline void hash_multimap<_Key, _Tp, _Hash, _Pred, _Alloc>::insert(_InputIterator __first, _InputIterator __last) {
   for (; __first != __last; ++__first)
-    __table_.__emplace_unique(*__first);
+    __table_.__emplace_multi(*__first);
 }
 
 template 
diff --git a/libcxx/include/ext/hash_set b/libcxx/include/ext/hash_set
index 7fd5df24ed3a8..62a7a0dbcffb9 100644
--- a/libcxx/include/ext/hash_set
+++ b/libcxx/include/ext/hash_set
@@ -458,7 +458,7 @@ public:
   _LIBCPP_HIDE_FROM_ABI const_iterator begin() const { return __table_.begin(); }
   _LIBCPP_HIDE_FROM_ABI const_iterator end() const { return __table_.end(); }
 
-  _LIBCPP_HIDE_FROM_ABI iterator insert(const value_type& __x) { return __table_.__emplace_unique(__x); }
+  _LIBCPP_HIDE_FROM_ABI iterator insert(const value_type& __x) { return __table_.__emplace_multi(__x); }
   _LIBCPP_HIDE_FROM_ABI iterator insert(const_iterator, const value_type& __x) { return insert(__x); }
   template <class _InputIterator>
   _LIBCPP_HIDE_FROM_ABI void insert(_InputIterator __first, _InputIterator __last);
@@ -543,7 +543,7 @@ template 
 template <class _InputIterator>
 inline void hash_multiset<_Value, _Hash, _Pred, _Alloc>::insert(_InputIterator __first, _InputIterator __last) {
   for (; __first != __last; ++__first)
-    __table_.__emplace_unique(*__first);
+    __table_.__emplace_multi(*__first);
 }
 
 template 
diff --git a/libcxx/test/extensions/gnu/hash_multimap/insert.pass.cpp b/libcxx/test/extensions/gnu/hash_multimap/insert.pass.cpp
new file mode 100644
index 0..ea80359f1fea2
--- /dev/null
+++ b/libcxx/test/extensions/gnu/hash_multimap/insert.pass.cpp
@@ -0,0 +1,35 @@
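+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//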
+
+// ADDITIONAL_COMPILE_FLAGS: -Wno-deprecated
+
+// hash_multimap::insert
+
+#include <cassert>
+#include <ext/hash_map>
+
+int main(int, char**) {
+  __gnu_cxx::hash_multimap<int, int> map;
+
+  map.insert(std::make_pair(1, 1));
+  map.insert(std::make_pair(1, 1));
+
+  assert(map.size() == 2);
+  assert(map.equal_range(1).first == map.begin());
+  assert(map.equal_range(1).second == map.end());
+
+  std::pair<int, int> arr[] = {std::make_pair(1, 1), std::make_pair(1, 1)};
+
+  map.insert(arr, arr + 2);
+
+  assert(map.size() == 4);
+  assert(map.equal_range(1).first == map.begin());
+  assert(map.equal_range(1).second == map.end());
+
+  return 0;
+}
diff --git a/libcxx/test/extensions/gnu/hash_multiset/insert.pass.cpp b/libcxx/test/extensions/gnu/hash_multiset/insert.pass.cpp
new file mode 100644
index 0..1a60cac158a40
--- /dev/null
+++ b/libcxx/test/extensions/gnu/hash_multiset/insert.pass.cpp
@@ -0,0 +1,35 @@
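+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//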
+
+// ADDITIONAL_COMPILE_FLAGS: -Wno-deprecated
+
+// hash_multimap
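
For readers skimming the patch, the behavioural difference between the two table calls can be illustrated with standard containers. This is only a sketch: `std::unordered_map` and `std::unordered_multimap` stand in for the unique and multi insertion paths, the function names are made up for illustration, and none of this is the libc++ internal `__table_` API.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>

// Stand-in for the pre-fix behaviour: a unique-key insert drops duplicates.
std::size_t sizeAfterUniqueInserts() {
  std::unordered_map<int, int> m;
  m.insert(std::make_pair(1, 1));
  m.insert(std::make_pair(1, 1)); // rejected: key 1 is already present
  return m.size();
}

// Stand-in for the post-fix behaviour: a multi insert keeps duplicates.
std::size_t sizeAfterMultiInserts() {
  std::unordered_multimap<int, int> m;
  m.insert(std::make_pair(1, 1));
  m.insert(std::make_pair(1, 1)); // kept: multimaps allow equal keys
  assert(m.count(1) == m.size()); // every element shares key 1
  return m.size();
}
```

The unique path leaves one element after two identical inserts, which is the symptom the new tests guard against with `assert(map.size() == 2)`.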

[llvm-branch-commits] [clang] [clang][deps] Add a release note for fixing crashes in `clang-scan-deps`. (#149857) (PR #150329)

2025-07-24 Thread via llvm-branch-commits

github-actions[bot] wrote:

@vsapsai (or anyone else): if you would like to add a note about this fix to
the release notes (completely optional), please reply to this comment with a
one- or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/150329


[llvm-branch-commits] [clang] [clang][deps] Add a release note for fixing crashes in `clang-scan-deps`. (#149857) (PR #150329)

2025-07-24 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/150329


[llvm-branch-commits] [flang] release/21.x: [flang][OpenMP] Restore reduction processor behavior broken by #145837 (#150178) (PR #150200)

2025-07-24 Thread via llvm-branch-commits

github-actions[bot] wrote:

@ergawy (or anyone else): if you would like to add a note about this fix to the
release notes (completely optional), please reply to this comment with a one- or
two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/150200


[llvm-branch-commits] [llvm] release/21.x: [X86] Fix misassemble due to not storing registers to state machine on RParen (#150252) (PR #150402)

2025-07-24 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mc

Author: None (llvmbot)


Changes

Backport a073cbbb1aeaaeac01b12e818fe47e4c04080aac

Requested by: @phoebewang

---
Full diff: https://github.com/llvm/llvm-project/pull/150402.diff


2 Files Affected:

- (modified) llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp (+25-3) 
- (added) llvm/test/MC/X86/intel-syntax-parentheses.s (+10) 


```diff
diff --git a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
index b642c1cfe383b..8213e512f45e1 100644
--- a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
+++ b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
@@ -1042,8 +1042,8 @@ class X86AsmParser : public MCTargetAsmParser {
   }
   PrevState = CurrState;
 }
-void onRParen() {
-  PrevState = State;
+bool onRParen(StringRef &ErrMsg) {
+  IntelExprState CurrState = State;
   switch (State) {
   default:
 State = IES_ERROR;
@@ -1054,9 +1054,27 @@ class X86AsmParser : public MCTargetAsmParser {
   case IES_RBRAC:
   case IES_RPAREN:
 State = IES_RPAREN;
+// In the case of a multiply, onRegister has already set IndexReg
+// directly, with appropriate scale.
+// Otherwise if we just saw a register it has only been stored in
+// TmpReg, so we need to store it into the state machine.
+if (CurrState == IES_REGISTER && PrevState != IES_MULTIPLY) {
+  // If we already have a BaseReg, then assume this is the IndexReg with
+  // no explicit scale.
+  if (!BaseReg) {
+BaseReg = TmpReg;
+  } else {
+if (IndexReg)
+  return regsUseUpError(ErrMsg);
+IndexReg = TmpReg;
+Scale = 0;
+  }
+}
 IC.pushOperator(IC_RPAREN);
 break;
   }
+  PrevState = CurrState;
+  return false;
 }
 bool onOffset(const MCExpr *Val, SMLoc OffsetLoc, StringRef ID,
   const InlineAsmIdentifierInfo &IDInfo,
@@ -2172,7 +2190,11 @@ bool X86AsmParser::ParseIntelExpression(IntelExprStateMachine &SM, SMLoc &End) {
   }
   break;
 case AsmToken::LParen:  SM.onLParen(); break;
-case AsmToken::RParen:  SM.onRParen(); break;
+case AsmToken::RParen:
+  if (SM.onRParen(ErrMsg)) {
+return Error(Tok.getLoc(), ErrMsg);
+  }
+  break;
 }
 if (SM.hadError())
   return Error(Tok.getLoc(), "unknown token in expression");
diff --git a/llvm/test/MC/X86/intel-syntax-parentheses.s b/llvm/test/MC/X86/intel-syntax-parentheses.s
new file mode 100644
index 0..ae53f64089070
--- /dev/null
+++ b/llvm/test/MC/X86/intel-syntax-parentheses.s
@@ -0,0 +1,10 @@
+// RUN: not llvm-mc -triple x86_64-unknown-unknown %s 2>&1 | FileCheck %s
+
+.intel_syntax
+
+// CHECK: error: invalid base+index expression
+lea rdi, [(label + rsi) + rip]
+// CHECK: leaq 1(%rax,%rdi), %rdi
+lea rdi, [(rax + rdi) + 1]
+label:
+.quad 42

```




https://github.com/llvm/llvm-project/pull/150402


[llvm-branch-commits] [llvm] release/21.x: [X86] Fix misassemble due to not storing registers to state machine on RParen (#150252) (PR #150402)

2025-07-24 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/150402

Backport a073cbbb1aeaaeac01b12e818fe47e4c04080aac

Requested by: @phoebewang

>From b9b8c95fea2cfa8848cdbd2418db41bfafa8706d Mon Sep 17 00:00:00 2001
From: circuit10 
Date: Thu, 24 Jul 2025 10:38:16 +0100
Subject: [PATCH] [X86] Fix misassemble due to not storing registers to state
 machine on RParen (#150252)

This fixes #116883.

The x86 parser saves any register it encounters to a TmpReg field in its
state machine, then on encountering the next valid token immediately
afterwards saves it to either BaseReg, or IndexReg if BaseReg was
already filled. However, this saving logic was missing on the RParen
token handler, causing the parser to "forget" the register immediately
beforehand. This also would prevent later validation logic from
detecting the addressing mode as invalid, leading to a silent
misassembly rather than an error.

(cherry picked from commit a073cbbb1aeaaeac01b12e818fe47e4c04080aac)
---
 .../lib/Target/X86/AsmParser/X86AsmParser.cpp | 28 +--
 llvm/test/MC/X86/intel-syntax-parentheses.s   | 10 +++
 2 files changed, 35 insertions(+), 3 deletions(-)
 create mode 100644 llvm/test/MC/X86/intel-syntax-parentheses.s

diff --git a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
index b642c1cfe383b..8213e512f45e1 100644
--- a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
+++ b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
@@ -1042,8 +1042,8 @@ class X86AsmParser : public MCTargetAsmParser {
   }
   PrevState = CurrState;
 }
-void onRParen() {
-  PrevState = State;
+bool onRParen(StringRef &ErrMsg) {
+  IntelExprState CurrState = State;
   switch (State) {
   default:
 State = IES_ERROR;
@@ -1054,9 +1054,27 @@ class X86AsmParser : public MCTargetAsmParser {
   case IES_RBRAC:
   case IES_RPAREN:
 State = IES_RPAREN;
+// In the case of a multiply, onRegister has already set IndexReg
+// directly, with appropriate scale.
+// Otherwise if we just saw a register it has only been stored in
+// TmpReg, so we need to store it into the state machine.
+if (CurrState == IES_REGISTER && PrevState != IES_MULTIPLY) {
+  // If we already have a BaseReg, then assume this is the IndexReg with
+  // no explicit scale.
+  if (!BaseReg) {
+BaseReg = TmpReg;
+  } else {
+if (IndexReg)
+  return regsUseUpError(ErrMsg);
+IndexReg = TmpReg;
+Scale = 0;
+  }
+}
 IC.pushOperator(IC_RPAREN);
 break;
   }
+  PrevState = CurrState;
+  return false;
 }
 bool onOffset(const MCExpr *Val, SMLoc OffsetLoc, StringRef ID,
   const InlineAsmIdentifierInfo &IDInfo,
@@ -2172,7 +2190,11 @@ bool X86AsmParser::ParseIntelExpression(IntelExprStateMachine &SM, SMLoc &End) {
   }
   break;
 case AsmToken::LParen:  SM.onLParen(); break;
-case AsmToken::RParen:  SM.onRParen(); break;
+case AsmToken::RParen:
+  if (SM.onRParen(ErrMsg)) {
+return Error(Tok.getLoc(), ErrMsg);
+  }
+  break;
 }
 if (SM.hadError())
   return Error(Tok.getLoc(), "unknown token in expression");
diff --git a/llvm/test/MC/X86/intel-syntax-parentheses.s b/llvm/test/MC/X86/intel-syntax-parentheses.s
new file mode 100644
index 0..ae53f64089070
--- /dev/null
+++ b/llvm/test/MC/X86/intel-syntax-parentheses.s
@@ -0,0 +1,10 @@
+// RUN: not llvm-mc -triple x86_64-unknown-unknown %s 2>&1 | FileCheck %s
+
+.intel_syntax
+
+// CHECK: error: invalid base+index expression
+lea rdi, [(label + rsi) + rip]
+// CHECK: leaq 1(%rax,%rdi), %rdi
+lea rdi, [(rax + rdi) + 1]
+label:
+.quad 42
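
The bookkeeping described in the commit message can be sketched with a much smaller model. Everything below is hypothetical and simplified: `RegState`, `flushTmpReg`, `onEnd`, `replay`, and the `FixApplied` flag are illustration-only names, registers are plain strings, and the real `IntelExprStateMachine` tracks many more states (scale, multiply, brackets) than this does.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the parser's register bookkeeping.
// An empty string means "no register recorded".
struct RegState {
  std::string BaseReg, IndexReg, TmpReg;
  bool PendingReg = false; // a register is parked in TmpReg

  void onRegister(const std::string &Reg) {
    TmpReg = Reg;
    PendingReg = true;
  }

  // Move a parked register into BaseReg, or IndexReg if BaseReg is taken.
  // Returns true on error (both slots already used), as in the patch.
  bool flushTmpReg() {
    if (!PendingReg)
      return false;
    assert(!TmpReg.empty() && "a parked register must have a name");
    PendingReg = false;
    if (BaseReg.empty()) {
      BaseReg = TmpReg;
      return false;
    }
    if (!IndexReg.empty())
      return true;
    IndexReg = TmpReg;
    return false;
  }

  bool onPlus() { return flushTmpReg(); }
  bool onEnd() { return flushTmpReg(); }

  // With FixApplied == false the parked register is dropped, which models
  // the behaviour the commit message describes as "forgetting" it.
  bool onRParen(bool FixApplied) {
    if (FixApplied)
      return flushTmpReg();
    PendingReg = false;
    return false;
  }
};

// Replays the register events of "(label + rsi) + rip".
RegState replay(bool FixApplied) {
  RegState S;
  S.onPlus();              // "label +": nothing parked yet
  S.onRegister("rsi");
  S.onRParen(FixApplied);  // ")"
  S.onPlus();
  S.onRegister("rip");
  S.onEnd();               // closing bracket
  return S;
}
```

In this model, `replay(true)` records `rsi` as the base and `rip` as the index, so a later check has both registers to inspect, while `replay(false)` ends with only `rip` recorded.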



[llvm-branch-commits] [llvm] release/21.x: [X86] Fix misassemble due to not storing registers to state machine on RParen (#150252) (PR #150402)

2025-07-24 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/150402


[llvm-branch-commits] [llvm] release/21.x: [X86] Fix misassemble due to not storing registers to state machine on RParen (#150252) (PR #150402)

2025-07-24 Thread via llvm-branch-commits

llvmbot wrote:

@phoebewang What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/150402


[llvm-branch-commits] [clang] release/21.x: [KeyInstr] Inline asm atoms (#149076) (PR #150056)

2025-07-24 Thread Jeremy Morse via llvm-branch-commits

https://github.com/jmorse approved this pull request.

LGTM, and completes key-instr related things in llvm21.

https://github.com/llvm/llvm-project/pull/150056


[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)

2025-07-24 Thread Andrzej Warzyński via llvm-branch-commits

https://github.com/banach-space edited 
https://github.com/llvm/llvm-project/pull/149624


[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)

2025-07-24 Thread Andrzej Warzyński via llvm-branch-commits




banach-space wrote:

Should we also update:
```
 - The following relationship for the tiled dimensions holds:
 `shape(result)[inner_dims_pos[i]] = shape(source)[inner_dims_pos[i]] / 
inner_tiles[i]`.
```
as
```
 - The following relationship for the tiled dimensions holds:
 `shape(result)[inner_dims_pos[i]] = shape(source)[inner_dims_pos[i]] ⌈/⌉ 
inner_tiles[i]` (⌈/⌉ - CeilDiv).
```
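
To see why the rounding matters, here is a small sketch of the arithmetic for a single tiled dimension. The helpers `ceilDiv` and `packedDim` are hypothetical names chosen for this example and are not MLIR APIs; static, positive sizes are assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: integer division rounded up.
// Assumes lhs >= 0 and rhs > 0.
constexpr int64_t ceilDiv(int64_t lhs, int64_t rhs) {
  return (lhs + rhs - 1) / rhs;
}

// Hypothetical helper: size of a tiled outer dimension of the result,
// i.e. shape(source)[inner_dims_pos[i]] CeilDiv inner_tiles[i].
int64_t packedDim(int64_t sourceDim, int64_t innerTile) {
  assert(sourceDim >= 0 && innerTile > 0 &&
         "sketch assumes static, positive sizes");
  return ceilDiv(sourceDim, innerTile);
}
```

For a source dimension of 17 and an inner tile of 8, truncating division would give 2, while `packedDim(17, 8)` gives 3; when the tile divides the dimension evenly, both forms agree.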

https://github.com/llvm/llvm-project/pull/149624


[llvm-branch-commits] [mlir] [mlir][linalg] Restrict linalg.pack to not have artificial padding. (PR #149624)

2025-07-24 Thread Han-Chung Wang via llvm-branch-commits

https://github.com/hanhanW closed 
https://github.com/llvm/llvm-project/pull/149624


[llvm-branch-commits] [llvm] [BOLT] Require CFG in BAT mode (PR #150488)

2025-07-24 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/150488

>From faf7d914093c87804e9dbca349b1a2bca0aefd18 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Thu, 24 Jul 2025 13:56:18 -0700
Subject: [PATCH] updated test

Created using spr 1.3.4
---
 bolt/test/X86/unclaimed-jt-entries.s | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/bolt/test/X86/unclaimed-jt-entries.s b/bolt/test/X86/unclaimed-jt-entries.s
index 1102e4ae413e2..b5c5abfbedebc 100644
--- a/bolt/test/X86/unclaimed-jt-entries.s
+++ b/bolt/test/X86/unclaimed-jt-entries.s
@@ -18,6 +18,16 @@
 
 # RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown %s -o %t.o
 # RUN: %clang %cflags -no-pie %t.o -o %t.exe -Wl,-q
+
+## Check that non-simple function profile is emitted in perf2bolt mode
+# RUN: link_fdata %s %t.exe %t.pa PREAGG
+# RUN: llvm-strip -N L5 -N L5_ret %t.exe
+# RUN: perf2bolt %t.exe -p %t.pa --pa -o %t.fdata -strict=0 -print-profile \
+# RUN:   -print-only=main | FileCheck %s --check-prefix=CHECK-P2B
+# CHECK-P2B: PERF2BOLT: traces mismatching disassembled function contents: 0
+# CHECK-P2B: Binary Function "main"
+# CHECK-P2B: IsSimple : 0
+
 # RUN: llvm-bolt %t.exe -v=1 -o %t.out 2>&1 | FileCheck %s
 
 # CHECK: BOLT-WARNING: unclaimed data to code reference (possibly an unrecognized jump table entry) to .Ltmp[[#]] in main
@@ -33,8 +43,10 @@
   .size main, .Lend-main
 main:
   jmp *L4-24(,%rdi,8)
-.L5:
+# PREAGG: T #main# #L5# #L5_ret# 1
+L5:
   movl $4, %eax
+L5_ret:
   ret
 .L9:
   movl $2, %eax
@@ -58,7 +70,7 @@ L4:
   .quad .L3
   .quad .L6
   .quad .L3
-  .quad .L5
+  .quad L5
   .quad .L3
   .quad .L3
   .quad .L3



[llvm-branch-commits] [BOLT] Require CFG in BAT mode (PR #150488)

2025-07-24 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/150488


[llvm-branch-commits] [flang] 32c9e86 - Revert "[flang][flang-driver][mlir][OpenMP] atomic control support (#143441)"

2025-07-24 Thread via llvm-branch-commits

Author: Kiran Chandramohan
Date: 2025-07-24T20:33:43+01:00
New Revision: 32c9e86d027efc84ba696a38ef626ae04d306ec0

URL: 
https://github.com/llvm/llvm-project/commit/32c9e86d027efc84ba696a38ef626ae04d306ec0
DIFF: 
https://github.com/llvm/llvm-project/commit/32c9e86d027efc84ba696a38ef626ae04d306ec0.diff

LOG: Revert "[flang][flang-driver][mlir][OpenMP] atomic control support (#143441)"

This reverts commit f44346dc1f6252716cfc62bb0687e3932a93089f.

Added: 


Modified: 
clang/include/clang/Driver/Options.td
flang/include/flang/Frontend/TargetOptions.h
flang/include/flang/Optimizer/Dialect/Support/FIRContext.h
flang/lib/Frontend/CompilerInvocation.cpp
flang/lib/Lower/Bridge.cpp
flang/lib/Lower/OpenMP/Atomic.cpp
flang/lib/Optimizer/Dialect/Support/FIRContext.cpp
mlir/include/mlir/Dialect/OpenMP/OpenMPAttrDefs.td
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
mlir/test/Dialect/OpenMP/ops.mlir

Removed: 
flang/test/Lower/OpenMP/atomic-control-options.f90



diff  --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index fa248381583cd..916400efdb449 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2320,21 +2320,21 @@ def fsymbol_partition_EQ : Joined<["-"], 
"fsymbol-partition=">, Group,
 
 defm atomic_remote_memory : BoolFOption<"atomic-remote-memory",
   LangOpts<"AtomicRemoteMemory">, DefaultFalse,
-  PosFlag,
-  NegFlag,
-  BothFlags<[], [ClangOption, FlangOption], " atomic operations on remote 
memory">>;
+  PosFlag,
+  NegFlag,
+  BothFlags<[], [ClangOption], " atomic operations on remote memory">>;
 
 defm atomic_fine_grained_memory : BoolFOption<"atomic-fine-grained-memory",
   LangOpts<"AtomicFineGrainedMemory">, DefaultFalse,
-  PosFlag,
-  NegFlag,
-  BothFlags<[], [ClangOption, FlangOption], " atomic operations on 
fine-grained memory">>;
+  PosFlag,
+  NegFlag,
+  BothFlags<[], [ClangOption], " atomic operations on fine-grained memory">>;
 
 defm atomic_ignore_denormal_mode : BoolFOption<"atomic-ignore-denormal-mode",
   LangOpts<"AtomicIgnoreDenormalMode">, DefaultFalse,
-  PosFlag,
-  NegFlag,
-  BothFlags<[], [ClangOption, FlangOption], " atomic operations to ignore 
denormal mode">>;
+  PosFlag,
+  NegFlag,
+  BothFlags<[], [ClangOption], " atomic operations to ignore denormal mode">>;
 
 defm memory_profile : OptInCC1FFlag<"memory-profile", "Enable", "Disable", " 
heap memory profiling">;
 def fmemory_profile_EQ : Joined<["-"], "fmemory-profile=">,
@@ -5360,9 +5360,9 @@ defm amdgpu_precise_memory_op
   " precise memory mode (AMDGPU only)">;
 
 def munsafe_fp_atomics : Flag<["-"], "munsafe-fp-atomics">,
-  Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, 
Alias;
+  Visibility<[ClangOption, CC1Option]>, Alias;
 def mno_unsafe_fp_atomics : Flag<["-"], "mno-unsafe-fp-atomics">,
-  Visibility<[ClangOption, FlangOption]>, 
Alias;
+  Visibility<[ClangOption]>, Alias;
 
 def faltivec : Flag<["-"], "faltivec">, Group;
 def fno_altivec : Flag<["-"], "fno-altivec">, Group;

diff  --git a/flang/include/flang/Frontend/TargetOptions.h 
b/flang/include/flang/Frontend/TargetOptions.h
index f6e5634d5a995..002d8d158abd4 100644
--- a/flang/include/flang/Frontend/TargetOptions.h
+++ b/flang/include/flang/Frontend/TargetOptions.h
@@ -53,11 +53,6 @@ class TargetOptions {
 
   /// Print verbose assembly
   bool asmVerbose = false;
-
-  /// Atomic control options
-  bool atomicIgnoreDenormalMode = false;
-  bool atomicRemoteMemory = false;
-  bool atomicFineGrainedMemory = false;
 };
 
 } // end namespace Fortran::frontend

diff  --git a/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h 
b/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h
index c0c0b744206cd..2df14f83c11e1 100644
--- a/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h
+++ b/flang/include/flang/Optimizer/Dialect/Support/FIRContext.h
@@ -58,25 +58,6 @@ void setTargetCPU(mlir::ModuleOp mod, llvm::StringRef cpu);
 /// Get the target CPU string from the Module or return a null reference.
 llvm::StringRef getTargetCPU(mlir::ModuleOp mod);
 
-/// Sets whether Denormal Mode can be ignored or not for lowering of floating
-/// point atomic operations.
-void setAtomicIgnoreDenormalMode(mlir::ModuleOp mod, bool value);
-/// Gets whether Denormal Mode can be ignored or not for lowering of floating
-/// point atomic operations.
-bool getAtomicIgnoreDenormalMode(mlir::ModuleOp mod);
-/// Sets whether fine grained memory can be used or not for lowering of atomic
-/// operations.
-void setAtomicFineGrainedMemory(mlir::ModuleOp mod, bool value);
-/// Gets whether fine grained memory can be used or not for lowering of atomic
-/// operations.
-bool getAtomicFineGrainedMemory(mlir::ModuleOp mod);
-/// Sets whether remote memory can be used or not for lowering of atomic
-/// operations.
-void setAtomicRemoteMe

[llvm-branch-commits] [llvm] Fix FileCheck prefix in the histogram test. (PR #150506)

2025-07-24 Thread Teresa Johnson via llvm-branch-commits

https://github.com/teresajohnson approved this pull request.


https://github.com/llvm/llvm-project/pull/150506


[llvm-branch-commits] [llvm] release/21.x: [RISCV] Pass sign-extended value to isInt check in expandMul (#150211) (PR #150556)

2025-07-24 Thread Sam Elliott via llvm-branch-commits

https://github.com/lenary approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/150556


[llvm-branch-commits] [llvm] AMDGPU: Handle rewriting non-tied MFMA to AGPR form (PR #149027)

2025-07-24 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/149027

>From bcdb0d78fe8c227e7b2c9b539db496950332f66b Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Fri, 11 Jul 2025 12:57:13 +0900
Subject: [PATCH] AMDGPU: Handle rewriting non-tied MFMA to AGPR form

If src2 and dst aren't the same register, to fold a copy
to AGPR into the instruction we also need to reassign src2
to an available AGPR. All the other uses of src2 also need
to be compatible with the AGPR replacement in order to avoid
inserting other copies somewhere else.

Perform this transform after verifying that all other uses are
compatible with AGPR and that an AGPR is available at
all points (which effectively means rewriting a full chain of
mfmas and load/store at once).
---
 .../AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp  | 275 ++
 ...class-vgpr-mfma-to-av-with-load-source.mir |  51 ++--
 2 files changed, 237 insertions(+), 89 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp b/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
index 8569aa7127dc3..dd87b196a24ef 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
@@ -14,12 +14,7 @@
 /// MFMA opcode.
 ///
 /// TODO:
-///  - Handle non-tied dst+src2 cases. We need to try to find a copy from an
-///AGPR from src2, or reassign src2 to an available AGPR (which should work
-///in the common case of a load).
-///
-///  - Handle multiple MFMA uses of the same register. e.g. chained MFMAs that
-///can be rewritten as a set
+///  - Handle SplitKit partial copy bundles, and not just full copy instructions
 ///
 ///  - Update LiveIntervals incrementally instead of recomputing from scratch
 ///
@@ -49,13 +44,18 @@ class AMDGPURewriteAGPRCopyMFMAImpl {
   VirtRegMap &VRM;
   LiveRegMatrix &LRM;
   LiveIntervals &LIS;
+  const RegisterClassInfo &RegClassInfo;
+
+  bool attemptReassignmentsToAGPR(SmallSetVector &InterferingRegs,
+  MCPhysReg PrefPhysReg) const;
 
 public:
   AMDGPURewriteAGPRCopyMFMAImpl(MachineFunction &MF, VirtRegMap &VRM,
-LiveRegMatrix &LRM, LiveIntervals &LIS)
+LiveRegMatrix &LRM, LiveIntervals &LIS,
+const RegisterClassInfo &RegClassInfo)
   : ST(MF.getSubtarget()), TII(*ST.getInstrInfo()),
 TRI(*ST.getRegisterInfo()), MRI(MF.getRegInfo()), VRM(VRM), LRM(LRM),
-LIS(LIS) {}
+LIS(LIS), RegClassInfo(RegClassInfo) {}
 
   bool isRewriteCandidate(const MachineInstr &MI) const {
 return TII.isMAI(MI) && AMDGPU::getMFMASrcCVDstAGPROp(MI.getOpcode()) != -1;
@@ -64,10 +64,10 @@ class AMDGPURewriteAGPRCopyMFMAImpl {
   /// Compute the register class constraints based on the uses of \p Reg,
   /// excluding uses from \p ExceptMI. This should be nearly identical to
   /// MachineRegisterInfo::recomputeRegClass.
-  const TargetRegisterClass *
-  recomputeRegClassExceptRewritable(Register Reg,
-const TargetRegisterClass *OldRC,
-const TargetRegisterClass *NewRC) const;
+  const TargetRegisterClass *recomputeRegClassExceptRewritable(
+  Register Reg, const TargetRegisterClass *OldRC,
+  const TargetRegisterClass *NewRC,
+  SmallVectorImpl &RewriteCandidates) const;
 
   bool run(MachineFunction &MF) const;
 };
@@ -75,7 +75,8 @@ class AMDGPURewriteAGPRCopyMFMAImpl {
 const TargetRegisterClass *
 AMDGPURewriteAGPRCopyMFMAImpl::recomputeRegClassExceptRewritable(
 Register Reg, const TargetRegisterClass *OldRC,
-const TargetRegisterClass *NewRC) const {
+const TargetRegisterClass *NewRC,
+SmallVectorImpl &RewriteCandidates) const {
 
   // Accumulate constraints from all uses.
   for (MachineOperand &MO : MRI.reg_nodbg_operands(Reg)) {
@@ -86,8 +87,11 @@ AMDGPURewriteAGPRCopyMFMAImpl::recomputeRegClassExceptRewritable(
 // effects of rewrite candidates. It just so happens that we can use either
 // AGPR or VGPR in src0/src1, so don't bother checking the constraint
 // effects of the individual operands.
-if (isRewriteCandidate(*MI))
+if (isRewriteCandidate(*MI)) {
+  if (!is_contained(RewriteCandidates, MI))
+RewriteCandidates.push_back(MI);
   continue;
+}
 
 unsigned OpNo = &MO - &MI->getOperand(0);
 NewRC = MI->getRegClassConstraintEffect(OpNo, NewRC, &TII, &TRI);
@@ -98,6 +102,58 @@ AMDGPURewriteAGPRCopyMFMAImpl::recomputeRegClassExceptRewritable(
   return NewRC;
 }
 
+/// Attempt to reassign the registers in \p InterferingRegs to be AGPRs, with a
+/// preference to use \p PhysReg first. Returns false if the reassignments
+/// cannot be trivially performed.
+bool AMDGPURewriteAGPRCopyMFMAImpl::attemptReassignmentsToAGPR(
+SmallSetVector &InterferingRegs, MCPhysReg PrefPhysReg) const {
+  // FIXME: The ordering may matter here

[llvm-branch-commits] [llvm] AMDGPU: Add a few missing mfma rewrite tests (PR #149026)

2025-07-24 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/149026

>From 15d9c6ac5705ebceb5c3a8656b2392caf8da6b13 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Wed, 16 Jul 2025 13:06:08 +0900
Subject: [PATCH] AMDGPU: Add a few missing mfma rewrite tests

Test other splitting situations that appear in greedy.
This includes ensuring we have a case that hits a local split
and instruction split (most of the tests hit the region split path).

Also test a few cases where the final result isn't fully used, resulting
in partial copy bundles instead of a simple full copy. Test physreg
and virtreg agpr interference with a reassignment candidate.
---
 ...class-vgpr-mfma-to-agpr-negative-tests.mir | 524 ++
 ...class-vgpr-mfma-to-av-with-load-source.mir | 404 +-
 2 files changed, 791 insertions(+), 137 deletions(-)

diff --git 
a/llvm/test/CodeGen/AMDGPU/inflate-reg-class-vgpr-mfma-to-agpr-negative-tests.mir
 
b/llvm/test/CodeGen/AMDGPU/inflate-reg-class-vgpr-mfma-to-agpr-negative-tests.mir
index 3e005df59914e..b4716a293284a 100644
--- 
a/llvm/test/CodeGen/AMDGPU/inflate-reg-class-vgpr-mfma-to-agpr-negative-tests.mir
+++ 
b/llvm/test/CodeGen/AMDGPU/inflate-reg-class-vgpr-mfma-to-agpr-negative-tests.mir
@@ -20,6 +20,10 @@
 ret void
   }
 
+  define amdgpu_kernel void 
@inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_physreg_src2() #0 {
+ret void
+  }
+
   define amdgpu_kernel void 
@inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_src2_different_subreg()
 #0 {
 ret void
   }
@@ -28,7 +32,24 @@
 ret void
   }
 
+  define amdgpu_kernel void 
@inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_chain_no_agprs_first()
 #1 {
+ret void
+  }
+
+  define amdgpu_kernel void 
@inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_chain_no_agprs_second()
 #1 {
+ret void
+  }
+
+  define amdgpu_kernel void 
@inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_chain_no_agprs_first_physreg()
 #1 {
+ret void
+  }
+
+  define amdgpu_kernel void 
@inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_chain_no_agprs_second_physreg()
 #1 {
+ret void
+  }
+
   attributes #0 = { "amdgpu-wave-limiter"="true" "amdgpu-waves-per-eu"="8,8" }
+  attributes #1 = { "amdgpu-wave-limiter"="true" "amdgpu-waves-per-eu"="10,10" 
}
 ...
 
 # Inflate pattern, except the defining instruction isn't an MFMA.
@@ -407,6 +428,89 @@ body: |
 
 ...
 
+# Non-mac variant, src2 is a physical register
+---
+name:
inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_physreg_src2
+tracksRegLiveness: true
+machineFunctionInfo:
+  isEntryFunction: true
+  stackPtrOffsetReg: '$sgpr32'
+  occupancy:   10
+  sgprForEXECCopy: '$sgpr100_sgpr101'
+body: |
+  ; CHECK-LABEL: name: 
inflate_result_to_agpr__V_MFMA_F32_32X32X8F16_vgprcd_e64_physreg_src2
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x8000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   S_NOP 0, implicit-def $agpr0
+  ; CHECK-NEXT:   renamable $sgpr0 = S_MOV_B32 0
+  ; CHECK-NEXT:   renamable $vgpr8 = V_MOV_B32_e32 0, implicit $exec
+  ; CHECK-NEXT:   renamable $sgpr1 = COPY renamable $sgpr0
+  ; CHECK-NEXT:   renamable $vgpr0_vgpr1 = COPY killed renamable $sgpr0_sgpr1
+  ; CHECK-NEXT:   renamable $vcc = S_AND_B64 $exec, -1, implicit-def dead $scc
+  ; CHECK-NEXT:   dead renamable $vgpr9 = COPY renamable $vgpr8
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.1(0x4000), %bb.2(0x4000)
+  ; CHECK-NEXT:   liveins: $vcc, $vgpr0_vgpr1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   early-clobber renamable 
$vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17
 = V_MFMA_F32_32X32X8F16_vgprcd_e64 $vgpr0_vgpr1, $vgpr0_vgpr1, undef 
$vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15,
 0, 0, 0, implicit $mode, implicit $exec
+  ; CHECK-NEXT:   S_CBRANCH_VCCNZ %bb.1, implicit $vcc
+  ; CHECK-NEXT:   S_BRANCH %bb.2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   liveins: 
$vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17:0x
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   renamable 
$agpr0_agpr1_agpr2_agpr3_agpr4_agpr5_agpr6_agpr7_agpr8_agpr9_agpr10_agpr11_agpr12_agpr13_agpr14_agpr15
 = COPY killed renamable 
$vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7_vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15_vgpr16_vgpr17
+  ; CHECK-NEXT:   S_NOP 0, implicit-def 
$vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7
+  ; CHECK-NEXT:   S_NOP 0, implicit-def 
$vgpr8_vgpr9_vgpr10_vgpr11_vgpr12_vgpr13_vgpr14_vgpr15
+  ; CHECK-NEXT:   S_NOP 0, implicit-def 
$vgpr16_vgpr17_vgpr18_vgpr19_vgpr20_vgpr21_vgpr22_vgpr23
+  ; CHECK-NEXT:   S_NOP 0, implicit-def 
$vgpr24_vgpr25_vgpr26_vgpr27_vgpr28_vgpr29_vgpr30_vgpr31
+  ; CHECK-NEXT:   S_NOP 0, implicit-def 
$vgp

[llvm-branch-commits] [clang] release/21.x: [clang-format] Add AfterNot to SpaceBeforeParensOptions (#150367) (PR #150457)

2025-07-24 Thread Björn Schäpers via llvm-branch-commits

https://github.com/HazardyKnusperkeks approved this pull request.


https://github.com/llvm/llvm-project/pull/150457


[llvm-branch-commits] [llvm] Fix FileCheck prefix in the histogram test. (PR #150506)

2025-07-24 Thread Snehasish Kumar via llvm-branch-commits

https://github.com/snehasish edited 
https://github.com/llvm/llvm-project/pull/150506


[llvm-branch-commits] [llvm] Fix FileCheck prefix in the histogram test. (PR #150506)

2025-07-24 Thread Snehasish Kumar via llvm-branch-commits

https://github.com/snehasish ready_for_review 
https://github.com/llvm/llvm-project/pull/150506


[llvm-branch-commits] [llvm] Fix FileCheck prefix in the histogram test. (PR #150506)

2025-07-24 Thread Snehasish Kumar via llvm-branch-commits

snehasish wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on Graphite
> (https://app.graphite.dev/github/pr/llvm/llvm-project/150506?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#150506** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/150506?utm_source=stack-comment-view-in-graphite)
* **#150375** (https://app.graphite.dev/github/pr/llvm/llvm-project/150375?utm_source=stack-comment-icon)
* **#147854** (https://app.graphite.dev/github/pr/llvm/llvm-project/147854?utm_source=stack-comment-icon)
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment).
Learn more about stacking: https://stacking.dev/?utm_source=stack-comment


https://github.com/llvm/llvm-project/pull/150506


[llvm-branch-commits] [clang] release/21.x: [clang-format] Fix a bug in `DerivePointerAlignment: true` (#150387) (PR #150458)

2025-07-24 Thread Björn Schäpers via llvm-branch-commits

https://github.com/HazardyKnusperkeks approved this pull request.


https://github.com/llvm/llvm-project/pull/150458


[llvm-branch-commits] [llvm] [LoopPeel] Fix branch weights' effect on block frequencies (PR #128785)

2025-07-24 Thread Joel E. Denny via llvm-branch-commits

https://github.com/jdenny-ornl edited 
https://github.com/llvm/llvm-project/pull/128785


[llvm-branch-commits] [llvm] [LoopPeel] Fix branch weights' effect on block frequencies (PR #128785)

2025-07-24 Thread Joel E. Denny via llvm-branch-commits


@@ -7866,6 +7866,17 @@ The attributes in this metadata is added to all followup 
loops of the
 loop distribution pass. See
 :ref:`Transformation Metadata ` for details.
 
+'``llvm.loop.estimated_trip_count``' Metadata
+
+
+This metadata records the loop's estimated trip count.  If it is not present, a
+loop's estimated trip count should be computed from any ``branch_weights``
+metadata attached to the latch block's branch instruction.
+
+Thus, this metadata frees loop transformations to compute latch branch weights
+solely for the purpose of maintaining accurate block frequencies instead of
+requiring the branch weights to always serve both roles.

jdenny-ornl wrote:

This is now part of PR #148758.
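
For readers following the quoted documentation, here is a minimal sketch of the fallback it describes: prefer the recorded estimate when present, otherwise derive one from the latch branch weights. It assumes the common ratio-based estimate (backedge weight divided by exit weight, rounded to nearest, plus one). The function names and the plain-integer interface are hypothetical and are not LLVM's API.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not an LLVM function): estimate a trip count from the
// two weights on a loop latch branch. Assumes the estimate is
// round(BackedgeWeight / ExitWeight) + 1. Overflow handling is omitted.
std::optional<uint64_t> tripCountFromBranchWeights(uint64_t BackedgeWeight,
                                                   uint64_t ExitWeight) {
  // Without an exit weight the ratio is undefined, so give no estimate.
  if (ExitWeight == 0)
    return std::nullopt;
  uint64_t Quotient = BackedgeWeight / ExitWeight;
  uint64_t Remainder = BackedgeWeight % ExitWeight;
  // Round half up without risking overflow in Remainder * 2.
  if (Remainder >= ExitWeight - Remainder)
    ++Quotient;
  // One more iteration than the number of backedges taken per loop entry.
  return Quotient + 1;
}

// Hypothetical helper: a recorded estimate (standing in for the metadata)
// takes precedence, and branch weights are only the fallback.
std::optional<uint64_t> estimatedTripCount(std::optional<uint64_t> Recorded,
                                           uint64_t BackedgeWeight,
                                           uint64_t ExitWeight) {
  if (Recorded)
    return Recorded;
  return tripCountFromBranchWeights(BackedgeWeight, ExitWeight);
}
```

With a recorded estimate available, a transformation could rescale the latch weights purely to keep block frequencies accurate, since the trip count no longer has to be recovered from them.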

https://github.com/llvm/llvm-project/pull/128785


[llvm-branch-commits] [llvm] [LoopPeel] Fix branch weights' effect on block frequencies (PR #128785)

2025-07-24 Thread Joel E. Denny via llvm-branch-commits


@@ -7866,6 +7866,17 @@ The attributes in this metadata is added to all followup 
loops of the
 loop distribution pass. See
 :ref:`Transformation Metadata ` for details.
 
+'``llvm.loop.estimated_trip_count``' Metadata

jdenny-ornl wrote:

This is now part of PR #148758.

https://github.com/llvm/llvm-project/pull/128785


[llvm-branch-commits] [llvm] [LoopPeel] Fix branch weights' effect on block frequencies (PR #128785)

2025-07-24 Thread Joel E. Denny via llvm-branch-commits


@@ -7866,6 +7866,17 @@ The attributes in this metadata is added to all followup 
loops of the
 loop distribution pass. See
 :ref:`Transformation Metadata ` for details.
 
+'``llvm.loop.estimated_trip_count``' Metadata
+
+

jdenny-ornl wrote:

This is now part of PR #148758.

https://github.com/llvm/llvm-project/pull/128785


[llvm-branch-commits] [llvm] [LoopPeel] Fix branch weights' effect on block frequencies (PR #128785)

2025-07-24 Thread Joel E. Denny via llvm-branch-commits


@@ -850,27 +852,35 @@ llvm::getLoopEstimatedTripCount(Loop *L,
 getEstimatedTripCount(LatchBranch, L, ExitWeight)) {
   if (EstimatedLoopInvocationWeight)
 *EstimatedLoopInvocationWeight = ExitWeight;
+  if (auto EstimatedTripCount =

jdenny-ornl wrote:

I have made PR #148758 the base for this PR, which is now much simpler.

https://github.com/llvm/llvm-project/pull/128785


[llvm-branch-commits] [clang] [clang] callee_type metadata for indirect calls (PR #117036)

2025-07-24 Thread Paul Kirth via llvm-branch-commits


@@ -2869,9 +2870,23 @@ static void setLinkageForGV(llvm::GlobalValue *GV, const 
NamedDecl *ND) {
 GV->setLinkage(llvm::GlobalValue::ExternalWeakLinkage);
 }
 
+static bool hasExistingGeneralizedTypeMD(llvm::Function *F) {
+  llvm::MDNode *MD = F->getMetadata(llvm::LLVMContext::MD_type);
+  if (!MD)
+return false;
+  return MD->hasGeneralizedMDString();
+}
+
 void CodeGenModule::createFunctionTypeMetadataForIcall(const FunctionDecl *FD,
llvm::Function *F) {
-  // Only if we are checking indirect calls.
+  if (CodeGenOpts.CallGraphSection && !hasExistingGeneralizedTypeMD(F) &&
+  (!F->hasLocalLinkage() ||
+   F->getFunction().hasAddressTaken(nullptr, /*IgnoreCallbackUses=*/true,
+/*IgnoreAssumeLikeCalls=*/true,
+/*IgnoreLLVMUsed=*/false)))
+F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));

ilovepi wrote:

I think this still needs to be addressed...

https://github.com/llvm/llvm-project/pull/117036


[llvm-branch-commits] [llvm] [llvm][AsmPrinter] Emit call graph section (PR #87576)

2025-07-24 Thread Paul Kirth via llvm-branch-commits


@@ -1,40 +1,43 @@
 ;; Test if temporary labels are generated for each indirect callsite with a 
callee_type metadata.
-;; Test if the .callgraph section contains the numerical callee type id for 
each of the temporary 
-;; labels generated. 
+;; Test if the .callgraph section contains the MD5 hash of callee type ids 
generated from
+;; generalized type id strings.
 
 ; RUN: llc -mtriple=x86_64-unknown-linux --call-graph-section -o - < %s | 
FileCheck %s
 
 ; CHECK: ball:
-; CHECK-NEXT: .Lfunc_begin0:
+; CHECK-NEXT: [[LABEL_FUNC:\.Lfunc_begin[0-9]+]]:
 define ptr @ball() {
 entry:
   %fp_foo_val = load ptr, ptr null, align 8
-   ; CHECK: .Ltmp0:
+   ; CHECK: [[LABEL_TMP0:\.Ltmp[0-9]+]]:

ilovepi wrote:

Do you care that it's `.Ltmp`? I'd assume you just want to match anything after 
`.L`...
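
To illustrate the difference being asked about, here is a rough sketch comparing the strict pattern from the test with a looser one that accepts any local label after `.L`. It uses `std::regex` (ECMAScript syntax) only as a stand-in, whereas FileCheck has its own regex engine. The helper names and the looser pattern are assumptions for illustration, not something proposed in the PR.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: accepts only labels of the form ".Ltmp<N>:", which
// mirrors the [[LABEL_TMP0:\.Ltmp[0-9]+]] capture in the test.
bool matchesTmpLabel(const std::string &Line) {
  static const std::regex Pattern(R"(\.Ltmp[0-9]+:)");
  return std::regex_match(Line, Pattern);
}

// Hypothetical helper: accepts any label that starts with ".L", so the check
// does not depend on the exact temporary-label naming scheme.
bool matchesAnyLocalLabel(const std::string &Line) {
  static const std::regex Pattern(R"(\.L[A-Za-z0-9_]+:)");
  return std::regex_match(Line, Pattern);
}
```

The looser form would keep passing if the label prefix changed, at the cost of also matching labels such as `.Lfunc_begin0:` that the test may want to tell apart.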

https://github.com/llvm/llvm-project/pull/87576

