r374042 - [SVE][IR] Scalable Vector size queries and IR instruction support

2019-10-08 Thread Graham Hunter via cfe-commits
Author: huntergr
Date: Tue Oct  8 05:53:54 2019
New Revision: 374042

URL: http://llvm.org/viewvc/llvm-project?rev=374042&view=rev
Log:
[SVE][IR] Scalable Vector size queries and IR instruction support

* Adds a TypeSize struct to represent the known minimum size of a type
  along with a flag to indicate that the runtime size is an integer multiple
  of that size
* Converts existing size query functions from Type.h and DataLayout.h to
  return a TypeSize result
* Adds convenience methods (including a transparent conversion operator
  to uint64_t) so that most existing code 'just works' as if the return
  values were still scalars.
* Uses the new size queries along with ElementCount to ensure that all
  supported instructions used with scalable vectors can be constructed
  in IR.

Reviewers: hfinkel, lattner, rkruppe, greened, rovka, rengolin, sdesmalen

Reviewed By: rovka, sdesmalen

Differential Revision: https://reviews.llvm.org/D53137

Modified:
cfe/trunk/lib/CodeGen/CGCall.cpp
cfe/trunk/lib/CodeGen/CGStmt.cpp
cfe/trunk/lib/CodeGen/CodeGenFunction.cpp

Modified: cfe/trunk/lib/CodeGen/CGCall.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGCall.cpp?rev=374042&r1=374041&r2=374042&view=diff
==============================================================================
--- cfe/trunk/lib/CodeGen/CGCall.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGCall.cpp Tue Oct  8 05:53:54 2019
@@ -4277,8 +4277,8 @@ RValue CodeGenFunction::EmitCall(const C
   // Update the largest vector width if any arguments have vector types.
   for (unsigned i = 0; i < IRCallArgs.size(); ++i) {
    if (auto *VT = dyn_cast<llvm::VectorType>(IRCallArgs[i]->getType()))
-  LargestVectorWidth = std::max(LargestVectorWidth,
-VT->getPrimitiveSizeInBits());
+  LargestVectorWidth = std::max((uint64_t)LargestVectorWidth,
+VT->getPrimitiveSizeInBits().getFixedSize());
   }
 
   // Compute the calling convention and attributes.
@@ -4361,8 +4361,8 @@ RValue CodeGenFunction::EmitCall(const C
 
   // Update largest vector width from the return type.
  if (auto *VT = dyn_cast<llvm::VectorType>(CI->getType()))
-LargestVectorWidth = std::max(LargestVectorWidth,
-  VT->getPrimitiveSizeInBits());
+LargestVectorWidth = std::max((uint64_t)LargestVectorWidth,
+  VT->getPrimitiveSizeInBits().getFixedSize());
 
   // Insert instrumentation or attach profile metadata at indirect call sites.
   // For more details, see the comment before the definition of

Modified: cfe/trunk/lib/CodeGen/CGStmt.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGStmt.cpp?rev=374042&r1=374041&r2=374042&view=diff
==============================================================================
--- cfe/trunk/lib/CodeGen/CGStmt.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGStmt.cpp Tue Oct  8 05:53:54 2019
@@ -2073,8 +2073,8 @@ void CodeGenFunction::EmitAsmStmt(const
 
   // Update largest vector width for any vector types.
  if (auto *VT = dyn_cast<llvm::VectorType>(ResultRegTypes.back()))
-LargestVectorWidth = std::max(LargestVectorWidth,
-  VT->getPrimitiveSizeInBits());
+LargestVectorWidth = std::max((uint64_t)LargestVectorWidth,
+  VT->getPrimitiveSizeInBits().getFixedSize());
 } else {
   ArgTypes.push_back(Dest.getAddress().getType());
   Args.push_back(Dest.getPointer());
@@ -2098,8 +2098,8 @@ void CodeGenFunction::EmitAsmStmt(const
 
   // Update largest vector width for any vector types.
  if (auto *VT = dyn_cast<llvm::VectorType>(Arg->getType()))
-LargestVectorWidth = std::max(LargestVectorWidth,
-  VT->getPrimitiveSizeInBits());
+LargestVectorWidth = std::max((uint64_t)LargestVectorWidth,
+  VT->getPrimitiveSizeInBits().getFixedSize());
   if (Info.allowsRegister())
 InOutConstraints += llvm::utostr(i);
   else
@@ -2185,8 +2185,8 @@ void CodeGenFunction::EmitAsmStmt(const
 
 // Update largest vector width for any vector types.
if (auto *VT = dyn_cast<llvm::VectorType>(Arg->getType()))
-  LargestVectorWidth = std::max(LargestVectorWidth,
-VT->getPrimitiveSizeInBits());
+  LargestVectorWidth = std::max((uint64_t)LargestVectorWidth,
+VT->getPrimitiveSizeInBits().getFixedSize());
 
 ArgTypes.push_back(Arg->getType());
 Args.push_back(Arg);

Modified: cfe/trunk/lib/CodeGen/CodeGenFunction.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenFunction.cpp?rev=374042&r1=374041&r2=374042&view=diff
==============================================================================
--- cfe/trunk/lib/CodeGen/CodeGenFunction.cpp (original)
+++ cfe/trunk/lib/CodeGen/CodeGenFunction.cpp Tue Oct  8 05:53:54 2019
@@ -431,13 +431,13 @@ void CodeGenFunction::F

[flang] [libc] [llvm] [clang-tools-extra] [clang] [lldb] [libcxx] [compiler-rt] [InstCombine] Fold converted urem to 0 if there's no overlapping bits (PR #71528)

2023-11-08 Thread Graham Hunter via cfe-commits

https://github.com/huntergr-arm updated 
https://github.com/llvm/llvm-project/pull/71528

From 754519ad9b37343c827504e7d6bfcfa590f69483 Mon Sep 17 00:00:00 2001
From: Graham Hunter 
Date: Fri, 3 Nov 2023 14:22:57 +
Subject: [PATCH] [InstCombine] Fold converted urem to 0 if there's no
 overlapping bits

When folding urem instructions we can end up not recognizing that
the output will always be 0 due to Value*s being different, despite
generating the same data (in this case, 2 different calls to vscale).

This patch recognizes the (x << N) & (add (x << M), -1) pattern that
instcombine replaces urem with after the two vscale calls have been
reduced to one via CSE, and replaces it with 0 when x is a non-zero
power of 2 and N >= M.
---
 .../InstCombine/InstCombineAndOrXor.cpp   | 10 
 .../InstCombine/po2-shift-add-and-to-zero.ll  | 52 +++
 2 files changed, 62 insertions(+)
 create mode 100644 llvm/test/Transforms/InstCombine/po2-shift-add-and-to-zero.ll

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
index 46af9bf5eed003a..da38f8039dbc3ca 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
@@ -2662,6 +2662,16 @@ Instruction *InstCombinerImpl::visitAnd(BinaryOperator &I) {
   if (sinkNotIntoOtherHandOfLogicalOp(I))
 return &I;
 
+  // (x << N) & (add (x << M), -1) --> 0, where x is known to be a non-zero
+  // power of 2 and M <= N.
+  const APInt *Shift1, *Shift2;
+  if (match(&I, m_c_And(m_OneUse(m_Shl(m_Value(X), m_APInt(Shift1))),
+                        m_OneUse(m_Add(m_Shl(m_Value(Y), m_APInt(Shift2)),
+                                       m_AllOnes())))) &&
+      X == Y && isKnownToBeAPowerOfTwo(X, /*OrZero*/ false, 0, &I) &&
+      Shift1->uge(*Shift2))
+    return replaceInstUsesWith(I, Constant::getNullValue(I.getType()));
+
   // An and recurrence w/loop invariant step is equivelent to (and start, step)
   PHINode *PN = nullptr;
   Value *Start = nullptr, *Step = nullptr;
diff --git a/llvm/test/Transforms/InstCombine/po2-shift-add-and-to-zero.ll b/llvm/test/Transforms/InstCombine/po2-shift-add-and-to-zero.ll
new file mode 100644
index 000..4979e7a01972299
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/po2-shift-add-and-to-zero.ll
@@ -0,0 +1,52 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 2
+; RUN: opt -mtriple unknown -passes=instcombine -S < %s | FileCheck %s
+
+;; The and X, (add Y, -1) pattern is from an earlier instcombine pass which
+;; converted
+
+;; define dso_local i64 @f1() local_unnamed_addr #0 {
+;; entry:
+;;   %0 = call i64 @llvm.aarch64.sve.cntb(i32 31)
+;;   %1 = call i64 @llvm.aarch64.sve.cnth(i32 31)
+;;   %rem = urem i64 %0, %1
+;;   ret i64 %rem
+;; }
+
+;; into
+
+;; define dso_local i64 @f1() local_unnamed_addr #0 {
+;; entry:
+;;   %0 = call i64 @llvm.vscale.i64()
+;;   %1 = shl nuw nsw i64 %0, 4
+;;   %2 = call i64 @llvm.vscale.i64()
+;;   %3 = shl nuw nsw i64 %2, 3
+;;   %4 = add nsw i64 %3, -1
+;;   %rem = and i64 %1, %4
+;;   ret i64 %rem
+;; }
+
+;; InstCombine would have folded the original to returning 0 if the vscale
+;; calls were the same Value*, but since there are two of them it doesn't
+;; work and we convert the urem to add/and. CSE then gets rid of the extra
+;; vscale, leaving us with a new pattern to match. This only works because
+;; vscale is known to be a nonzero power of 2 (assuming there's a defined
+;; range for it).
+
+define dso_local i64 @f1() local_unnamed_addr #0 {
+; CHECK-LABEL: define dso_local i64 @f1
+; CHECK-SAME: () local_unnamed_addr #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:ret i64 0
+;
+entry:
+  %0 = call i64 @llvm.vscale.i64()
+  %1 = shl nuw nsw i64 %0, 4
+  %2 = shl nuw nsw i64 %0, 3
+  %3 = add nsw i64 %2, -1
+  %rem = and i64 %1, %3
+  ret i64 %rem
+}
+
+declare i64 @llvm.vscale.i64()
+
+attributes #0 = { vscale_range(1,16) }

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] a83ce95 - [clang] Remove unused capture in closure

2021-06-22 Thread Graham Hunter via cfe-commits

Author: Graham Hunter
Date: 2021-06-22T15:09:39+01:00
New Revision: a83ce95b097689f4ebd2c3b45c0778d0b6946d00

URL: 
https://github.com/llvm/llvm-project/commit/a83ce95b097689f4ebd2c3b45c0778d0b6946d00
DIFF: 
https://github.com/llvm/llvm-project/commit/a83ce95b097689f4ebd2c3b45c0778d0b6946d00.diff

LOG: [clang] Remove unused capture in closure

c6a91ee6 removed uses of IsMonotonic from OpenMP SIMD codegen,
but that left a capture of the variable unused which upset buildbots
using -Werror.

Added: 


Modified: 
clang/lib/CodeGen/CGStmtOpenMP.cpp

Removed: 




diff  --git a/clang/lib/CodeGen/CGStmtOpenMP.cpp b/clang/lib/CodeGen/CGStmtOpenMP.cpp
index 3b2b70f388cc..ba497a5b9d3a 100644
--- a/clang/lib/CodeGen/CGStmtOpenMP.cpp
+++ b/clang/lib/CodeGen/CGStmtOpenMP.cpp
@@ -3197,7 +3197,7 @@ bool CodeGenFunction::EmitOMPWorksharingLoop(
 getJumpDestInCurrentScope(createBasicBlock("omp.loop.exit"));
 emitCommonSimdLoop(
 *this, S,
-[&S, IsMonotonic](CodeGenFunction &CGF, PrePostActionTy &) {
+[&S](CodeGenFunction &CGF, PrePostActionTy &) {
   if (isOpenMPSimdDirective(S.getDirectiveKind())) {
 CGF.EmitOMPSimdInit(S);
    } else if (const auto *C = S.getSingleClause<OMPOrderClause>()) {





[clang] ad49765 - [OpenMP] Allow const parameters in declare simd linear clause

2020-03-02 Thread Graham Hunter via cfe-commits

Author: Graham Hunter
Date: 2020-03-02T14:54:14Z
New Revision: ad497658d25a3616e4c57cf7d12e3497a1c66f35

URL: 
https://github.com/llvm/llvm-project/commit/ad497658d25a3616e4c57cf7d12e3497a1c66f35
DIFF: 
https://github.com/llvm/llvm-project/commit/ad497658d25a3616e4c57cf7d12e3497a1c66f35.diff

LOG: [OpenMP] Allow const parameters in declare simd linear clause

Reviewers: ABataev, kkwli0, jdoerfert, fpetrogalli

Reviewed By: ABataev, fpetrogalli

Differential Revision: https://reviews.llvm.org/D75350

Added: 


Modified: 
clang/include/clang/Sema/Sema.h
clang/lib/Sema/SemaOpenMP.cpp
clang/test/OpenMP/declare_simd_aarch64.c
clang/test/OpenMP/declare_simd_codegen.cpp

Removed: 




diff  --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 2d7676e23cfc..f1dfe411983a 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -10173,7 +10173,8 @@ class Sema final {
   /// Checks that the specified declaration matches requirements for the linear
   /// decls.
   bool CheckOpenMPLinearDecl(const ValueDecl *D, SourceLocation ELoc,
- OpenMPLinearClauseKind LinKind, QualType Type);
+ OpenMPLinearClauseKind LinKind, QualType Type,
+ bool IsDeclareSimd = false);
 
   /// Called on well-formed '\#pragma omp declare simd' after parsing of
   /// the associated method/function.

diff  --git a/clang/lib/Sema/SemaOpenMP.cpp b/clang/lib/Sema/SemaOpenMP.cpp
index 6aaf70918d02..de732577c81b 100644
--- a/clang/lib/Sema/SemaOpenMP.cpp
+++ b/clang/lib/Sema/SemaOpenMP.cpp
@@ -5269,7 +5269,8 @@ Sema::DeclGroupPtrTy Sema::ActOnOpenMPDeclareSimdDirective(
   E->containsUnexpandedParameterPack())
 continue;
   (void)CheckOpenMPLinearDecl(CanonPVD, E->getExprLoc(), LinKind,
-  PVD->getOriginalType());
+  PVD->getOriginalType(),
+  /*IsDeclareSimd=*/true);
   continue;
 }
   }
@@ -5289,7 +5290,7 @@ Sema::DeclGroupPtrTy Sema::ActOnOpenMPDeclareSimdDirective(
   E->isInstantiationDependent() || E->containsUnexpandedParameterPack())
 continue;
   (void)CheckOpenMPLinearDecl(/*D=*/nullptr, E->getExprLoc(), LinKind,
-  E->getType());
+  E->getType(), /*IsDeclareSimd=*/true);
   continue;
 }
 Diag(E->getExprLoc(), diag::err_omp_param_or_this_in_clause)
@@ -14547,8 +14548,8 @@ bool Sema::CheckOpenMPLinearModifier(OpenMPLinearClauseKind LinKind,
 }
 
 bool Sema::CheckOpenMPLinearDecl(const ValueDecl *D, SourceLocation ELoc,
- OpenMPLinearClauseKind LinKind,
- QualType Type) {
+ OpenMPLinearClauseKind LinKind, QualType Type,
+ bool IsDeclareSimd) {
  const auto *VD = dyn_cast_or_null<VarDecl>(D);
   // A variable must not have an incomplete type or a reference type.
   if (RequireCompleteType(ELoc, Type, diag::err_omp_linear_incomplete_type))
@@ -14564,8 +14565,10 @@ bool Sema::CheckOpenMPLinearDecl(const ValueDecl *D, SourceLocation ELoc,
   // OpenMP 5.0 [2.19.3, List Item Privatization, Restrictions]
   // A variable that is privatized must not have a const-qualified type
   // unless it is of class type with a mutable member. This restriction does
-  // not apply to the firstprivate clause.
-  if (rejectConstNotMutableType(*this, D, Type, OMPC_linear, ELoc))
+  // not apply to the firstprivate clause, nor to the linear clause on
+  // declarative directives (like declare simd).
+  if (!IsDeclareSimd &&
+  rejectConstNotMutableType(*this, D, Type, OMPC_linear, ELoc))
 return true;
 
   // A list item must be of integral or pointer type.

diff  --git a/clang/test/OpenMP/declare_simd_aarch64.c b/clang/test/OpenMP/declare_simd_aarch64.c
index eff0eed07dfe..4af2ad9bb603 100644
--- a/clang/test/OpenMP/declare_simd_aarch64.c
+++ b/clang/test/OpenMP/declare_simd_aarch64.c
@@ -116,6 +116,15 @@ double c02(double *x, char y);
 // AARCH64: "_ZGVnM16uv_c02" "_ZGVnM8uv_c02"
 // AARCH64-NOT: c02
 
+//
+/* Linear with a constant parameter */
+//
+
+#pragma omp declare simd notinbranch linear(i)
+double constlinear(const int i);
+// AARCH64: "_ZGVnN2l_constlinear" "_ZGVnN4l_constlinear"
+// AARCH64-NOT: constlinear
+
 /*/
 /* sincos-like signature */
 /*/
@@ -170,6 +179,7 @@ void do_something() {
   D = b03(D);
   *I = c01(D, *S);
   *D = c02(D, *S);
+  constlinear(*I);
   sincos(*D, D, D);
   SinCos(*D, D, D);
   foo2(I, *I);

diff  --git a/clang/test/OpenMP/declare_simd_codegen.cpp b/clang/test/OpenMP/declare_simd_codegen.cpp

[clang] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-09-03 Thread Graham Hunter via cfe-commits


@@ -2778,6 +2808,60 @@ void VPWidenPointerInductionRecipe::print(raw_ostream &O, const Twine &Indent,
 }
 #endif
 
+void VPAliasLaneMaskRecipe::execute(VPTransformState &State) {
+  IRBuilderBase Builder = State.Builder;
+  Value *SinkValue = State.get(getSinkValue(), 0, true);
+  Value *SourceValue = State.get(getSourceValue(), 0, true);
+
+  unsigned ElementSize = 0;
+  auto *ReadInsn = cast<Instruction>(SourceValue);
+  auto *ReadCast = dyn_cast<CastInst>(SourceValue);
+  if (ReadInsn->getOpcode() == Instruction::Add)
+ReadCast = dyn_cast<CastInst>(ReadInsn->getOperand(0));
+
+  if (ReadCast && ReadCast->getOpcode() == Instruction::PtrToInt) {
+Value *Ptr = ReadCast->getOperand(0);
+for (auto *Use : Ptr->users()) {
+  if (auto *GEP = dyn_cast<GetElementPtrInst>(Use)) {
+auto *EltVT = GEP->getSourceElementType();
+if (EltVT->isArrayTy())
+  ElementSize = EltVT->getArrayElementType()->getScalarSizeInBits() *
+EltVT->getArrayNumElements();
+else
+  ElementSize = GEP->getSourceElementType()->getScalarSizeInBits() / 8;
+break;
+  }
+}
+  }
+  assert(ElementSize > 0 && "Couldn't get element size from pointer");

huntergr-arm wrote:

This assert fires when compiling 
`MicroBenchmarks/ImageProcessing/Dilate/CMakeFiles/Dilate.dir/dilateKernel.c` 
from the test suite, so that needs fixing at least.

But I think we shouldn't be trying to figure out the element size at recipe 
execution time; it should be determined when trying to build the recipe and the 
value stored as part of the class. If we can't find the size then we stop 
trying to build the recipe and fall back to normal RT checks. Does that sound 
sensible to you?

https://github.com/llvm/llvm-project/pull/100579


[clang] [llvm] [AArch64][SVE] Refactor getPTrue to return splat(1) when pattern=all. (PR #139236)

2025-05-09 Thread Graham Hunter via cfe-commits


@@ -25030,7 +25030,8 @@ static SDValue foldCSELofLASTB(SDNode *Op, SelectionDAG 
&DAG) {
   if (AnyPred.getOpcode() == AArch64ISD::REINTERPRET_CAST)
 AnyPred = AnyPred.getOperand(0);
 
-  if (TruePred != AnyPred && TruePred.getOpcode() != AArch64ISD::PTRUE)
+  if (TruePred != AnyPred && TruePred.getOpcode() != AArch64ISD::PTRUE &&
+  !ISD::isConstantSplatVectorAllOnes(TruePred.getNode()))

huntergr-arm wrote:

I agree with Paul that we can remove the opcode check; I have a vague memory of 
looking for a helper function to determine whether the mask was all-true (or 
better, whether the ptest was redundant), not finding one, and just sticking a 
check for PTRUE in to get my tests up and running. Sadly I didn't revisit that 
before committing.

https://github.com/llvm/llvm-project/pull/139236


[clang] [llvm] [LLVM][IRBuilder] Use NUW arithmetic for Create{ElementCount,TypeSize}. (PR #143532)

2025-06-10 Thread Graham Hunter via cfe-commits

https://github.com/huntergr-arm approved this pull request.

LGTM; I do think it's worth documenting the assumption/requirement about a 
sufficiently large type in the comments on the declarations for 
CreateElementCount and CreateTypeSize.

https://github.com/llvm/llvm-project/pull/143532