[llvm-branch-commits] [clang] release/19.x: [Clang] Demote always_inline error to warning for mismatching SME attrs (#100740) (PR #100987)

2024-07-29 Thread Sander de Smalen via llvm-branch-commits

https://github.com/sdesmalen-arm requested changes to this pull request.

For some odd reason, `clang/test/CodeGen/aarch64-sme-inline-streaming-attrs.c` 
seems to be failing on some buildbots with an error that says:
> unable to create target: No available targets are compatible with triple 
> "aarch64-none-linux-gnu"

I suspect this is because the test is missing a `REQUIRES: 
aarch64-registered-target` line, but I'm puzzled why that test didn't fail 
before, because my patch doesn't introduce the test and doesn't change the 
RUN lines; all this patch does is change one of the diagnostic messages.

In any case, I seem to have jumped the gun creating this cherry-pick in the 
first place; I thought the change was trivial enough to do so, especially 
after testing locally. My apologies for the noise.

I'll revert the patch and fix the issue, and will then create another 
cherry-pick.

https://github.com/llvm/llvm-project/pull/100987
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] release/19.x: [AArch64][SME] Rewrite __arm_get_current_vg to preserve required registers (#100143) (PR #100546)

2024-07-25 Thread Sander de Smalen via llvm-branch-commits

sdesmalen-arm wrote:

It would be great if we could merge this fix into the release branch!

https://github.com/llvm/llvm-project/pull/100546


[llvm-branch-commits] [llvm] [AArch64] Improve cost model for legal subvec insert/extract (PR #81135)

2024-02-15 Thread Sander de Smalen via llvm-branch-commits


@@ -568,6 +568,48 @@ AArch64TTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
 }
 return Cost;
   }
+  case Intrinsic::vector_extract: {
+    // If both the vector argument and the return type are legal types and the
+    // index is 0, then this should be a no-op or simple operation; return a
+    // relatively low cost.
+
+    // If arguments aren't actually supplied, then we cannot determine the
+    // value of the index.
+    if (ICA.getArgs().size() < 2)
+      break;
+    LLVMContext &C = RetTy->getContext();
+    EVT MRTy = getTLI()->getValueType(DL, RetTy);
+    EVT MPTy = getTLI()->getValueType(DL, ICA.getArgTypes()[0]);
+    TargetLoweringBase::LegalizeKind RLK = getTLI()->getTypeConversion(C, MRTy);
+    TargetLoweringBase::LegalizeKind PLK = getTLI()->getTypeConversion(C, MPTy);
+    const ConstantInt *Idx = dyn_cast<ConstantInt>(ICA.getArgs()[1]);
+    if (RLK.first == TargetLoweringBase::TypeLegal &&
+        PLK.first == TargetLoweringBase::TypeLegal && Idx &&
+        Idx->getZExtValue() == 0)
+      return InstructionCost(1);

sdesmalen-arm wrote:

Is there a reason this wouldn't be zero-cost?

Also, stylistically to match the rest of this file, maybe return 
`TTI::TCC_Free` (if this is considered a cost of 0) or `TTI::TCC_Basic` (if 
this is considered a cost of 1) instead?
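To illustrate the style point, here is a standalone sketch (hypothetical names and fallback cost, not the actual LLVM implementation) of a cost query that returns a TCC-style constant for the cheap case:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for LLVM's types; names are illustrative only.
enum LegalizeKind { TypeLegal, TypeExpand };
enum TargetCostConstants { TCC_Free = 0, TCC_Basic = 1 };

// Sketch of the cost query: a subvector extract from index 0 where both
// the source and result types are already legal is modelled as cheap.
int vectorExtractCost(LegalizeKind RetKind, LegalizeKind ParamKind,
                      std::optional<uint64_t> Index) {
  // Without a constant index we cannot reason about the operation.
  if (!Index)
    return 10; // fall back to a conservative default cost (made up here)
  if (RetKind == TypeLegal && ParamKind == TypeLegal && *Index == 0)
    return TCC_Basic; // or TCC_Free if the extract is truly a no-op
  return 10;
}
```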

https://github.com/llvm/llvm-project/pull/81135


[llvm-branch-commits] [llvm] [AArch64] Improve cost model for legal subvec insert/extract (PR #81135)

2024-02-15 Thread Sander de Smalen via llvm-branch-commits


@@ -568,6 +568,48 @@ AArch64TTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
 }
 return Cost;
   }
+  case Intrinsic::vector_extract: {
+    // If both the vector argument and the return type are legal types and the
+    // index is 0, then this should be a no-op or simple operation; return a
+    // relatively low cost.
+
+    // If arguments aren't actually supplied, then we cannot determine the
+    // value of the index.
+    if (ICA.getArgs().size() < 2)
sdesmalen-arm wrote:

nit:
```suggestion
if (ICA.getArgs().size() != 2)
```

https://github.com/llvm/llvm-project/pull/81135


[llvm-branch-commits] [llvm] [AArch64] Improve cost model for legal subvec insert/extract (PR #81135)

2024-02-15 Thread Sander de Smalen via llvm-branch-commits


@@ -568,6 +568,32 @@ AArch64TTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
 }
 return Cost;
   }
+  case Intrinsic::vector_extract: {
+    // If both the vector argument and the return type are legal types, then
+    // this should be a no-op or simple operation; return a relatively low
+    // cost.
+    LLVMContext &C = RetTy->getContext();
+    EVT MRTy = getTLI()->getValueType(DL, RetTy);
+    EVT MPTy = getTLI()->getValueType(DL, ICA.getArgTypes()[0]);
+    TargetLoweringBase::LegalizeKind RLK = getTLI()->getTypeConversion(C, MRTy);
+    TargetLoweringBase::LegalizeKind PLK = getTLI()->getTypeConversion(C, MPTy);
+    if (RLK.first == TargetLoweringBase::TypeLegal &&
+        PLK.first == TargetLoweringBase::TypeLegal)
+      return InstructionCost(1);

sdesmalen-arm wrote:

Just pointing out that the code isn't updated yet to handle predicates 
differently, as those inserts/extracts are indeed not free.

https://github.com/llvm/llvm-project/pull/81135


[llvm-branch-commits] [clang] [llvm] release/18.x: [AArch64][SME] Implement inline-asm clobbers for za/zt0 (#79276) (PR #81616)

2024-02-14 Thread Sander de Smalen via llvm-branch-commits

https://github.com/sdesmalen-arm approved this pull request.

Looks pretty low-risk to me and would be nice to get into the release if we can.

(how is this PR different from #81593?)

https://github.com/llvm/llvm-project/pull/81616


[llvm-branch-commits] [clang] [llvm] release/18.x: [AArch64][SME] Implement inline-asm clobbers for za/zt0 (#79276) (PR #81593)

2024-02-13 Thread Sander de Smalen via llvm-branch-commits

https://github.com/sdesmalen-arm approved this pull request.

Looks pretty low-risk to me and would be nice to get into the release if we can.

https://github.com/llvm/llvm-project/pull/81593


[llvm-branch-commits] [llvm] 329fda3 - NFC: Mention auto-vec support for SVE in release notes.

2022-03-15 Thread Sander de Smalen via llvm-branch-commits

Author: Sander de Smalen
Date: 2022-03-14T09:44:55Z
New Revision: 329fda39c507e8740978d10458451dcdb21563be

URL: 
https://github.com/llvm/llvm-project/commit/329fda39c507e8740978d10458451dcdb21563be
DIFF: 
https://github.com/llvm/llvm-project/commit/329fda39c507e8740978d10458451dcdb21563be.diff

LOG: NFC: Mention auto-vec support for SVE in release notes.

Added: 


Modified: 
llvm/docs/ReleaseNotes.rst

Removed: 




diff  --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index e8934f79181a7..b2d8d8c2640e2 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -82,6 +82,7 @@ Changes to the AArch64 Backend
   "tune-cpu" attribute is absent it tunes according to the "target-cpu".
 * Fixed relocations against temporary symbols (e.g. in jump tables and
   constant pools) in large COFF object files.
+* Auto-vectorization now targets SVE by default when available.
 
 Changes to the ARM Backend
 --





[llvm-branch-commits] [llvm] 171d124 - [SLPVectorizer] NFC: Migrate getVectorCallCosts to use InstructionCost.

2021-01-25 Thread Sander de Smalen via llvm-branch-commits

Author: Sander de Smalen
Date: 2021-01-25T12:27:01Z
New Revision: 171d12489f20818e292362342b5665c689073ad2

URL: 
https://github.com/llvm/llvm-project/commit/171d12489f20818e292362342b5665c689073ad2
DIFF: 
https://github.com/llvm/llvm-project/commit/171d12489f20818e292362342b5665c689073ad2.diff

LOG: [SLPVectorizer] NFC: Migrate getVectorCallCosts to use InstructionCost.

This change also changes getReductionCost to return InstructionCost,
and it simplifies two expressions by removing a redundant 'isValid' check.
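For context, the reason the `isValid` check becomes redundant can be shown with a toy cost type (illustrative only, assuming, as in LLVM's `InstructionCost`, that an invalid cost orders after every valid one):

```cpp
#include <cassert>
#include <cstdint>

// Toy cost type (not llvm::InstructionCost): an invalid cost orders after
// every valid one, because Valid = 0 sorts before Invalid = 1 when the
// states differ. With this ordering, `Cost < -Threshold` is already false
// for an invalid cost, so a separate isValid() guard adds nothing.
class Cost {
  int64_t Value;
  enum State { Valid = 0, Invalid = 1 } St;

public:
  Cost(int64_t V) : Value(V), St(Valid) {}
  static Cost getInvalid() {
    Cost C(0);
    C.St = Invalid;
    return C;
  }
  bool isValid() const { return St == Valid; }
  bool operator<(const Cost &Other) const {
    if (St != Other.St)
      return St < Other.St; // any valid cost sorts below an invalid one
    return Value < Other.Value;
  }
};
```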

Added: 


Modified: 
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp 
b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 7114b4d412fd..0b630197911a 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -3411,21 +3411,21 @@ bool BoUpSLP::areAllUsersVectorized(Instruction *I) 
const {
  });
 }
 
-static std::pair<int, int>
+static std::pair<InstructionCost, InstructionCost>
 getVectorCallCosts(CallInst *CI, FixedVectorType *VecTy,
TargetTransformInfo *TTI, TargetLibraryInfo *TLI) {
   Intrinsic::ID ID = getVectorIntrinsicIDForCall(CI, TLI);
 
   // Calculate the cost of the scalar and vector calls.
   IntrinsicCostAttributes CostAttrs(ID, *CI, VecTy->getElementCount());
-  int IntrinsicCost =
+  auto IntrinsicCost =
 TTI->getIntrinsicInstrCost(CostAttrs, TTI::TCK_RecipThroughput);
 
   auto Shape = VFShape::get(*CI, ElementCount::getFixed(static_cast<unsigned>(
                                      VecTy->getNumElements())),
                             false /*HasGlobalPred*/);
   Function *VecFunc = VFDatabase(*CI).getVectorizedFunction(Shape);
-  int LibCost = IntrinsicCost;
+  auto LibCost = IntrinsicCost;
   if (!CI->isNoBuiltin() && VecFunc) {
 // Calculate the cost of the vector library call.
 SmallVector VecTys;
@@ -5994,7 +5994,7 @@ bool SLPVectorizerPass::vectorizeStoreChain(ArrayRef<Value *> Chain, BoUpSLP &R,
   InstructionCost Cost = R.getTreeCost();
 
   LLVM_DEBUG(dbgs() << "SLP: Found cost = " << Cost << " for VF =" << VF << 
"\n");
-  if (Cost.isValid() && Cost < -SLPCostThreshold) {
+  if (Cost < -SLPCostThreshold) {
 LLVM_DEBUG(dbgs() << "SLP: Decided to vectorize cost = " << Cost << "\n");
 
 using namespace ore;
@@ -6295,7 +6295,7 @@ bool SLPVectorizerPass::tryToVectorizeList(ArrayRef<Value *> VL, BoUpSLP &R,
 
   MinCost = std::min(MinCost, Cost);
 
-  if (Cost.isValid() && Cost < -SLPCostThreshold) {
+  if (Cost < -SLPCostThreshold) {
 LLVM_DEBUG(dbgs() << "SLP: Vectorizing list at cost:" << Cost << 
".\n");
 R.getORE()->emit(OptimizationRemark(SV_NAME, "VectorizedList",
cast<Instruction>(Ops[0]))
@@ -7007,11 +7007,12 @@ class HorizontalReduction {
 
 private:
   /// Calculate the cost of a reduction.
-  int getReductionCost(TargetTransformInfo *TTI, Value *FirstReducedVal,
-   unsigned ReduxWidth) {
+  InstructionCost getReductionCost(TargetTransformInfo *TTI,
+   Value *FirstReducedVal,
+   unsigned ReduxWidth) {
 Type *ScalarTy = FirstReducedVal->getType();
 FixedVectorType *VectorTy = FixedVectorType::get(ScalarTy, ReduxWidth);
-int VectorCost, ScalarCost;
+InstructionCost VectorCost, ScalarCost;
 switch (RdxKind) {
 case RecurKind::Add:
 case RecurKind::Mul:





[llvm-branch-commits] [llvm] d196f9e - [InstructionCost] Prevent InstructionCost being created with CostState.

2021-01-25 Thread Sander de Smalen via llvm-branch-commits

Author: Sander de Smalen
Date: 2021-01-25T11:26:56Z
New Revision: d196f9e2fca3ff767aa7d2dcaf4654724a79e18c

URL: 
https://github.com/llvm/llvm-project/commit/d196f9e2fca3ff767aa7d2dcaf4654724a79e18c
DIFF: 
https://github.com/llvm/llvm-project/commit/d196f9e2fca3ff767aa7d2dcaf4654724a79e18c.diff

LOG: [InstructionCost] Prevent InstructionCost being created with CostState.

For a function that returns InstructionCost, it is very tempting to write:

  return InstructionCost::Invalid;

But that actually returns InstructionCost(1 /* int value of Invalid */),
which has a totally different meaning. By marking this constructor as
`delete`, this can no longer happen.
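The pattern can be reproduced in a standalone sketch (illustrative, not the LLVM class itself):

```cpp
#include <cassert>
#include <cstdint>

// Minimal stand-in for the pattern in the patch: without the deleted
// overload, `Cost(Cost::Invalid)` would silently pick Cost(int64_t) via
// the enum's integer value (1) and produce a *valid* cost of 1.
class Cost {
public:
  enum CostState { Valid, Invalid };

private:
  int64_t Value = 0;
  CostState State = Valid;

public:
  Cost() = default;
  Cost(CostState) = delete; // makes `Cost(Cost::Invalid)` a compile error
  Cost(int64_t Val) : Value(Val), State(Valid) {}
  static Cost getInvalid() {
    Cost C;
    C.State = Invalid;
    return C;
  }
  bool isValid() const { return State == Valid; }
  int64_t getValue() const { return Value; }
};
```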

Added: 


Modified: 
llvm/include/llvm/Support/InstructionCost.h

Removed: 




diff  --git a/llvm/include/llvm/Support/InstructionCost.h 
b/llvm/include/llvm/Support/InstructionCost.h
index 725f8495ac09..fbc898b878bb 100644
--- a/llvm/include/llvm/Support/InstructionCost.h
+++ b/llvm/include/llvm/Support/InstructionCost.h
@@ -47,6 +47,7 @@ class InstructionCost {
 public:
   InstructionCost() = default;
 
+  InstructionCost(CostState) = delete;
   InstructionCost(CostType Val) : Value(Val), State(Valid) {}
 
   static InstructionCost getInvalid(CostType Val = 0) {





[llvm-branch-commits] [llvm] c8a914d - [LiveDebugValues] Fix comparison operator in VarLocBasedImpl

2021-01-12 Thread Sander de Smalen via llvm-branch-commits

Author: Sander de Smalen
Date: 2021-01-12T08:44:58Z
New Revision: c8a914db5c60dbeb5b638f30a9915855a67805f7

URL: 
https://github.com/llvm/llvm-project/commit/c8a914db5c60dbeb5b638f30a9915855a67805f7
DIFF: 
https://github.com/llvm/llvm-project/commit/c8a914db5c60dbeb5b638f30a9915855a67805f7.diff

LOG: [LiveDebugValues] Fix comparison operator in VarLocBasedImpl

The issue was introduced in commit rG84a1120943a651184bae507fed5d648fee381ae4
and would cause a VarLoc's StackOffset to be compared with its own, instead of
the StackOffset from the other VarLoc. This patch fixes that.
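A standalone reduction of the bug (field names are illustrative):

```cpp
#include <cassert>
#include <tuple>

// Reduced model of the bug: on the right-hand side of the comparison, two
// fields accidentally came from *this instead of Other, so each location
// was partly compared against itself.
struct Loc {
  unsigned Base;
  int FixedOff;
  int ScalableOff;

  // As before the fix: note FixedOff/ScalableOff, not Other.FixedOff/....
  bool lessBuggy(const Loc &Other) const {
    return std::make_tuple(Base, FixedOff, ScalableOff) <
           std::make_tuple(Other.Base, FixedOff, ScalableOff);
  }

  // As after the fix: every right-hand field comes from Other.
  bool lessFixed(const Loc &Other) const {
    return std::make_tuple(Base, FixedOff, ScalableOff) <
           std::make_tuple(Other.Base, Other.FixedOff, Other.ScalableOff);
  }
};
```

The buggy version only ever distinguishes locations by `Base`, because the offset fields on both sides of the comparison are its own.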

Added: 


Modified: 
llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp

Removed: 




diff  --git a/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp 
b/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
index 4811b8046797..e2daa46fe6b9 100644
--- a/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
+++ b/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
@@ -572,8 +572,9 @@ class VarLocBasedLDV : public LDVImpl {
Expr) <
std::make_tuple(
Other.Var, Other.Kind, Other.Loc.SpillLocation.SpillBase,
-   Loc.SpillLocation.SpillOffset.getFixed(),
-   Loc.SpillLocation.SpillOffset.getScalable(), Other.Expr);
+   Other.Loc.SpillLocation.SpillOffset.getFixed(),
+   Other.Loc.SpillLocation.SpillOffset.getScalable(),
+   Other.Expr);
   case RegisterKind:
   case ImmediateKind:
   case EntryValueKind:





[llvm-branch-commits] [llvm] aa280c9 - [AArch64][SVE] Emit DWARF location expr for SVE (dbg.declare)

2021-01-06 Thread Sander de Smalen via llvm-branch-commits

Author: Sander de Smalen
Date: 2021-01-06T11:45:05Z
New Revision: aa280c99f708dca9dea96bc9070d6194d2622529

URL: 
https://github.com/llvm/llvm-project/commit/aa280c99f708dca9dea96bc9070d6194d2622529
DIFF: 
https://github.com/llvm/llvm-project/commit/aa280c99f708dca9dea96bc9070d6194d2622529.diff

LOG: [AArch64][SVE] Emit DWARF location expr for SVE (dbg.declare)

When using dbg.declare, the debug-info is generated from a list of
locals rather than through DBG_VALUE instructions in the MIR.
This patch is different from D90020 because it emits the DWARF
location expressions from that list of locals directly.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D90044

Added: 
llvm/test/CodeGen/AArch64/debug-info-sve-dbg-declare.mir

Modified: 
llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp

Removed: 




diff  --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp 
b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 02791f2280d2..ea279e4914b0 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -739,11 +739,10 @@ DIE *DwarfCompileUnit::constructVariableDIEImpl(const DbgVariable &DV,
 TFI->getFrameIndexReference(*Asm->MF, Fragment.FI, FrameReg);
 DwarfExpr.addFragmentOffset(Expr);
 
-assert(!Offset.getScalable() &&
-   "Frame offsets with a scalable component are not supported");
-
+auto *TRI = Asm->MF->getSubtarget().getRegisterInfo();
 SmallVector Ops;
-DIExpression::appendOffset(Ops, Offset.getFixed());
+TRI->getOffsetOpcodes(Offset, Ops);
+
 // According to
 // 
https://docs.nvidia.com/cuda/archive/10.0/ptx-writers-guide-to-interoperability/index.html#cuda-specific-dwarf
 // cuda-gdb requires DW_AT_address_class for all variables to be able to

diff  --git a/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-declare.mir 
b/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-declare.mir
new file mode 100644
index ..39b11ef7bfea
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-declare.mir
@@ -0,0 +1,222 @@
+# RUN: llc -o %t -filetype=obj -start-before=prologepilog %s
+# RUN: llvm-dwarfdump --name="z0" %t | FileCheck %s --check-prefix=CHECKZ0
+# RUN: llvm-dwarfdump --name="z1" %t | FileCheck %s --check-prefix=CHECKZ1
+# RUN: llvm-dwarfdump --name="p0" %t | FileCheck %s --check-prefix=CHECKP0
+# RUN: llvm-dwarfdump --name="p1" %t | FileCheck %s --check-prefix=CHECKP1
+# RUN: llvm-dwarfdump --name="localv0" %t | FileCheck %s 
--check-prefix=CHECKLV0
+# RUN: llvm-dwarfdump --name="localv1" %t | FileCheck %s 
--check-prefix=CHECKLV1
+# RUN: llvm-dwarfdump --name="localp0" %t | FileCheck %s 
--check-prefix=CHECKLP0
+# RUN: llvm-dwarfdump --name="localp1" %t | FileCheck %s 
--check-prefix=CHECKLP1
+#
+# CHECKZ0:   DW_AT_location(DW_OP_fbreg +0, DW_OP_lit8, DW_OP_bregx 
VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKZ0-NEXT:  DW_AT_name("z0")
+# CHECKZ1:   DW_AT_location(DW_OP_fbreg +0, DW_OP_lit16, DW_OP_bregx 
VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKZ1-NEXT:  DW_AT_name("z1")
+# CHECKP0:   DW_AT_location(DW_OP_fbreg +0, DW_OP_lit17, DW_OP_bregx 
VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKP0-NEXT:  DW_AT_name("p0")
+# CHECKP1:   DW_AT_location(DW_OP_fbreg +0, DW_OP_lit18, DW_OP_bregx 
VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKP1-NEXT:  DW_AT_name("p1")
+# CHECKLV0:  DW_AT_location(DW_OP_fbreg +0, DW_OP_constu 0x20, 
DW_OP_bregx VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKLV0-NEXT: DW_AT_name("localv0")
+# CHECKLV1:  DW_AT_location(DW_OP_fbreg +0, DW_OP_constu 0x28, 
DW_OP_bregx VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKLV1-NEXT: DW_AT_name("localv1")
+# CHECKLP0:  DW_AT_location(DW_OP_fbreg +0, DW_OP_constu 0x29, 
DW_OP_bregx VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKLP0-NEXT: DW_AT_name("localp0")
+# CHECKLP1:  DW_AT_location(DW_OP_fbreg +0, DW_OP_constu 0x2a, 
DW_OP_bregx VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKLP1-NEXT: DW_AT_name("localp1")
+--- |
+  ; ModuleID = 't.c'
+  source_filename = "t.c"
+  target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+  target triple = "aarch64-unknown-linux-gnu"
+
+  ; Function Attrs: noinline nounwind optnone
+  define dso_local  @foo( %z0,  %z1,  %p0,  %p1, i32 %w0) #0 !dbg 
!11 {
+  entry:
+%z0.addr = alloca , align 16
+%z1.addr = alloca , align 16
+%p0.addr = alloca , align 2
+%p1.addr = alloca , align 2
+%w0.addr = alloca i32, align 4
+%local_gpr0 = alloca i32, align 4
+%localv0 = alloca , align 16
+%localv1 = alloca , align 16
+%localp0 = alloca , align 2
+%localp1 = alloca , align 2
+store  %z0, * %z0.addr, align 16
+call void @llvm.dbg.declare(metadata * %z0.addr, 
metadata !29, metadata !DIExpression()), !dbg !30
+store  %z1, * %z1.addr, align 16
+call 

[llvm-branch-commits] [llvm] 84a1120 - [LiveDebugValues] Handle spill locations with a fixed and scalable component.

2021-01-06 Thread Sander de Smalen via llvm-branch-commits

Author: Sander de Smalen
Date: 2021-01-06T11:30:13Z
New Revision: 84a1120943a651184bae507fed5d648fee381ae4

URL: 
https://github.com/llvm/llvm-project/commit/84a1120943a651184bae507fed5d648fee381ae4
DIFF: 
https://github.com/llvm/llvm-project/commit/84a1120943a651184bae507fed5d648fee381ae4.diff

LOG: [LiveDebugValues] Handle spill locations with a fixed and scalable 
component.

This patch fixes the two LiveDebugValues implementations
(InstrRef/VarLoc)Based to handle cases where the StackOffset contains
both a fixed and scalable component.

This depends on the `TargetRegisterInfo::prependOffsetExpression` being
added in D90020. Feel free to leave comments on that patch if you have them.
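A sketch of why the patch decomposes the offset for ordering (illustrative types, assuming the real StackOffset defines no operator< of its own):

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Sketch of a two-component stack offset: a fixed byte part plus a part
// scaled by the runtime vector length, as used for SVE stack objects.
struct StackOffset {
  int64_t Fixed = 0;
  int64_t Scalable = 0;
};

struct SpillLoc {
  unsigned SpillBase;
  StackOffset SpillOffset;
  // StackOffset has no operator<, so order by its two components
  // lexicographically, mirroring the std::make_tuple comparison above.
  bool operator<(const SpillLoc &Other) const {
    return std::make_tuple(SpillBase, SpillOffset.Fixed,
                           SpillOffset.Scalable) <
           std::make_tuple(Other.SpillBase, Other.SpillOffset.Fixed,
                           Other.SpillOffset.Scalable);
  }
};
```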

Reviewed By: djtodoro, jmorse

Differential Revision: https://reviews.llvm.org/D90046

Added: 
llvm/test/CodeGen/AArch64/live-debugvalues-sve.mir

Modified: 
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp

Removed: 




diff  --git a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp 
b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
index 04ead18cc3de2..b6f46daf8bba9 100644
--- a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
+++ b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
@@ -182,6 +182,7 @@
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/Debug.h"
+#include "llvm/Support/TypeSize.h"
 #include "llvm/Support/raw_ostream.h"
 #include 
 #include 
@@ -221,14 +222,16 @@ namespace {
 // an offset.
 struct SpillLoc {
   unsigned SpillBase;
-  int SpillOffset;
+  StackOffset SpillOffset;
   bool operator==(const SpillLoc &Other) const {
-return std::tie(SpillBase, SpillOffset) ==
-   std::tie(Other.SpillBase, Other.SpillOffset);
+return std::make_pair(SpillBase, SpillOffset) ==
+   std::make_pair(Other.SpillBase, Other.SpillOffset);
   }
   bool operator<(const SpillLoc &Other) const {
-return std::tie(SpillBase, SpillOffset) <
-   std::tie(Other.SpillBase, Other.SpillOffset);
+return std::make_tuple(SpillBase, SpillOffset.getFixed(),
+SpillOffset.getScalable()) <
+   std::make_tuple(Other.SpillBase, Other.SpillOffset.getFixed(),
+Other.SpillOffset.getScalable());
   }
 };
 
@@ -769,8 +772,10 @@ class MLocTracker {
 } else if (LocIdxToLocID[*MLoc] >= NumRegs) {
   unsigned LocID = LocIdxToLocID[*MLoc];
   const SpillLoc  = SpillLocs[LocID - NumRegs + 1];
-  Expr = DIExpression::prepend(Expr, DIExpression::ApplyOffset,
-   Spill.SpillOffset);
+
+  auto *TRI = MF.getSubtarget().getRegisterInfo();
+  Expr = TRI->prependOffsetExpression(Expr, DIExpression::ApplyOffset,
+  Spill.SpillOffset);
   unsigned Base = Spill.SpillBase;
   MIB.addReg(Base, RegState::Debug);
   MIB.addImm(0);
@@ -1579,9 +1584,7 @@ InstrRefBasedLDV::extractSpillBaseRegAndOffset(const 
MachineInstr ) {
   const MachineBasicBlock *MBB = MI.getParent();
   Register Reg;
   StackOffset Offset = TFI->getFrameIndexReference(*MBB->getParent(), FI, Reg);
-  assert(!Offset.getScalable() &&
- "Frame offsets with a scalable component are not supported");
-  return {Reg, static_cast(Offset.getFixed())};
+  return {Reg, Offset};
 }
 
 /// End all previous ranges related to @MI and start a new range from @MI

diff  --git a/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp 
b/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
index ed7f04e571acc..4811b80467973 100644
--- a/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
+++ b/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
@@ -145,6 +145,7 @@
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/Debug.h"
+#include "llvm/Support/TypeSize.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Target/TargetMachine.h"
 #include 
@@ -292,7 +293,7 @@ class VarLocBasedLDV : public LDVImpl {
 // register and an offset.
 struct SpillLoc {
   unsigned SpillBase;
-  int SpillOffset;
+  StackOffset SpillOffset;
   bool operator==(const SpillLoc &Other) const {
 return SpillBase == Other.SpillBase && SpillOffset == 
Other.SpillOffset;
   }
@@ -323,21 +324,20 @@ class VarLocBasedLDV : public LDVImpl {
 
 /// The value location. Stored separately to avoid repeatedly
 /// extracting it from MI.
-union {
+union LocUnion {
   uint64_t RegNo;
   SpillLoc SpillLocation;
   uint64_t Hash;
   int64_t Immediate;
   const ConstantFP *FPImm;
   const ConstantInt *CImm;
+  LocUnion() : Hash(0) {}
 } Loc;
 
 VarLoc(const MachineInstr , LexicalScopes )
 : Var(MI.getDebugVariable(), MI.getDebugExpression(),
   MI.getDebugLoc()->getInlinedAt()),
   

[llvm-branch-commits] [llvm] e4cda13 - Fix test failure in a7e3339f3b0eb71e43d44e6f59cc8db6a7b110bf

2021-01-06 Thread Sander de Smalen via llvm-branch-commits

Author: Sander de Smalen
Date: 2021-01-06T10:43:48Z
New Revision: e4cda13d5a54a8c6366e4ca82d74265e68bbb3f5

URL: 
https://github.com/llvm/llvm-project/commit/e4cda13d5a54a8c6366e4ca82d74265e68bbb3f5
DIFF: 
https://github.com/llvm/llvm-project/commit/e4cda13d5a54a8c6366e4ca82d74265e68bbb3f5.diff

LOG: Fix test failure in a7e3339f3b0eb71e43d44e6f59cc8db6a7b110bf

Set the target-triple to aarch64 in debug-info-sve-dbg-value.mir
to avoid "'+sve' is not a recognized feature for this target"
diagnostic.

Added: 


Modified: 
llvm/test/CodeGen/AArch64/debug-info-sve-dbg-value.mir

Removed: 




diff  --git a/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-value.mir 
b/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-value.mir
index ffce40c9c4f4..84d34ce3d2ac 100644
--- a/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-value.mir
+++ b/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-value.mir
@@ -27,6 +27,7 @@
 --- |
   ; ModuleID = 'bla.mir'
   source_filename = "bla.mir"
+  target triple = "aarch64-unknown-linux-gnu"
   target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
 
   define void @foo() #0 !dbg !5 {





[llvm-branch-commits] [llvm] a7e3339 - [AArch64][SVE] Emit DWARF location expression for SVE stack objects.

2021-01-06 Thread Sander de Smalen via llvm-branch-commits

Author: Sander de Smalen
Date: 2021-01-06T09:40:53Z
New Revision: a7e3339f3b0eb71e43d44e6f59cc8db6a7b110bf

URL: 
https://github.com/llvm/llvm-project/commit/a7e3339f3b0eb71e43d44e6f59cc8db6a7b110bf
DIFF: 
https://github.com/llvm/llvm-project/commit/a7e3339f3b0eb71e43d44e6f59cc8db6a7b110bf.diff

LOG: [AArch64][SVE] Emit DWARF location expression for SVE stack objects.

Extend PEI to emit a DWARF expression for StackOffsets that have
a fixed and scalable component. This means the expression that needs
to be added is either:
   <base> + offset
or:
   <base> + offset + scalable_offset * scalereg

where, for SVE, the scale register is the Vector Granule DWARF register,
which encodes the number of 64-bit 'granules' in an SVE vector and which
the debugger can evaluate at runtime.
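The address computation such an expression encodes can be sketched as follows (assuming, as in the commit message, that the scalable component is multiplied by VG directly to yield bytes):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the address computation the emitted DWARF expression encodes:
//   address = base + fixed_offset + scalable_offset * VG
// where VG (the Vector Granule register) holds the number of 64-bit
// granules in an SVE vector and is only known at runtime.
int64_t computeAddress(int64_t Base, int64_t FixedOffset,
                       int64_t ScalableOffset, int64_t VG) {
  return Base + FixedOffset + ScalableOffset * VG;
}
```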

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D90020

Added: 
llvm/test/CodeGen/AArch64/debug-info-sve-dbg-value.mir

Modified: 
llvm/include/llvm/CodeGen/TargetRegisterInfo.h
llvm/lib/CodeGen/PrologEpilogInserter.cpp
llvm/lib/CodeGen/TargetRegisterInfo.cpp
llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
llvm/lib/Target/AArch64/AArch64RegisterInfo.h

Removed: 




diff  --git a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h 
b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
index de2c1b069784..6f32729a1e83 100644
--- a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
@@ -34,6 +34,7 @@
 namespace llvm {
 
 class BitVector;
+class DIExpression;
 class LiveRegMatrix;
 class MachineFunction;
 class MachineInstr;
@@ -923,6 +924,15 @@ class TargetRegisterInfo : public MCRegisterInfo {
 llvm_unreachable("isFrameOffsetLegal does not exist on this target");
   }
 
+  /// Gets the DWARF expression opcodes for \p Offset.
+  virtual void getOffsetOpcodes(const StackOffset &Offset,
+                                SmallVectorImpl<uint64_t> &Ops) const;
+
+  /// Prepends a DWARF expression for \p Offset to DIExpression \p Expr.
+  DIExpression *
+  prependOffsetExpression(const DIExpression *Expr, unsigned PrependFlags,
+                          const StackOffset &Offset) const;
+
   /// Spill the register so it can be used by the register scavenger.
   /// Return true if the register was spilled, false otherwise.
   /// If this function does not spill the register, the scavenger

diff  --git a/llvm/lib/CodeGen/PrologEpilogInserter.cpp 
b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
index 7c38b193a980..65b2165bf2a0 100644
--- a/llvm/lib/CodeGen/PrologEpilogInserter.cpp
+++ b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
@@ -1211,8 +1211,6 @@ void PEI::replaceFrameIndices(MachineBasicBlock *BB, 
MachineFunction ,
 
 StackOffset Offset =
 TFI->getFrameIndexReference(MF, FrameIdx, Reg);
-assert(!Offset.getScalable() &&
-   "Frame offsets with a scalable component are not supported");
 MI.getOperand(0).ChangeToRegister(Reg, false /*isDef*/);
 MI.getOperand(0).setIsDebug();
 
@@ -1238,7 +1236,8 @@ void PEI::replaceFrameIndices(MachineBasicBlock *BB, 
MachineFunction ,
   // Make the DBG_VALUE direct.
   MI.getDebugOffset().ChangeToRegister(0, false);
 }
-DIExpr = DIExpression::prepend(DIExpr, PrependFlags, 
Offset.getFixed());
+
+DIExpr = TRI.prependOffsetExpression(DIExpr, PrependFlags, Offset);
 MI.getDebugExpressionOp().setMetadata(DIExpr);
 continue;
   }

diff  --git a/llvm/lib/CodeGen/TargetRegisterInfo.cpp 
b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
index e89353c9ad27..4a190c9f50af 100644
--- a/llvm/lib/CodeGen/TargetRegisterInfo.cpp
+++ b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
@@ -26,6 +26,7 @@
 #include "llvm/CodeGen/VirtRegMap.h"
 #include "llvm/Config/llvm-config.h"
 #include "llvm/IR/Attributes.h"
+#include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/IR/Function.h"
 #include "llvm/MC/MCRegisterInfo.h"
 #include "llvm/Support/CommandLine.h"
@@ -532,6 +533,31 @@ TargetRegisterInfo::lookThruCopyLike(Register SrcReg,
   }
 }
 
+void TargetRegisterInfo::getOffsetOpcodes(
+    const StackOffset &Offset, SmallVectorImpl<uint64_t> &Ops) const {
+  assert(!Offset.getScalable() && "Scalable offsets are not handled");
+  DIExpression::appendOffset(Ops, Offset.getFixed());
+}
+
+DIExpression *
+TargetRegisterInfo::prependOffsetExpression(const DIExpression *Expr,
+unsigned PrependFlags,
+                                            const StackOffset &Offset) const {
+  assert((PrependFlags &
+  ~(DIExpression::DerefBefore | DIExpression::DerefAfter |
+DIExpression::StackValue | DIExpression::EntryValue)) == 0 &&
+ "Unsupported prepend flag");
+  SmallVector OffsetExpr;
+  if (PrependFlags & DIExpression::DerefBefore)
+OffsetExpr.push_back(dwarf::DW_OP_deref);
+  getOffsetOpcodes(Offset, OffsetExpr);
+  if (PrependFlags & DIExpression::DerefAfter)
+

[llvm-branch-commits] [llvm] a9f5e43 - [AArch64] Use faddp to implement fadd reductions.

2021-01-06 Thread Sander de Smalen via llvm-branch-commits

Author: Sander de Smalen
Date: 2021-01-06T09:36:51Z
New Revision: a9f5e4375b36e5316b8d6f9731be6bfa5a70e276

URL: 
https://github.com/llvm/llvm-project/commit/a9f5e4375b36e5316b8d6f9731be6bfa5a70e276
DIFF: 
https://github.com/llvm/llvm-project/commit/a9f5e4375b36e5316b8d6f9731be6bfa5a70e276.diff

LOG: [AArch64] Use faddp to implement fadd reductions.

Custom-expand legal VECREDUCE_FADD SDNodes
to benefit from pair-wise faddp instructions.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D59259

Added: 


Modified: 
llvm/include/llvm/Target/TargetSelectionDAG.td
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/lib/Target/AArch64/AArch64InstrInfo.td
llvm/test/CodeGen/AArch64/vecreduce-fadd-legalization.ll
llvm/test/CodeGen/AArch64/vecreduce-fadd.ll

Removed: 




diff  --git a/llvm/include/llvm/Target/TargetSelectionDAG.td 
b/llvm/include/llvm/Target/TargetSelectionDAG.td
index d5b8aeb1055d..0c6eef939ea4 100644
--- a/llvm/include/llvm/Target/TargetSelectionDAG.td
+++ b/llvm/include/llvm/Target/TargetSelectionDAG.td
@@ -250,6 +250,10 @@ def SDTVecInsert : SDTypeProfile<1, 3, [// vector 
insert
 def SDTVecReduce : SDTypeProfile<1, 1, [// vector reduction
   SDTCisInt<0>, SDTCisVec<1>
 ]>;
+def SDTFPVecReduce : SDTypeProfile<1, 1, [  // FP vector reduction
+  SDTCisFP<0>, SDTCisVec<1>
+]>;
+
 
 def SDTSubVecExtract : SDTypeProfile<1, 2, [// subvector extract
   SDTCisSubVecOfVec<0,1>, SDTCisInt<2>
@@ -439,6 +443,7 @@ def vecreduce_smax  : SDNode<"ISD::VECREDUCE_SMAX", 
SDTVecReduce>;
 def vecreduce_umax  : SDNode<"ISD::VECREDUCE_UMAX", SDTVecReduce>;
 def vecreduce_smin  : SDNode<"ISD::VECREDUCE_SMIN", SDTVecReduce>;
 def vecreduce_umin  : SDNode<"ISD::VECREDUCE_UMIN", SDTVecReduce>;
+def vecreduce_fadd  : SDNode<"ISD::VECREDUCE_FADD", SDTFPVecReduce>;
 
 def fadd   : SDNode<"ISD::FADD"   , SDTFPBinOp, [SDNPCommutative]>;
 def fsub   : SDNode<"ISD::FSUB"   , SDTFPBinOp>;

diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index faed7c64a15e..2b9dc84a06cc 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -999,6 +999,9 @@ AArch64TargetLowering::AArch64TargetLowering(const 
TargetMachine &TM,
 MVT::v8f16, MVT::v4f32, MVT::v2f64 }) {
   setOperationAction(ISD::VECREDUCE_FMAX, VT, Custom);
   setOperationAction(ISD::VECREDUCE_FMIN, VT, Custom);
+
+  if (VT.getVectorElementType() != MVT::f16 || Subtarget->hasFullFP16())
+setOperationAction(ISD::VECREDUCE_FADD, VT, Legal);
 }
 for (MVT VT : { MVT::v8i8, MVT::v4i16, MVT::v2i32,
 MVT::v16i8, MVT::v8i16, MVT::v4i32 }) {

diff  --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td 
b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index 4d70fb334828..7e9f2fb95188 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -4989,6 +4989,26 @@ defm FMAXNMP : SIMDFPPairwiseScalar<0, 0b01100, 
"fmaxnmp">;
 defm FMAXP   : SIMDFPPairwiseScalar<0, 0b01111, "fmaxp">;
 defm FMINNMP : SIMDFPPairwiseScalar<1, 0b01100, "fminnmp">;
 defm FMINP   : SIMDFPPairwiseScalar<1, 0b01111, "fminp">;
+
+let Predicates = [HasFullFP16] in {
+def : Pat<(f16 (vecreduce_fadd (v8f16 V128:$Rn))),
+(FADDPv2i16p
+  (EXTRACT_SUBREG
+ (FADDPv8f16 (FADDPv8f16 V128:$Rn, (v8f16 (IMPLICIT_DEF))), 
(v8f16 (IMPLICIT_DEF))),
+   dsub))>;
+def : Pat<(f16 (vecreduce_fadd (v4f16 V64:$Rn))),
+  (FADDPv2i16p (FADDPv4f16 V64:$Rn, (v4f16 (IMPLICIT_DEF))))>;
+}
+def : Pat<(f32 (vecreduce_fadd (v4f32 V128:$Rn))),
+  (FADDPv2i32p
+(EXTRACT_SUBREG
+  (FADDPv4f32 V128:$Rn, (v4f32 (IMPLICIT_DEF))),
+ dsub))>;
+def : Pat<(f32 (vecreduce_fadd (v2f32 V64:$Rn))),
+  (FADDPv2i32p V64:$Rn)>;
+def : Pat<(f64 (vecreduce_fadd (v2f64 V128:$Rn))),
+  (FADDPv2i64p V128:$Rn)>;
+
 def : Pat<(v2i64 (AArch64saddv V128:$Rn)),
   (INSERT_SUBREG (v2i64 (IMPLICIT_DEF)), (ADDPv2i64p V128:$Rn), dsub)>;
 def : Pat<(v2i64 (AArch64uaddv V128:$Rn)),

diff  --git a/llvm/test/CodeGen/AArch64/vecreduce-fadd-legalization.ll 
b/llvm/test/CodeGen/AArch64/vecreduce-fadd-legalization.ll
index 69b9c3e22d7a..2b5fcd4b839a 100644
--- a/llvm/test/CodeGen/AArch64/vecreduce-fadd-legalization.ll
+++ b/llvm/test/CodeGen/AArch64/vecreduce-fadd-legalization.ll
@@ -51,8 +51,7 @@ define float @test_v3f32(<3 x float> %a) nounwind {
 ; CHECK-NEXT:mov w8, #-2147483648
 ; CHECK-NEXT:fmov s1, w8
 ; CHECK-NEXT:mov v0.s[3], v1.s[0]
-; CHECK-NEXT:ext v1.16b, v0.16b, v0.16b, #8
-; CHECK-NEXT:fadd v0.2s, v0.2s, v1.2s
+; CHECK-NEXT:faddp v0.4s, v0.4s, v0.4s
 ; CHECK-NEXT:faddp s0, v0.2s
 ; CHECK-NEXT:ret
   %b = call reassoc float 

[llvm-branch-commits] [llvm] d568cff - [LoopVectorizer][SVE] Vectorize a simple loop with a scalable VF.

2020-12-09 Thread Sander de Smalen via llvm-branch-commits

Author: Sander de Smalen
Date: 2020-12-09T11:25:21Z
New Revision: d568cff696e8fb89ce1b040561c037412767af60

URL: 
https://github.com/llvm/llvm-project/commit/d568cff696e8fb89ce1b040561c037412767af60
DIFF: 
https://github.com/llvm/llvm-project/commit/d568cff696e8fb89ce1b040561c037412767af60.diff

LOG: [LoopVectorizer][SVE] Vectorize a simple loop with a scalable VF.

* Steps are scaled by `vscale`, a runtime value.
* Changes to circumvent the cost-model for now (temporary)
  so that the cost-model can be implemented separately.

This can vectorize the following loop [1]:

   void loop(int N, double *a, double *b) {
 #pragma clang loop vectorize_width(4, scalable)
 for (int i = 0; i < N; i++) {
   a[i] = b[i] + 1.0;
 }
   }

[1] This source-level example is based on the pragma proposed
separately in D89031. This patch only implements the LLVM part.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D91077

Added: 

llvm/test/Transforms/LoopVectorize/scalable-loop-unpredicated-body-scalar-tail.ll

Modified: 
llvm/include/llvm/IR/IRBuilder.h
llvm/lib/IR/IRBuilder.cpp
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
llvm/lib/Transforms/Vectorize/VPlan.h
llvm/test/Transforms/LoopVectorize/metadata-width.ll

Removed: 




diff  --git a/llvm/include/llvm/IR/IRBuilder.h 
b/llvm/include/llvm/IR/IRBuilder.h
index db215094a7e49..c2b3446d159f2 100644
--- a/llvm/include/llvm/IR/IRBuilder.h
+++ b/llvm/include/llvm/IR/IRBuilder.h
@@ -879,6 +879,10 @@ class IRBuilderBase {
  Type *ResultType,
const Twine &Name = "");
 
+  /// Create a call to llvm.vscale, multiplied by \p Scaling. The type of 
VScale
+  /// will be the same type as that of \p Scaling.
+  Value *CreateVScale(Constant *Scaling, const Twine &Name = "");
+
   /// Create a call to intrinsic \p ID with 1 operand which is mangled on its
   /// type.
   CallInst *CreateUnaryIntrinsic(Intrinsic::ID ID, Value *V,

diff  --git a/llvm/lib/IR/IRBuilder.cpp b/llvm/lib/IR/IRBuilder.cpp
index c0e4451f52003..f936f5756b6f0 100644
--- a/llvm/lib/IR/IRBuilder.cpp
+++ b/llvm/lib/IR/IRBuilder.cpp
@@ -80,6 +80,17 @@ static CallInst *createCallHelper(Function *Callee, 
ArrayRef<Value *> Ops,
   return CI;
 }
 
+Value *IRBuilderBase::CreateVScale(Constant *Scaling, const Twine &Name) {
+  Module *M = GetInsertBlock()->getParent()->getParent();
+  assert(isa<ConstantInt>(Scaling) && "Expected constant integer");
+  Function *TheFn =
+  Intrinsic::getDeclaration(M, Intrinsic::vscale, {Scaling->getType()});
+  CallInst *CI = createCallHelper(TheFn, {}, this, Name);
+  return cast<ConstantInt>(Scaling)->getSExtValue() == 1
+ ? CI
+ : CreateMul(CI, Scaling);
+}
+
 CallInst *IRBuilderBase::CreateMemSet(Value *Ptr, Value *Val, Value *Size,
   MaybeAlign Align, bool isVolatile,
   MDNode *TBAATag, MDNode *ScopeTag,

diff  --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp 
b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index f504afd1ffc41..a91fb988badf6 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1121,6 +1121,15 @@ static OptimizationRemarkAnalysis createLVAnalysis(const 
char *PassName,
   return R;
 }
 
+/// Return a value for Step multiplied by VF.
+static Value *createStepForVF(IRBuilder<> &B, Constant *Step, ElementCount VF)
{
+  assert(isa<ConstantInt>(Step) && "Expected an integer step");
+  Constant *StepVal = ConstantInt::get(
+  Step->getType(),
      cast<ConstantInt>(Step)->getSExtValue() * VF.getKnownMinValue());
+  return VF.isScalable() ? B.CreateVScale(StepVal) : StepVal;
+}
+
 namespace llvm {
 
 void reportVectorizationFailure(const StringRef DebugMsg,
@@ -2277,8 +2286,6 @@ void InnerLoopVectorizer::buildScalarSteps(Value 
*ScalarIV, Value *Step,
const InductionDescriptor &ID) {
   // We shouldn't have to build scalar steps if we aren't vectorizing.
   assert(VF.isVector() && "VF should be greater than one");
-  assert(!VF.isScalable() &&
- "the code below assumes a fixed number of elements at compile time");
   // Get the value type and ensure it and the step have the same integer type.
   Type *ScalarIVTy = ScalarIV->getType()->getScalarType();
   assert(ScalarIVTy == Step->getType() &&
@@ -2303,11 +2310,24 @@ void InnerLoopVectorizer::buildScalarSteps(Value 
*ScalarIV, Value *Step,
  Cost->isUniformAfterVectorization(cast<Instruction>(EntryVal), VF)
   ? 1
   : VF.getKnownMinValue();
+  assert((!VF.isScalable() || Lanes == 1) &&
+ "Should never scalarize a scalable vector");
   // Compute the scalar steps and save the results in VectorLoopValueMap.
   for (unsigned Part = 0; Part < UF; ++Part) {
 for (unsigned Lane = 0; Lane < Lanes; ++Lane) {
-  auto *StartIdx = 

[llvm-branch-commits] [llvm] adc3714 - [LoopVectorizer] NFC: Remove unnecessary asserts that VF cannot be scalable.

2020-12-09 Thread Sander de Smalen via llvm-branch-commits

Author: Sander de Smalen
Date: 2020-12-09T11:25:21Z
New Revision: adc37145dec9cadf76af05326150ed22a3cc2fdd

URL: 
https://github.com/llvm/llvm-project/commit/adc37145dec9cadf76af05326150ed22a3cc2fdd
DIFF: 
https://github.com/llvm/llvm-project/commit/adc37145dec9cadf76af05326150ed22a3cc2fdd.diff

LOG: [LoopVectorizer] NFC: Remove unnecessary asserts that VF cannot be 
scalable.

This patch removes a number of asserts that VF is not scalable, even though
the code where these asserts live does nothing that prevents VF from being scalable.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D91060

Added: 


Modified: 
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp 
b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 6ba14e942ff8..f504afd1ffc4 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -367,7 +367,6 @@ static Type *getMemInstValueType(Value *I) {
 /// type is irregular if its allocated size doesn't equal the store size of an
 /// element of the corresponding vector type at the given vectorization factor.
 static bool hasIrregularType(Type *Ty, const DataLayout &DL, ElementCount VF) {
-  assert(!VF.isScalable() && "scalable vectors not yet supported.");
   // Determine if an array of VF elements of type Ty is "bitcast compatible"
   // with a <VF x Ty> vector.
   if (VF.isVector()) {
@@ -1387,9 +1386,7 @@ class LoopVectorizationCostModel {
   /// width \p VF. Return CM_Unknown if this instruction did not pass
   /// through the cost modeling.
   InstWidening getWideningDecision(Instruction *I, ElementCount VF) {
-assert(!VF.isScalable() && "scalable vectors not yet supported.");
-assert(VF.isVector() && "Expected VF >=2");
-
+assert(VF.isVector() && "Expected VF to be a vector VF");
 // Cost model is not run in the VPlan-native path - return conservative
 // result until this changes.
 if (EnableVPlanNativePath)
@@ -3902,8 +3899,10 @@ void InnerLoopVectorizer::fixVectorizedLoop() {
   // profile is not inherently precise anyway. Note also possible bypass of
   // vector code caused by legality checks is ignored, assigning all the weight
   // to the vector loop, optimistically.
-  assert(!VF.isScalable() &&
- "cannot use scalable ElementCount to determine unroll factor");
+  //
+  // For scalable vectorization we can't know at compile time how many 
iterations
+  // of the loop are handled in one vector iteration, so instead assume a 
pessimistic
+  // vscale of '1'.
   setProfileInfoAfterUnrolling(
   LI->getLoopFor(LoopScalarBody), LI->getLoopFor(LoopVectorBody),
   LI->getLoopFor(LoopScalarBody), VF.getKnownMinValue() * UF);
@@ -4709,7 +4708,6 @@ static bool mayDivideByZero(Instruction &I) {
 void InnerLoopVectorizer::widenInstruction(Instruction &I, VPValue *Def,
                                            VPUser &User,
                                            VPTransformState &State) {
-  assert(!VF.isScalable() && "scalable vectors not yet supported.");
   switch (I.getOpcode()) {
   case Instruction::Call:
   case Instruction::Br:
@@ -4797,7 +4795,6 @@ void InnerLoopVectorizer::widenInstruction(Instruction 
&I, VPValue *Def,
 setDebugLocFromInst(Builder, CI);
 
 /// Vectorize casts.
-assert(!VF.isScalable() && "VF is assumed to be non scalable.");
 Type *DestTy =
 (VF.isScalar()) ? CI->getType() : VectorType::get(CI->getType(), VF);
 
@@ -5099,7 +5096,6 @@ void 
LoopVectorizationCostModel::collectLoopScalars(ElementCount VF) {
 
 bool LoopVectorizationCostModel::isScalarWithPredication(Instruction *I,
  ElementCount VF) {
-  assert(!VF.isScalable() && "scalable vectors not yet supported.");
   if (!blockNeedsPredication(I->getParent()))
 return false;
   switch(I->getOpcode()) {
@@ -6420,7 +6416,6 @@ int LoopVectorizationCostModel::computePredInstDiscount(
 
 LoopVectorizationCostModel::VectorizationCostTy
 LoopVectorizationCostModel::expectedCost(ElementCount VF) {
-  assert(!VF.isScalable() && "scalable vectors not yet supported.");
   VectorizationCostTy Cost;
 
   // For each block.
@@ -7935,7 +7930,6 @@ VPRecipeBuilder::tryToWidenMemory(Instruction *I, VFRange 
&Range,
  "Must be called with either a load or store");
 
   auto willWiden = [&](ElementCount VF) -> bool {
-assert(!VF.isScalable() && "unexpected scalable ElementCount");
 if (VF.isScalar())
   return false;
 LoopVectorizationCostModel::InstWidening Decision =



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits