[llvm-branch-commits] [clang] release/19.x: [Clang] Demote always_inline error to warning for mismatching SME attrs (#100740) (PR #100987)
https://github.com/sdesmalen-arm requested changes to this pull request.

For some odd reason, `clang/test/CodeGen/aarch64-sme-inline-streaming-attrs.c` seems to be failing on some buildbots with an error that says:

> unable to create target: No available targets are compatible with triple "aarch64-none-linux-gnu"

I suspect this is because the test is missing a `REQUIRES: aarch64-registered-target` line, but I'm puzzled why the test didn't fail before: my patch doesn't introduce this test and doesn't change its RUN lines; all it does is change one of the diagnostic messages.

In any case, I seem to have jumped the gun creating this cherry-pick in the first place; I thought the change was trivial enough, especially after testing it locally. My apologies for the noise. I'll revert the patch, fix the issue, and then create another cherry-pick.

https://github.com/llvm/llvm-project/pull/100987

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] release/19.x: [AArch64][SME] Rewrite __arm_get_current_vg to preserve required registers (#100143) (PR #100546)
sdesmalen-arm wrote:

It would be great if we could merge this fix into the release branch!

https://github.com/llvm/llvm-project/pull/100546
[llvm-branch-commits] [llvm] [AArch64] Improve cost model for legal subvec insert/extract (PR #81135)
@@ -568,6 +568,48 @@ AArch64TTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
     }
     return Cost;
   }
+  case Intrinsic::vector_extract: {
+    // If both the vector argument and the return type are legal types and the
+    // index is 0, then this should be a no-op or simple operation; return a
+    // relatively low cost.
+
+    // If arguments aren't actually supplied, then we cannot determine the
+    // value of the index.
+    if (ICA.getArgs().size() < 2)
+      break;
+    LLVMContext &C = RetTy->getContext();
+    EVT MRTy = getTLI()->getValueType(DL, RetTy);
+    EVT MPTy = getTLI()->getValueType(DL, ICA.getArgTypes()[0]);
+    TargetLoweringBase::LegalizeKind RLK = getTLI()->getTypeConversion(C, MRTy);
+    TargetLoweringBase::LegalizeKind PLK = getTLI()->getTypeConversion(C, MPTy);
+    const ConstantInt *Idx = dyn_cast<ConstantInt>(ICA.getArgs()[1]);
+    if (RLK.first == TargetLoweringBase::TypeLegal &&
+        PLK.first == TargetLoweringBase::TypeLegal && Idx &&
+        Idx->getZExtValue() == 0)
+      return InstructionCost(1);

sdesmalen-arm wrote:

Is there a reason this wouldn't be zero-cost? Also, stylistically, to match the rest of this file, maybe return `TTI::TCC_Free` (if this is considered a cost of 0) or `TTI::TCC_Basic` (if this is considered a cost of 1) instead?

https://github.com/llvm/llvm-project/pull/81135
[llvm-branch-commits] [llvm] [AArch64] Improve cost model for legal subvec insert/extract (PR #81135)
@@ -568,6 +568,48 @@ AArch64TTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
     }
     return Cost;
   }
+  case Intrinsic::vector_extract: {
+    // If both the vector argument and the return type are legal types and the
+    // index is 0, then this should be a no-op or simple operation; return a
+    // relatively low cost.
+
+    // If arguments aren't actually supplied, then we cannot determine the
+    // value of the index.
+    if (ICA.getArgs().size() < 2)

sdesmalen-arm wrote:

nit:
```suggestion
    if (ICA.getArgs().size() != 2)
```

https://github.com/llvm/llvm-project/pull/81135
[llvm-branch-commits] [llvm] [AArch64] Improve cost model for legal subvec insert/extract (PR #81135)
@@ -568,6 +568,32 @@ AArch64TTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
     }
     return Cost;
   }
+  case Intrinsic::vector_extract: {
+    // If both the vector argument and the return type are legal types, then
+    // this should be a no-op or simple operation; return a relatively low cost.
+    LLVMContext &C = RetTy->getContext();
+    EVT MRTy = getTLI()->getValueType(DL, RetTy);
+    EVT MPTy = getTLI()->getValueType(DL, ICA.getArgTypes()[0]);
+    TargetLoweringBase::LegalizeKind RLK = getTLI()->getTypeConversion(C, MRTy);
+    TargetLoweringBase::LegalizeKind PLK = getTLI()->getTypeConversion(C, MPTy);
+    if (RLK.first == TargetLoweringBase::TypeLegal &&
+        PLK.first == TargetLoweringBase::TypeLegal)
+      return InstructionCost(1);

sdesmalen-arm wrote:

Just pointing out that the code isn't updated yet to handle predicates differently, as those inserts/extracts are indeed not free.

https://github.com/llvm/llvm-project/pull/81135
[llvm-branch-commits] [clang] [llvm] release/18.x: [AArch64][SME] Implement inline-asm clobbers for za/zt0 (#79276) (PR #81616)
https://github.com/sdesmalen-arm approved this pull request.

Looks pretty low-risk to me and would be nice to get into the release if we can. (How is this PR different from #81593?)

https://github.com/llvm/llvm-project/pull/81616
[llvm-branch-commits] [clang] [llvm] release/18.x: [AArch64][SME] Implement inline-asm clobbers for za/zt0 (#79276) (PR #81593)
https://github.com/sdesmalen-arm approved this pull request.

Looks pretty low-risk to me and would be nice to get into the release if we can.

https://github.com/llvm/llvm-project/pull/81593
[llvm-branch-commits] [llvm] 329fda3 - NFC: Mention auto-vec support for SVE in release notes.
Author: Sander de Smalen
Date: 2022-03-14T09:44:55Z
New Revision: 329fda39c507e8740978d10458451dcdb21563be

URL: https://github.com/llvm/llvm-project/commit/329fda39c507e8740978d10458451dcdb21563be
DIFF: https://github.com/llvm/llvm-project/commit/329fda39c507e8740978d10458451dcdb21563be.diff

LOG: NFC: Mention auto-vec support for SVE in release notes.

Added:
    

Modified:
    llvm/docs/ReleaseNotes.rst

Removed:
    

diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index e8934f79181a7..b2d8d8c2640e2 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -82,6 +82,7 @@ Changes to the AArch64 Backend
   "tune-cpu" attribute is absent it tunes according to the "target-cpu".
 * Fixed relocations against temporary symbols (e.g. in jump tables and constant
   pools) in large COFF object files.
+* Auto-vectorization now targets SVE by default when available.

 Changes to the ARM Backend
[llvm-branch-commits] [llvm] 171d124 - [SLPVectorizer] NFC: Migrate getVectorCallCosts to use InstructionCost.
Author: Sander de Smalen
Date: 2021-01-25T12:27:01Z
New Revision: 171d12489f20818e292362342b5665c689073ad2

URL: https://github.com/llvm/llvm-project/commit/171d12489f20818e292362342b5665c689073ad2
DIFF: https://github.com/llvm/llvm-project/commit/171d12489f20818e292362342b5665c689073ad2.diff

LOG: [SLPVectorizer] NFC: Migrate getVectorCallCosts to use InstructionCost.

This change also changes getReductionCost to return InstructionCost,
and it simplifies two expressions by removing a redundant 'isValid' check.

Added:
    

Modified:
    llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Removed:
    

diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 7114b4d412fd..0b630197911a 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -3411,21 +3411,21 @@ bool BoUpSLP::areAllUsersVectorized(Instruction *I) const {
   });
 }

-static std::pair<int, int>
+static std::pair<InstructionCost, InstructionCost>
 getVectorCallCosts(CallInst *CI, FixedVectorType *VecTy,
                    TargetTransformInfo *TTI, TargetLibraryInfo *TLI) {
   Intrinsic::ID ID = getVectorIntrinsicIDForCall(CI, TLI);

   // Calculate the cost of the scalar and vector calls.
   IntrinsicCostAttributes CostAttrs(ID, *CI, VecTy->getElementCount());
-  int IntrinsicCost =
+  auto IntrinsicCost =
       TTI->getIntrinsicInstrCost(CostAttrs, TTI::TCK_RecipThroughput);

   auto Shape = VFShape::get(*CI, ElementCount::getFixed(static_cast<unsigned>(
                                      VecTy->getNumElements())),
                             false /*HasGlobalPred*/);
   Function *VecFunc = VFDatabase(*CI).getVectorizedFunction(Shape);
-  int LibCost = IntrinsicCost;
+  auto LibCost = IntrinsicCost;
   if (!CI->isNoBuiltin() && VecFunc) {
     // Calculate the cost of the vector library call.
     SmallVector VecTys;
@@ -5994,7 +5994,7 @@ bool SLPVectorizerPass::vectorizeStoreChain(ArrayRef<Value *> Chain, BoUpSLP &R,
   InstructionCost Cost = R.getTreeCost();

   LLVM_DEBUG(dbgs() << "SLP: Found cost = " << Cost << " for VF =" << VF << "\n");
-  if (Cost.isValid() && Cost < -SLPCostThreshold) {
+  if (Cost < -SLPCostThreshold) {
     LLVM_DEBUG(dbgs() << "SLP: Decided to vectorize cost = " << Cost << "\n");

     using namespace ore;
@@ -6295,7 +6295,7 @@ bool SLPVectorizerPass::tryToVectorizeList(ArrayRef<Value *> VL, BoUpSLP &R,
   MinCost = std::min(MinCost, Cost);

-  if (Cost.isValid() && Cost < -SLPCostThreshold) {
+  if (Cost < -SLPCostThreshold) {
     LLVM_DEBUG(dbgs() << "SLP: Vectorizing list at cost:" << Cost << ".\n");
     R.getORE()->emit(OptimizationRemark(SV_NAME, "VectorizedList",
                                         cast<Instruction>(Ops[0]))
@@ -7007,11 +7007,12 @@ class HorizontalReduction {

 private:
   /// Calculate the cost of a reduction.
-  int getReductionCost(TargetTransformInfo *TTI, Value *FirstReducedVal,
-                       unsigned ReduxWidth) {
+  InstructionCost getReductionCost(TargetTransformInfo *TTI,
+                                   Value *FirstReducedVal,
+                                   unsigned ReduxWidth) {
     Type *ScalarTy = FirstReducedVal->getType();
     FixedVectorType *VectorTy = FixedVectorType::get(ScalarTy, ReduxWidth);
-    int VectorCost, ScalarCost;
+    InstructionCost VectorCost, ScalarCost;
     switch (RdxKind) {
     case RecurKind::Add:
     case RecurKind::Mul:
[llvm-branch-commits] [llvm] d196f9e - [InstructionCost] Prevent InstructionCost being created with CostState.
Author: Sander de Smalen
Date: 2021-01-25T11:26:56Z
New Revision: d196f9e2fca3ff767aa7d2dcaf4654724a79e18c

URL: https://github.com/llvm/llvm-project/commit/d196f9e2fca3ff767aa7d2dcaf4654724a79e18c
DIFF: https://github.com/llvm/llvm-project/commit/d196f9e2fca3ff767aa7d2dcaf4654724a79e18c.diff

LOG: [InstructionCost] Prevent InstructionCost being created with CostState.

For a function that returns InstructionCost, it is very tempting to write:

  return InstructionCost::Invalid;

But that actually returns InstructionCost(1 /* int value of Invalid */),
which has a totally different meaning. By marking this constructor as
`delete`, this can no longer happen.

Added:
    

Modified:
    llvm/include/llvm/Support/InstructionCost.h

Removed:
    

diff --git a/llvm/include/llvm/Support/InstructionCost.h b/llvm/include/llvm/Support/InstructionCost.h
index 725f8495ac09..fbc898b878bb 100644
--- a/llvm/include/llvm/Support/InstructionCost.h
+++ b/llvm/include/llvm/Support/InstructionCost.h
@@ -47,6 +47,7 @@ class InstructionCost {

 public:
   InstructionCost() = default;
+  InstructionCost(CostState) = delete;
   InstructionCost(CostType Val) : Value(Val), State(Valid) {}

   static InstructionCost getInvalid(CostType Val = 0) {
[llvm-branch-commits] [llvm] c8a914d - [LiveDebugValues] Fix comparison operator in VarLocBasedImpl
Author: Sander de Smalen
Date: 2021-01-12T08:44:58Z
New Revision: c8a914db5c60dbeb5b638f30a9915855a67805f7

URL: https://github.com/llvm/llvm-project/commit/c8a914db5c60dbeb5b638f30a9915855a67805f7
DIFF: https://github.com/llvm/llvm-project/commit/c8a914db5c60dbeb5b638f30a9915855a67805f7.diff

LOG: [LiveDebugValues] Fix comparison operator in VarLocBasedImpl

The issue was introduced in commit rG84a1120943a651184bae507fed5d648fee381ae4
and would cause a VarLoc's StackOffset to be compared with its own,
instead of the StackOffset from the other VarLoc. This patch fixes that.

Added:
    

Modified:
    llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp

Removed:
    

diff --git a/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp b/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
index 4811b8046797..e2daa46fe6b9 100644
--- a/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
+++ b/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
@@ -572,8 +572,9 @@ class VarLocBasedLDV : public LDVImpl {
                  Expr) <
              std::make_tuple(
                  Other.Var, Other.Kind, Other.Loc.SpillLocation.SpillBase,
-                 Loc.SpillLocation.SpillOffset.getFixed(),
-                 Loc.SpillLocation.SpillOffset.getScalable(), Other.Expr);
+                 Other.Loc.SpillLocation.SpillOffset.getFixed(),
+                 Other.Loc.SpillLocation.SpillOffset.getScalable(),
+                 Other.Expr);
     case RegisterKind:
     case ImmediateKind:
     case EntryValueKind:
[llvm-branch-commits] [llvm] aa280c9 - [AArch64][SVE] Emit DWARF location expr for SVE (dbg.declare)
Author: Sander de Smalen
Date: 2021-01-06T11:45:05Z
New Revision: aa280c99f708dca9dea96bc9070d6194d2622529

URL: https://github.com/llvm/llvm-project/commit/aa280c99f708dca9dea96bc9070d6194d2622529
DIFF: https://github.com/llvm/llvm-project/commit/aa280c99f708dca9dea96bc9070d6194d2622529.diff

LOG: [AArch64][SVE] Emit DWARF location expr for SVE (dbg.declare)

When using dbg.declare, the debug-info is generated from a list of
locals rather than through DBG_VALUE instructions in the MIR. This patch
is different from D90020 because it emits the DWARF location expressions
from that list of locals directly.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D90044

Added:
    llvm/test/CodeGen/AArch64/debug-info-sve-dbg-declare.mir

Modified:
    llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp

Removed:
    

diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 02791f2280d2..ea279e4914b0 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -739,11 +739,10 @@ DIE *DwarfCompileUnit::constructVariableDIEImpl(const DbgVariable ,
         TFI->getFrameIndexReference(*Asm->MF, Fragment.FI, FrameReg);
     DwarfExpr.addFragmentOffset(Expr);
-    assert(!Offset.getScalable() &&
-           "Frame offsets with a scalable component are not supported");
-
+    auto *TRI = Asm->MF->getSubtarget().getRegisterInfo();
     SmallVector Ops;
-    DIExpression::appendOffset(Ops, Offset.getFixed());
+    TRI->getOffsetOpcodes(Offset, Ops);
+
     // According to
     // https://docs.nvidia.com/cuda/archive/10.0/ptx-writers-guide-to-interoperability/index.html#cuda-specific-dwarf
     // cuda-gdb requires DW_AT_address_class for all variables to be able to

diff --git a/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-declare.mir b/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-declare.mir
new file mode 100644
index 000000000000..39b11ef7bfea
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-declare.mir
@@ -0,0 +1,222 @@
+# RUN: llc -o %t -filetype=obj -start-before=prologepilog %s
+# RUN: llvm-dwarfdump --name="z0" %t | FileCheck %s --check-prefix=CHECKZ0
+# RUN: llvm-dwarfdump --name="z1" %t | FileCheck %s --check-prefix=CHECKZ1
+# RUN: llvm-dwarfdump --name="p0" %t | FileCheck %s --check-prefix=CHECKP0
+# RUN: llvm-dwarfdump --name="p1" %t | FileCheck %s --check-prefix=CHECKP1
+# RUN: llvm-dwarfdump --name="localv0" %t | FileCheck %s --check-prefix=CHECKLV0
+# RUN: llvm-dwarfdump --name="localv1" %t | FileCheck %s --check-prefix=CHECKLV1
+# RUN: llvm-dwarfdump --name="localp0" %t | FileCheck %s --check-prefix=CHECKLP0
+# RUN: llvm-dwarfdump --name="localp1" %t | FileCheck %s --check-prefix=CHECKLP1
+#
+# CHECKZ0: DW_AT_location(DW_OP_fbreg +0, DW_OP_lit8, DW_OP_bregx VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKZ0-NEXT: DW_AT_name("z0")
+# CHECKZ1: DW_AT_location(DW_OP_fbreg +0, DW_OP_lit16, DW_OP_bregx VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKZ1-NEXT: DW_AT_name("z1")
+# CHECKP0: DW_AT_location(DW_OP_fbreg +0, DW_OP_lit17, DW_OP_bregx VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKP0-NEXT: DW_AT_name("p0")
+# CHECKP1: DW_AT_location(DW_OP_fbreg +0, DW_OP_lit18, DW_OP_bregx VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKP1-NEXT: DW_AT_name("p1")
+# CHECKLV0: DW_AT_location(DW_OP_fbreg +0, DW_OP_constu 0x20, DW_OP_bregx VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKLV0-NEXT: DW_AT_name("localv0")
+# CHECKLV1: DW_AT_location(DW_OP_fbreg +0, DW_OP_constu 0x28, DW_OP_bregx VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKLV1-NEXT: DW_AT_name("localv1")
+# CHECKLP0: DW_AT_location(DW_OP_fbreg +0, DW_OP_constu 0x29, DW_OP_bregx VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKLP0-NEXT: DW_AT_name("localp0")
+# CHECKLP1: DW_AT_location(DW_OP_fbreg +0, DW_OP_constu 0x2a, DW_OP_bregx VG+0, DW_OP_mul, DW_OP_minus)
+# CHECKLP1-NEXT: DW_AT_name("localp1")
+--- |
+  ; ModuleID = 't.c'
+  source_filename = "t.c"
+  target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+  target triple = "aarch64-unknown-linux-gnu"
+
+  ; Function Attrs: noinline nounwind optnone
+  define dso_local @foo( %z0, %z1, %p0, %p1, i32 %w0) #0 !dbg !11 {
+  entry:
+    %z0.addr = alloca , align 16
+    %z1.addr = alloca , align 16
+    %p0.addr = alloca , align 2
+    %p1.addr = alloca , align 2
+    %w0.addr = alloca i32, align 4
+    %local_gpr0 = alloca i32, align 4
+    %localv0 = alloca , align 16
+    %localv1 = alloca , align 16
+    %localp0 = alloca , align 2
+    %localp1 = alloca , align 2
+    store %z0, * %z0.addr, align 16
+    call void @llvm.dbg.declare(metadata * %z0.addr, metadata !29, metadata !DIExpression()), !dbg !30
+    store %z1, * %z1.addr, align 16
+    call
[llvm-branch-commits] [llvm] 84a1120 - [LiveDebugValues] Handle spill locations with a fixed and scalable component.
Author: Sander de Smalen
Date: 2021-01-06T11:30:13Z
New Revision: 84a1120943a651184bae507fed5d648fee381ae4

URL: https://github.com/llvm/llvm-project/commit/84a1120943a651184bae507fed5d648fee381ae4
DIFF: https://github.com/llvm/llvm-project/commit/84a1120943a651184bae507fed5d648fee381ae4.diff

LOG: [LiveDebugValues] Handle spill locations with a fixed and scalable component.

This patch fixes the two LiveDebugValues implementations
(InstrRef/VarLoc)Based to handle cases where the StackOffset contains
both a fixed and scalable component.

This depends on the `TargetRegisterInfo::prependOffsetExpression` being
added in D90020. Feel free to leave comments on that patch if you have them.

Reviewed By: djtodoro, jmorse

Differential Revision: https://reviews.llvm.org/D90046

Added:
    llvm/test/CodeGen/AArch64/live-debugvalues-sve.mir

Modified:
    llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
    llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp

Removed:
    

diff --git a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
index 04ead18cc3de2..b6f46daf8bba9 100644
--- a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
+++ b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
@@ -182,6 +182,7 @@
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/Debug.h"
+#include "llvm/Support/TypeSize.h"
 #include "llvm/Support/raw_ostream.h"
 #include
 #include
@@ -221,14 +222,16 @@ namespace {
 // an offset.
 struct SpillLoc {
   unsigned SpillBase;
-  int SpillOffset;
+  StackOffset SpillOffset;
   bool operator==(const SpillLoc &Other) const {
-    return std::tie(SpillBase, SpillOffset) ==
-           std::tie(Other.SpillBase, Other.SpillOffset);
+    return std::make_pair(SpillBase, SpillOffset) ==
+           std::make_pair(Other.SpillBase, Other.SpillOffset);
   }
   bool operator<(const SpillLoc &Other) const {
-    return std::tie(SpillBase, SpillOffset) <
-           std::tie(Other.SpillBase, Other.SpillOffset);
+    return std::make_tuple(SpillBase, SpillOffset.getFixed(),
+                           SpillOffset.getScalable()) <
+           std::make_tuple(Other.SpillBase, Other.SpillOffset.getFixed(),
+                           Other.SpillOffset.getScalable());
   }
 };
@@ -769,8 +772,10 @@ class MLocTracker {
     } else if (LocIdxToLocID[*MLoc] >= NumRegs) {
       unsigned LocID = LocIdxToLocID[*MLoc];
       const SpillLoc &Spill = SpillLocs[LocID - NumRegs + 1];
-      Expr = DIExpression::prepend(Expr, DIExpression::ApplyOffset,
-                                   Spill.SpillOffset);
+
+      auto *TRI = MF.getSubtarget().getRegisterInfo();
+      Expr = TRI->prependOffsetExpression(Expr, DIExpression::ApplyOffset,
+                                          Spill.SpillOffset);
       unsigned Base = Spill.SpillBase;
       MIB.addReg(Base, RegState::Debug);
       MIB.addImm(0);
@@ -1579,9 +1584,7 @@ InstrRefBasedLDV::extractSpillBaseRegAndOffset(const MachineInstr &MI) {
   const MachineBasicBlock *MBB = MI.getParent();
   Register Reg;
   StackOffset Offset = TFI->getFrameIndexReference(*MBB->getParent(), FI, Reg);
-  assert(!Offset.getScalable() &&
-         "Frame offsets with a scalable component are not supported");
-  return {Reg, static_cast<int>(Offset.getFixed())};
+  return {Reg, Offset};
 }

 /// End all previous ranges related to @MI and start a new range from @MI

diff --git a/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp b/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
index ed7f04e571acc..4811b80467973 100644
--- a/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
+++ b/llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
@@ -145,6 +145,7 @@
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/Debug.h"
+#include "llvm/Support/TypeSize.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Target/TargetMachine.h"
 #include
@@ -292,7 +293,7 @@ class VarLocBasedLDV : public LDVImpl {
   // register and an offset.
   struct SpillLoc {
     unsigned SpillBase;
-    int SpillOffset;
+    StackOffset SpillOffset;
     bool operator==(const SpillLoc &Other) const {
       return SpillBase == Other.SpillBase && SpillOffset == Other.SpillOffset;
     }
@@ -323,21 +324,20 @@ class VarLocBasedLDV : public LDVImpl {
     /// The value location. Stored separately to avoid repeatedly
     /// extracting it from MI.
-    union {
+    union LocUnion {
       uint64_t RegNo;
       SpillLoc SpillLocation;
       uint64_t Hash;
       int64_t Immediate;
       const ConstantFP *FPImm;
       const ConstantInt *CImm;
+      LocUnion() : Hash(0) {}
     } Loc;

     VarLoc(const MachineInstr &MI, LexicalScopes &LS)
         : Var(MI.getDebugVariable(), MI.getDebugExpression(),
               MI.getDebugLoc()->getInlinedAt()),
[llvm-branch-commits] [llvm] e4cda13 - Fix test failure in a7e3339f3b0eb71e43d44e6f59cc8db6a7b110bf
Author: Sander de Smalen
Date: 2021-01-06T10:43:48Z
New Revision: e4cda13d5a54a8c6366e4ca82d74265e68bbb3f5

URL: https://github.com/llvm/llvm-project/commit/e4cda13d5a54a8c6366e4ca82d74265e68bbb3f5
DIFF: https://github.com/llvm/llvm-project/commit/e4cda13d5a54a8c6366e4ca82d74265e68bbb3f5.diff

LOG: Fix test failure in a7e3339f3b0eb71e43d44e6f59cc8db6a7b110bf

Set the target-triple to aarch64 in debug-info-sve-dbg-value.mir to
avoid the "'+sve' is not a recognized feature for this target" diagnostic.

Added:
    

Modified:
    llvm/test/CodeGen/AArch64/debug-info-sve-dbg-value.mir

Removed:
    

diff --git a/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-value.mir b/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-value.mir
index ffce40c9c4f4..84d34ce3d2ac 100644
--- a/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-value.mir
+++ b/llvm/test/CodeGen/AArch64/debug-info-sve-dbg-value.mir
@@ -27,6 +27,7 @@
 --- |
   ; ModuleID = 'bla.mir'
   source_filename = "bla.mir"
+  target triple = "aarch64-unknown-linux-gnu"
   target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"

   define void @foo() #0 !dbg !5 {
[llvm-branch-commits] [llvm] a7e3339 - [AArch64][SVE] Emit DWARF location expression for SVE stack objects.
Author: Sander de Smalen
Date: 2021-01-06T09:40:53Z
New Revision: a7e3339f3b0eb71e43d44e6f59cc8db6a7b110bf

URL: https://github.com/llvm/llvm-project/commit/a7e3339f3b0eb71e43d44e6f59cc8db6a7b110bf
DIFF: https://github.com/llvm/llvm-project/commit/a7e3339f3b0eb71e43d44e6f59cc8db6a7b110bf.diff

LOG: [AArch64][SVE] Emit DWARF location expression for SVE stack objects.

Extend PEI to emit a DWARF expression for StackOffsets that have a
fixed and scalable component. This means the expression that needs
to be added is either:

  <base> + offset

or:

  <base> + offset + scalable_offset * scalereg

where for SVE, the scale reg is the Vector Granule Dwarf register, which
encodes the number of 64bit 'granules' in an SVE vector and which the
debugger can evaluate at runtime.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D90020

Added:
    llvm/test/CodeGen/AArch64/debug-info-sve-dbg-value.mir

Modified:
    llvm/include/llvm/CodeGen/TargetRegisterInfo.h
    llvm/lib/CodeGen/PrologEpilogInserter.cpp
    llvm/lib/CodeGen/TargetRegisterInfo.cpp
    llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
    llvm/lib/Target/AArch64/AArch64RegisterInfo.h

Removed:
    

diff --git a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
index de2c1b069784..6f32729a1e83 100644
--- a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
@@ -34,6 +34,7 @@
 namespace llvm {

 class BitVector;
+class DIExpression;
 class LiveRegMatrix;
 class MachineFunction;
 class MachineInstr;
@@ -923,6 +924,15 @@ class TargetRegisterInfo : public MCRegisterInfo {
     llvm_unreachable("isFrameOffsetLegal does not exist on this target");
   }

+  /// Gets the DWARF expression opcodes for \p Offset.
+  virtual void getOffsetOpcodes(const StackOffset &Offset,
+                                SmallVectorImpl<uint64_t> &Ops) const;
+
+  /// Prepends a DWARF expression for \p Offset to DIExpression \p Expr.
+  DIExpression *
+  prependOffsetExpression(const DIExpression *Expr, unsigned PrependFlags,
+                          const StackOffset &Offset) const;
+
   /// Spill the register so it can be used by the register scavenger.
   /// Return true if the register was spilled, false otherwise.
   /// If this function does not spill the register, the scavenger

diff --git a/llvm/lib/CodeGen/PrologEpilogInserter.cpp b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
index 7c38b193a980..65b2165bf2a0 100644
--- a/llvm/lib/CodeGen/PrologEpilogInserter.cpp
+++ b/llvm/lib/CodeGen/PrologEpilogInserter.cpp
@@ -1211,8 +1211,6 @@ void PEI::replaceFrameIndices(MachineBasicBlock *BB, MachineFunction &MF,
       StackOffset Offset = TFI->getFrameIndexReference(MF, FrameIdx, Reg);
-      assert(!Offset.getScalable() &&
-             "Frame offsets with a scalable component are not supported");
       MI.getOperand(0).ChangeToRegister(Reg, false /*isDef*/);
       MI.getOperand(0).setIsDebug();
@@ -1238,7 +1236,8 @@ void PEI::replaceFrameIndices(MachineBasicBlock *BB, MachineFunction &MF,
         // Make the DBG_VALUE direct.
         MI.getDebugOffset().ChangeToRegister(0, false);
       }
-      DIExpr = DIExpression::prepend(DIExpr, PrependFlags, Offset.getFixed());
+
+      DIExpr = TRI.prependOffsetExpression(DIExpr, PrependFlags, Offset);
       MI.getDebugExpressionOp().setMetadata(DIExpr);
       continue;
     }

diff --git a/llvm/lib/CodeGen/TargetRegisterInfo.cpp b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
index e89353c9ad27..4a190c9f50af 100644
--- a/llvm/lib/CodeGen/TargetRegisterInfo.cpp
+++ b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
@@ -26,6 +26,7 @@
 #include "llvm/CodeGen/VirtRegMap.h"
 #include "llvm/Config/llvm-config.h"
 #include "llvm/IR/Attributes.h"
+#include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/IR/Function.h"
 #include "llvm/MC/MCRegisterInfo.h"
 #include "llvm/Support/CommandLine.h"
@@ -532,6 +533,31 @@
 TargetRegisterInfo::lookThruCopyLike(Register SrcReg,
   }
 }

+void TargetRegisterInfo::getOffsetOpcodes(
+    const StackOffset &Offset, SmallVectorImpl<uint64_t> &Ops) const {
+  assert(!Offset.getScalable() && "Scalable offsets are not handled");
+  DIExpression::appendOffset(Ops, Offset.getFixed());
+}
+
+DIExpression *
+TargetRegisterInfo::prependOffsetExpression(const DIExpression *Expr,
+                                            unsigned PrependFlags,
+                                            const StackOffset &Offset) const {
+  assert((PrependFlags &
+          ~(DIExpression::DerefBefore | DIExpression::DerefAfter |
+            DIExpression::StackValue | DIExpression::EntryValue)) == 0 &&
+         "Unsupported prepend flag");
+  SmallVector OffsetExpr;
+  if (PrependFlags & DIExpression::DerefBefore)
+    OffsetExpr.push_back(dwarf::DW_OP_deref);
+  getOffsetOpcodes(Offset, OffsetExpr);
+  if (PrependFlags & DIExpression::DerefAfter)
+
[llvm-branch-commits] [llvm] a9f5e43 - [AArch64] Use faddp to implement fadd reductions.
Author: Sander de Smalen Date: 2021-01-06T09:36:51Z New Revision: a9f5e4375b36e5316b8d6f9731be6bfa5a70e276 URL: https://github.com/llvm/llvm-project/commit/a9f5e4375b36e5316b8d6f9731be6bfa5a70e276 DIFF: https://github.com/llvm/llvm-project/commit/a9f5e4375b36e5316b8d6f9731be6bfa5a70e276.diff LOG: [AArch64] Use faddp to implement fadd reductions. Custom-expand legal VECREDUCE_FADD SDNodes to benefit from pair-wise faddp instructions. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D59259 Added: Modified: llvm/include/llvm/Target/TargetSelectionDAG.td llvm/lib/Target/AArch64/AArch64ISelLowering.cpp llvm/lib/Target/AArch64/AArch64InstrInfo.td llvm/test/CodeGen/AArch64/vecreduce-fadd-legalization.ll llvm/test/CodeGen/AArch64/vecreduce-fadd.ll Removed: diff --git a/llvm/include/llvm/Target/TargetSelectionDAG.td b/llvm/include/llvm/Target/TargetSelectionDAG.td index d5b8aeb1055d..0c6eef939ea4 100644 --- a/llvm/include/llvm/Target/TargetSelectionDAG.td +++ b/llvm/include/llvm/Target/TargetSelectionDAG.td @@ -250,6 +250,10 @@ def SDTVecInsert : SDTypeProfile<1, 3, [// vector insert def SDTVecReduce : SDTypeProfile<1, 1, [// vector reduction SDTCisInt<0>, SDTCisVec<1> ]>; +def SDTFPVecReduce : SDTypeProfile<1, 1, [ // FP vector reduction + SDTCisFP<0>, SDTCisVec<1> +]>; + def SDTSubVecExtract : SDTypeProfile<1, 2, [// subvector extract SDTCisSubVecOfVec<0,1>, SDTCisInt<2> @@ -439,6 +443,7 @@ def vecreduce_smax : SDNode<"ISD::VECREDUCE_SMAX", SDTVecReduce>; def vecreduce_umax : SDNode<"ISD::VECREDUCE_UMAX", SDTVecReduce>; def vecreduce_smin : SDNode<"ISD::VECREDUCE_SMIN", SDTVecReduce>; def vecreduce_umin : SDNode<"ISD::VECREDUCE_UMIN", SDTVecReduce>; +def vecreduce_fadd : SDNode<"ISD::VECREDUCE_FADD", SDTFPVecReduce>; def fadd : SDNode<"ISD::FADD" , SDTFPBinOp, [SDNPCommutative]>; def fsub : SDNode<"ISD::FSUB" , SDTFPBinOp>; diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp index 
faed7c64a15e..2b9dc84a06cc 100644 --- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp @@ -999,6 +999,9 @@ AArch64TargetLowering::AArch64TargetLowering(const TargetMachine , MVT::v8f16, MVT::v4f32, MVT::v2f64 }) { setOperationAction(ISD::VECREDUCE_FMAX, VT, Custom); setOperationAction(ISD::VECREDUCE_FMIN, VT, Custom); + + if (VT.getVectorElementType() != MVT::f16 || Subtarget->hasFullFP16()) +setOperationAction(ISD::VECREDUCE_FADD, VT, Legal); } for (MVT VT : { MVT::v8i8, MVT::v4i16, MVT::v2i32, MVT::v16i8, MVT::v8i16, MVT::v4i32 }) { diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td index 4d70fb334828..7e9f2fb95188 100644 --- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td +++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td @@ -4989,6 +4989,26 @@ defm FMAXNMP : SIMDFPPairwiseScalar<0, 0b01100, "fmaxnmp">; defm FMAXP : SIMDFPPairwiseScalar<0, 0b0, "fmaxp">; defm FMINNMP : SIMDFPPairwiseScalar<1, 0b01100, "fminnmp">; defm FMINP : SIMDFPPairwiseScalar<1, 0b0, "fminp">; + +let Predicates = [HasFullFP16] in { +def : Pat<(f16 (vecreduce_fadd (v8f16 V128:$Rn))), +(FADDPv2i16p + (EXTRACT_SUBREG + (FADDPv8f16 (FADDPv8f16 V128:$Rn, (v8f16 (IMPLICIT_DEF))), (v8f16 (IMPLICIT_DEF))), + dsub))>; +def : Pat<(f16 (vecreduce_fadd (v4f16 V64:$Rn))), + (FADDPv2i16p (FADDPv4f16 V64:$Rn, (v4f16 (IMPLICIT_DEF>; +} +def : Pat<(f32 (vecreduce_fadd (v4f32 V128:$Rn))), + (FADDPv2i32p +(EXTRACT_SUBREG + (FADDPv4f32 V128:$Rn, (v4f32 (IMPLICIT_DEF))), + dsub))>; +def : Pat<(f32 (vecreduce_fadd (v2f32 V64:$Rn))), + (FADDPv2i32p V64:$Rn)>; +def : Pat<(f64 (vecreduce_fadd (v2f64 V128:$Rn))), + (FADDPv2i64p V128:$Rn)>; + def : Pat<(v2i64 (AArch64saddv V128:$Rn)), (INSERT_SUBREG (v2i64 (IMPLICIT_DEF)), (ADDPv2i64p V128:$Rn), dsub)>; def : Pat<(v2i64 (AArch64uaddv V128:$Rn)), diff --git a/llvm/test/CodeGen/AArch64/vecreduce-fadd-legalization.ll 
b/llvm/test/CodeGen/AArch64/vecreduce-fadd-legalization.ll index 69b9c3e22d7a..2b5fcd4b839a 100644 --- a/llvm/test/CodeGen/AArch64/vecreduce-fadd-legalization.ll +++ b/llvm/test/CodeGen/AArch64/vecreduce-fadd-legalization.ll @@ -51,8 +51,7 @@ define float @test_v3f32(<3 x float> %a) nounwind { ; CHECK-NEXT:mov w8, #-2147483648 ; CHECK-NEXT:fmov s1, w8 ; CHECK-NEXT:mov v0.s[3], v1.s[0] -; CHECK-NEXT:ext v1.16b, v0.16b, v0.16b, #8 -; CHECK-NEXT:fadd v0.2s, v0.2s, v1.2s +; CHECK-NEXT:faddp v0.4s, v0.4s, v0.4s ; CHECK-NEXT:faddp s0, v0.2s ; CHECK-NEXT:ret %b = call reassoc float
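The patterns in the patch above replace a linear chain of fadds with pairwise adds: for a v4f32 reduction, one full-width faddp adds adjacent lanes, then a scalar faddp sums the remaining pair. A minimal scalar sketch of that lowering (function name and shape are illustrative, not LLVM API):

```cpp
#include <array>

// Scalar model of the faddp-based VECREDUCE_FADD lowering for v4f32.
float pairwiseReduce4(const std::array<float, 4> &v) {
  // FADDPv4f32: adjacent lane pairs added in one step -> {v0+v1, v2+v3}.
  float lo = v[0] + v[1];
  float hi = v[2] + v[3];
  // FADDPv2i32p: final scalar pairwise add of the two partial sums.
  return lo + hi;
}
```

Note this changes the association order of the additions, which is why the test in vecreduce-fadd-legalization.ll uses a `reassoc` call: the pairwise tree is only a valid expansion when reassociation is permitted.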
[llvm-branch-commits] [llvm] d568cff - [LoopVectorizer][SVE] Vectorize a simple loop with a scalable VF.
Author: Sander de Smalen Date: 2020-12-09T11:25:21Z New Revision: d568cff696e8fb89ce1b040561c037412767af60 URL: https://github.com/llvm/llvm-project/commit/d568cff696e8fb89ce1b040561c037412767af60 DIFF: https://github.com/llvm/llvm-project/commit/d568cff696e8fb89ce1b040561c037412767af60.diff LOG: [LoopVectorizer][SVE] Vectorize a simple loop with with a scalable VF. * Steps are scaled by `vscale`, a runtime value. * Changes to circumvent the cost-model for now (temporary) so that the cost-model can be implemented separately. This can vectorize the following loop [1]: void loop(int N, double *a, double *b) { #pragma clang loop vectorize_width(4, scalable) for (int i = 0; i < N; i++) { a[i] = b[i] + 1.0; } } [1] This source-level example is based on the pragma proposed separately in D89031. This patch only implements the LLVM part. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D91077 Added: llvm/test/Transforms/LoopVectorize/scalable-loop-unpredicated-body-scalar-tail.ll Modified: llvm/include/llvm/IR/IRBuilder.h llvm/lib/IR/IRBuilder.cpp llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/VPlan.h llvm/test/Transforms/LoopVectorize/metadata-width.ll Removed: diff --git a/llvm/include/llvm/IR/IRBuilder.h b/llvm/include/llvm/IR/IRBuilder.h index db215094a7e49..c2b3446d159f2 100644 --- a/llvm/include/llvm/IR/IRBuilder.h +++ b/llvm/include/llvm/IR/IRBuilder.h @@ -879,6 +879,10 @@ class IRBuilderBase { Type *ResultType, const Twine = ""); + /// Create a call to llvm.vscale, multiplied by \p Scaling. The type of VScale + /// will be the same type as that of \p Scaling. + Value *CreateVScale(Constant *Scaling, const Twine = ""); + /// Create a call to intrinsic \p ID with 1 operand which is mangled on its /// type. 
CallInst *CreateUnaryIntrinsic(Intrinsic::ID ID, Value *V, diff --git a/llvm/lib/IR/IRBuilder.cpp b/llvm/lib/IR/IRBuilder.cpp index c0e4451f52003..f936f5756b6f0 100644 --- a/llvm/lib/IR/IRBuilder.cpp +++ b/llvm/lib/IR/IRBuilder.cpp @@ -80,6 +80,17 @@ static CallInst *createCallHelper(Function *Callee, ArrayRef Ops, return CI; } +Value *IRBuilderBase::CreateVScale(Constant *Scaling, const Twine ) { + Module *M = GetInsertBlock()->getParent()->getParent(); + assert(isa(Scaling) && "Expected constant integer"); + Function *TheFn = + Intrinsic::getDeclaration(M, Intrinsic::vscale, {Scaling->getType()}); + CallInst *CI = createCallHelper(TheFn, {}, this, Name); + return cast(Scaling)->getSExtValue() == 1 + ? CI + : CreateMul(CI, Scaling); +} + CallInst *IRBuilderBase::CreateMemSet(Value *Ptr, Value *Val, Value *Size, MaybeAlign Align, bool isVolatile, MDNode *TBAATag, MDNode *ScopeTag, diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp index f504afd1ffc41..a91fb988badf6 100644 --- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp +++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp @@ -1121,6 +1121,15 @@ static OptimizationRemarkAnalysis createLVAnalysis(const char *PassName, return R; } +/// Return a value for Step multiplied by VF. +static Value *createStepForVF(IRBuilder<> , Constant *Step, ElementCount VF) { + assert(isa(Step) && "Expected an integer step"); + Constant *StepVal = ConstantInt::get( + Step->getType(), + cast(Step)->getSExtValue() * VF.getKnownMinValue()); + return VF.isScalable() ? B.CreateVScale(StepVal) : StepVal; +} + namespace llvm { void reportVectorizationFailure(const StringRef DebugMsg, @@ -2277,8 +2286,6 @@ void InnerLoopVectorizer::buildScalarSteps(Value *ScalarIV, Value *Step, const InductionDescriptor ) { // We shouldn't have to build scalar steps if we aren't vectorizing. 
assert(VF.isVector() && "VF should be greater than one"); - assert(!VF.isScalable() && - "the code below assumes a fixed number of elements at compile time"); // Get the value type and ensure it and the step have the same integer type. Type *ScalarIVTy = ScalarIV->getType()->getScalarType(); assert(ScalarIVTy == Step->getType() && @@ -2303,11 +2310,24 @@ void InnerLoopVectorizer::buildScalarSteps(Value *ScalarIV, Value *Step, Cost->isUniformAfterVectorization(cast(EntryVal), VF) ? 1 : VF.getKnownMinValue(); + assert((!VF.isScalable() || Lanes == 1) && + "Should never scalarize a scalable vector"); // Compute the scalar steps and save the results in VectorLoopValueMap. for (unsigned Part = 0; Part < UF; ++Part) { for (unsigned Lane = 0; Lane < Lanes; ++Lane) { - auto *StartIdx =
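The key arithmetic in `createStepForVF` above can be sketched as follows: for a fixed VF the induction step is the compile-time constant `Step * VF.getKnownMinValue()`, and for a scalable VF that constant is further multiplied by the runtime value of `llvm.vscale` (emitted via the new `IRBuilderBase::CreateVScale`). This is an illustrative model, not the LLVM API; vscale is passed in explicitly here because it is only known at run time:

```cpp
#include <cstdint>

// Model of createStepForVF: scale the step by the known-minimum VF, and for
// scalable vectors additionally by the runtime vscale multiplier.
int64_t stepForVF(int64_t Step, uint32_t KnownMinVF, bool IsScalable,
                  int64_t VScale) {
  int64_t StepVal = Step * static_cast<int64_t>(KnownMinVF);
  // For scalable VFs, CreateVScale folds away the multiply when Scaling == 1;
  // here that corresponds to StepVal already carrying the constant factor.
  return IsScalable ? StepVal * VScale : StepVal;
}
```

For example, with `vectorize_width(4, scalable)` and vscale == 2 on the target, one vector iteration advances the induction variable by 8 elements rather than the fixed 4.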
[llvm-branch-commits] [llvm] adc3714 - [LoopVectorizer] NFC: Remove unnecessary asserts that VF cannot be scalable.
Author: Sander de Smalen Date: 2020-12-09T11:25:21Z New Revision: adc37145dec9cadf76af05326150ed22a3cc2fdd URL: https://github.com/llvm/llvm-project/commit/adc37145dec9cadf76af05326150ed22a3cc2fdd DIFF: https://github.com/llvm/llvm-project/commit/adc37145dec9cadf76af05326150ed22a3cc2fdd.diff LOG: [LoopVectorizer] NFC: Remove unnecessary asserts that VF cannot be scalable. This patch removes a number of asserts that VF is not scalable, even though the code where this assert lives does nothing that prevents VF being scalable. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D91060 Added: Modified: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp Removed: diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp index 6ba14e942ff8..f504afd1ffc4 100644 --- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp +++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp @@ -367,7 +367,6 @@ static Type *getMemInstValueType(Value *I) { /// type is irregular if its allocated size doesn't equal the store size of an /// element of the corresponding vector type at the given vectorization factor. static bool hasIrregularType(Type *Ty, const DataLayout , ElementCount VF) { - assert(!VF.isScalable() && "scalable vectors not yet supported."); // Determine if an array of VF elements of type Ty is "bitcast compatible" // with a vector. if (VF.isVector()) { @@ -1387,9 +1386,7 @@ class LoopVectorizationCostModel { /// width \p VF. Return CM_Unknown if this instruction did not pass /// through the cost modeling. InstWidening getWideningDecision(Instruction *I, ElementCount VF) { -assert(!VF.isScalable() && "scalable vectors not yet supported."); -assert(VF.isVector() && "Expected VF >=2"); - +assert(VF.isVector() && "Expected VF to be a vector VF"); // Cost model is not run in the VPlan-native path - return conservative // result until this changes. 
if (EnableVPlanNativePath) @@ -3902,8 +3899,10 @@ void InnerLoopVectorizer::fixVectorizedLoop() { // profile is not inherently precise anyway. Note also possible bypass of // vector code caused by legality checks is ignored, assigning all the weight // to the vector loop, optimistically. - assert(!VF.isScalable() && - "cannot use scalable ElementCount to determine unroll factor"); + // + // For scalable vectorization we can't know at compile time how many iterations + // of the loop are handled in one vector iteration, so instead assume a pessimistic + // vscale of '1'. setProfileInfoAfterUnrolling( LI->getLoopFor(LoopScalarBody), LI->getLoopFor(LoopVectorBody), LI->getLoopFor(LoopScalarBody), VF.getKnownMinValue() * UF); @@ -4709,7 +4708,6 @@ static bool mayDivideByZero(Instruction ) { void InnerLoopVectorizer::widenInstruction(Instruction , VPValue *Def, VPUser , VPTransformState ) { - assert(!VF.isScalable() && "scalable vectors not yet supported."); switch (I.getOpcode()) { case Instruction::Call: case Instruction::Br: @@ -4797,7 +4795,6 @@ void InnerLoopVectorizer::widenInstruction(Instruction , VPValue *Def, setDebugLocFromInst(Builder, CI); /// Vectorize casts. -assert(!VF.isScalable() && "VF is assumed to be non scalable."); Type *DestTy = (VF.isScalar()) ? CI->getType() : VectorType::get(CI->getType(), VF); @@ -5099,7 +5096,6 @@ void LoopVectorizationCostModel::collectLoopScalars(ElementCount VF) { bool LoopVectorizationCostModel::isScalarWithPredication(Instruction *I, ElementCount VF) { - assert(!VF.isScalable() && "scalable vectors not yet supported."); if (!blockNeedsPredication(I->getParent())) return false; switch(I->getOpcode()) { @@ -6420,7 +6416,6 @@ int LoopVectorizationCostModel::computePredInstDiscount( LoopVectorizationCostModel::VectorizationCostTy LoopVectorizationCostModel::expectedCost(ElementCount VF) { - assert(!VF.isScalable() && "scalable vectors not yet supported."); VectorizationCostTy Cost; // For each block. 
@@ -7935,7 +7930,6 @@ VPRecipeBuilder::tryToWidenMemory(Instruction *I, VFRange , "Must be called with either a load or store"); auto willWiden = [&](ElementCount VF) -> bool { -assert(!VF.isScalable() && "unexpected scalable ElementCount"); if (VF.isScalar()) return false; LoopVectorizationCostModel::InstWidening Decision = ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
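The comment added around `setProfileInfoAfterUnrolling` in the diff above captures a subtle point: with scalable vectors, one vector-loop iteration covers `vscale * KnownMin * UF` scalar iterations, but vscale is unknown at compile time, so the profile weighting pessimistically assumes vscale == 1. A minimal sketch of that assumption (names are illustrative, not the LLVM API):

```cpp
#include <cstdint>

// Scalar iterations assumed to be covered by one vector-loop iteration when
// assigning profile weights; the real count may be vscale times larger.
uint64_t assumedScalarItersPerVectorIter(uint64_t KnownMinVF, uint64_t UF) {
  const uint64_t AssumedVScale = 1; // pessimistic: runtime vscale may exceed 1
  return AssumedVScale * KnownMinVF * UF;
}
```

Since the commit message notes the profile is "not inherently precise anyway", underestimating the coverage of the vector loop is an acceptable trade-off compared to keeping the assert and blocking scalable vectorization.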