================
@@ -2955,12 +2966,14 @@
tryToMatchAndCreateMulAccumulateReduction(VPReductionRecipe *Red,
// Match reduce.add(mul(ext, ext)).
if (RecipeA && RecipeB &&
- (RecipeA->getOpcode() == RecipeB->getOpcode() || A == B) &&
+ (RecipeA->getOpcode() == RecipeB->getOpcode() || A == B ||
+ IsPartialReduction) &&
match(RecipeA, m_ZExtOrSExt(m_VPValue())) &&
match(RecipeB, m_ZExtOrSExt(m_VPValue())) &&
- IsMulAccValidAndClampRange(RecipeA->getOpcode() ==
- Instruction::CastOps::ZExt,
- MulR, RecipeA, RecipeB, nullptr, Sub)) {
+ (IsPartialReduction ||
----------------
SamTebbs33 wrote:
Currently, we collect the scaled reductions in `collectScaledReductions` in
`LoopVectorize.cpp` and clamp the VF range there. Those scaled reductions then
become partial reductions later in `LoopVectorize.cpp` via
`tryToCreatePartialReduction`. All of this happens before the abstract recipe
conversion transform runs, so, as things stand, the partial reductions must
already exist by the time this transform runs. If we moved the clamping from
`LoopVectorize.cpp` to here, then `LoopVectorize.cpp` would have to create
partial reductions for either all VFs or none of them, which won't work well.
My ideal outcome is to move the clamping and partial reduction creation code to
this transform pass, but that would be a bigger change that is outside the
scope of this PR. Having the shortcut here should be fine: the VF ranges have
already been clamped properly, so the plan state is valid.
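
To illustrate what "clamping" means here, below is a minimal standalone sketch,
not the actual LLVM code. `VFRange` is reduced to plain unsigned power-of-two
bounds and `decideAndClamp` is a hypothetical helper name; both are assumptions
for illustration only. The idea is that a decision is evaluated at the first VF
and the range is shrunk so that every remaining VF agrees with it:

```cpp
#include <cassert>
#include <functional>

// Simplified stand-in for a VF range: power-of-two VFs in [Start, End).
// (Assumption: the real range uses ElementCount, not plain unsigned.)
struct VFRange {
  unsigned Start; // inclusive, must be non-zero
  unsigned End;   // exclusive
};

// Hypothetical helper: evaluate Predicate at Range.Start, then shrink
// Range.End to the first VF whose decision differs, so that all VFs left
// in the range share the decision made at Range.Start.
bool decideAndClamp(const std::function<bool(unsigned)> &Predicate,
                    VFRange &Range) {
  assert(Range.Start != 0 && Range.Start < Range.End && "invalid VF range");
  bool DecisionAtStart = Predicate(Range.Start);
  for (unsigned VF = Range.Start * 2; VF < Range.End; VF *= 2) {
    if (Predicate(VF) != DecisionAtStart) {
      Range.End = VF; // Later VFs are left for a separate plan.
      break;
    }
  }
  return DecisionAtStart;
}
```

Under that model, once the range has been clamped for "is this a partial
reduction?", the answer is the same for every VF in the plan, which is why
re-checking it in this transform should be redundant.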
https://github.com/llvm/llvm-project/pull/147302