================
@@ -2955,12 +2966,14 @@ 
tryToMatchAndCreateMulAccumulateReduction(VPReductionRecipe *Red,
 
     // Match reduce.add(mul(ext, ext)).
     if (RecipeA && RecipeB &&
-        (RecipeA->getOpcode() == RecipeB->getOpcode() || A == B) &&
+        (RecipeA->getOpcode() == RecipeB->getOpcode() || A == B ||
+         IsPartialReduction) &&
         match(RecipeA, m_ZExtOrSExt(m_VPValue())) &&
         match(RecipeB, m_ZExtOrSExt(m_VPValue())) &&
-        IsMulAccValidAndClampRange(RecipeA->getOpcode() ==
-                                       Instruction::CastOps::ZExt,
-                                   MulR, RecipeA, RecipeB, nullptr, Sub)) {
+        (IsPartialReduction ||
----------------
SamTebbs33 wrote:

Currently, we collect the scaled reductions in `collectScaledReductions` in 
`LoopVectorize.cpp` and clamp the VF range there. Those scaled reductions then 
become partial reductions later on in `LoopVectorize.cpp` via 
`tryToCreatePartialReduction`. This all happens before the abstract recipe 
conversion transform runs, so as things stand the partial reductions must 
already exist by the time this transform runs. If we were to move the clamping 
from `LoopVectorize.cpp` to here, then (in `LoopVectorize.cpp`) we'd have to 
create partial reductions for all VFs or none of them, which won't work well.
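
To illustrate what I mean by clamping, here is a rough toy model rather than 
the actual LLVM code. `ToyVFRange`, `decideAndClamp` and 
`toyPartialReductionIsProfitable` are hypothetical names made up for this 
sketch, and the "profitable from VF 8" rule is an assumption chosen only to 
make the behaviour visible. The idea is that the decision is evaluated at the 
start of the range and the range end is shrunk to the first VF where the 
decision would change, so every VF left in the range shares one answer.

```cpp
#include <cassert>
#include <functional>

// Toy stand-in for a half-open range [Start, End) of power-of-two VFs.
// Hypothetical type for illustration only; not the LLVM VFRange class.
struct ToyVFRange {
  unsigned Start;
  unsigned End;
};

// Evaluate Pred at Range.Start, then walk the remaining power-of-two VFs.
// At the first VF where Pred disagrees with the decision at Start, shrink
// Range.End to that VF so the whole remaining range shares one decision.
bool decideAndClamp(const std::function<bool(unsigned)> &Pred,
                    ToyVFRange &Range) {
  assert(Range.Start > 0 && Range.Start < Range.End && "empty VF range");
  bool AtStart = Pred(Range.Start);
  for (unsigned VF = Range.Start * 2; VF < Range.End; VF *= 2) {
    if (Pred(VF) != AtStart) {
      Range.End = VF;
      break;
    }
  }
  return AtStart;
}

// Assumption for the sketch: pretend a partial reduction only pays off
// once the VF reaches 8.
bool toyPartialReductionIsProfitable(unsigned VF) { return VF >= 8; }
```

With a range of `{2, 33}` the toy decision at VF 2 is "no partial reduction" 
and the range is clamped to `{2, 8}`. A later range starting at 8 keeps its 
full extent and gets "yes". Without that per-range decision, the choice would 
have to be the same for every VF.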

My ideal outcome is to move the clamping and partial reduction creation code to 
this transform pass, but that will be a bigger change that is outside the 
scope of this PR. Having the shortcut here should be fine: the VF ranges have 
already been clamped properly, so the plan state is valid.

https://github.com/llvm/llvm-project/pull/147302
_______________________________________________
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
