I don't understand how synth-mult works, but it does introduce
multiple uses of a reduction variable which will ultimately
fail vectorization (or ICE with a pending change). So avoid
applying the pattern. I've tried to do so selectively, possibly
preserving pattern-matching x * 4 as x << 2.
So basically a single replacement stmt should be OK, likewise something like
tem = -x;
tem = tem << 3;
res = -tem;
so a single use of 'x' remains. Even using x + x for x * 2 when
x << 1 isn't possible does not work. So if only power-of-two
multiplications will work I can probably make the condition on that.
Do we synthesize any more complex chain that would still be appropriate?
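To illustrate the use-count distinction above, here is a minimal
standalone C sketch. It is not vectorizer code: the function names are
made up for illustration, and uint32_t is an assumption chosen only so
that the negation and shift are well defined. It shows how many times
'x' appears in each replacement sequence.

```c
#include <stdint.h>

/* x * 8 via the negate/shift/negate chain quoted above.
   'x' is read exactly once; every later step uses 'tem'.  */
uint32_t
mul8_single_use (uint32_t x)
{
  uint32_t tem = -x;   /* the only use of x */
  tem = tem << 3;
  return -tem;
}

/* x * 9 as shift plus add: 'x' is read twice, which is the
   kind of sequence the patch wants to avoid for reductions.  */
uint32_t
mul9_two_uses (uint32_t x)
{
  return (x << 3) + x; /* two uses of x */
}

/* x * 2 as x + x (the fallback when no vector shift exists):
   again two uses of 'x'.  */
uint32_t
mul2_add (uint32_t x)
{
  return x + x;        /* two uses of x */
}
```

Only the first form keeps a single use of 'x' in the chain.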
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
Any comments?
Thanks,
Richard.
* tree-vect-patterns.cc (vect_synth_mult_by_constant): Avoid
synthesis in cases that introduce multiple uses of reduction operands.
---
gcc/tree-vect-patterns.cc | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 3fffcac4b3a..fae4b393dff 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -4303,6 +4303,10 @@ vect_synth_mult_by_constant (vec_info *vinfo, tree op, tree val,
/* Targets that don't support vector shifts but support vector additions
can synthesize shifts that way. */
bool synth_shift_p = !vect_supportable_shift (vinfo, LSHIFT_EXPR, multtype);
+ if (synth_shift_p
+ /* Any multiple use of the reduction operand will break it. */
+ && vect_is_reduction (stmt_vinfo))
+ return NULL;
HOST_WIDE_INT hwval = tree_to_shwi (val);
/* Use MAX_COST here as we don't want to limit the sequence on rtx costs.
@@ -4333,7 +4337,12 @@ vect_synth_mult_by_constant (vec_info *vinfo, tree op, tree val,
if (alg.op[0] == alg_zero)
accumulator = build_int_cst (multtype, 0);
else
- accumulator = op;
+ {
+ /* Any multiple use of the reduction operand will break it. */
+ if (vect_is_reduction (stmt_vinfo))
+ return NULL;
+ accumulator = op;
+ }
bool needs_fixup = (variant == negate_variant)
|| (variant == add_variant);
--
2.43.0