https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120396
Bug ID: 120396
Summary: unprofitable SLP vectorization, leaves scalar parts live
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: amonakov at gcc dot gnu.org
Target Milestone: ---
A variant of PR 109892.

static double muladd(double x, double y, double z)
{
    return x * y + z;
}

double g(double x[], long n)
{
    double r0 = 0, r1 = 0;
    for (; n; x += 2, n--) {
        r0 = muladd(x[0], x[0], r0);
        r1 = muladd(x[1], x[1], r1);
        x[0] = r0;
        x[1] = r1;
    }
    return r0 + r1;
}

The SLP-vectorized loop at -O2 -mfma (or plain -O2 on AArch64) does strictly
more work than a scalar loop.
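
For anyone reproducing this, a minimal sketch of a correctness check that may help when comparing code generation variants. g_ref is a hypothetical helper, not part of the original testcase: it computes the same two accumulator chains in separate passes, so its result should match g regardless of how g is vectorized. The inputs suggested below are small integers so that results are exact with or without FMA contraction.

```c
/* Testcase from the report, behavior unchanged. */
static double muladd(double x, double y, double z)
{
    return x * y + z;
}

double g(double x[], long n)
{
    double r0 = 0, r1 = 0;
    for (; n; x += 2, n--) {
        r0 = muladd(x[0], x[0], r0);
        r1 = muladd(x[1], x[1], r1);
        x[0] = r0;
        x[1] = r1;
    }
    return r0 + r1;
}

/* Hypothetical reference (an assumption for illustration only):
   the two recurrences are independent, so each chain can be run
   in its own pass over the even or odd elements. */
double g_ref(double x[], long n)
{
    double r0 = 0, r1 = 0;
    for (long i = 0; i < n; i++) {
        r0 = muladd(x[2 * i], x[2 * i], r0);
        x[2 * i] = r0;
    }
    for (long i = 0; i < n; i++) {
        r1 = muladd(x[2 * i + 1], x[2 * i + 1], r1);
        x[2 * i + 1] = r1;
    }
    return r0 + r1;
}
```

Calling both functions on copies of { 1, 2, 3, 4 } with n = 2 should leave both arrays identical and return the same sum. To look at the generated code, compile with -O2 -mfma -S, and add -fno-tree-slp-vectorize to get the scalar loop for comparison.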