https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106326

            Bug ID: 106326
           Summary: _m and _z versions of SVE intrinsics not optimized to
                    predicate-free version
           Product: gcc
           Version: 12.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: yyc1992 at gmail dot com
  Target Milestone: ---

The following code should generate a predicate-free fadd instruction, since
every lane of the predicate is true.

```
#include <arm_sve.h>

svfloat64_t test(svfloat64_t a, svfloat64_t b)
{
    return svadd_m(svptrue_b64(), a, b);
}
```

but gcc instead materializes an all-true predicate and uses it for a predicated add, i.e.

```
        ptrue   p0.b, all
        fadd    z0.d, p0/m, z0.d, z1.d
```

The same happens for the `_z` version, where the generated code is even worse because of an extra `movprfx`.

```
        ptrue   p0.b, all
        movprfx z0.d, p0/z, z0.d
        fadd    z0.d, p0/m, z0.d, z1.d
```

This optimization is only done for the `_x` variant. Clang optimizes this for
all variants.
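
To spell out why dropping the predicate should be safe, here is a rough scalar sketch of the three predication modes. The `model_*` helpers and `check_all_true_equivalence` are hypothetical names made up for illustration, not part of the ACLE or of gcc, and the model assumes the usual reading of the modes: `_m` keeps the first operand in inactive lanes, `_z` zeroes them. With no inactive lanes, both should reduce to a plain add.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar model, not the ACLE API: each helper mimics one
   predication mode of a floating-point add over n lanes. */

/* _m (merging): inactive lanes keep the value of the first operand. */
static void model_add_m(const bool *pg, const double *a, const double *b,
                        double *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = pg[i] ? a[i] + b[i] : a[i];
}

/* _z (zeroing): inactive lanes are set to zero. */
static void model_add_z(const bool *pg, const double *a, const double *b,
                        double *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = pg[i] ? a[i] + b[i] : 0.0;
}

/* Predicate-free add: every lane is computed unconditionally. */
static void model_add_unpred(const double *a, const double *b,
                             double *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* Asserts that, under an all-true predicate, the merging and zeroing
   models give the same lanes as the predicate-free add. Returns 1 if
   every assertion held. */
static int check_all_true_equivalence(void)
{
    enum { N = 4 };
    const bool pg[N] = { true, true, true, true };
    const double a[N] = { 1.0, 2.5, -3.0, 0.0 };
    const double b[N] = { 4.0, -2.5, 0.5, 8.0 };
    double m[N], z[N], u[N];

    model_add_m(pg, a, b, m, N);
    model_add_z(pg, a, b, z, N);
    model_add_unpred(a, b, u, N);

    for (size_t i = 0; i < N; i++) {
        assert(m[i] == u[i]);
        assert(z[i] == u[i]);
    }
    return 1;
}
```

This only models lane selection on the host; it says nothing about how gcc represents the intrinsics internally.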
