[Bug target/118151] New: Relax the SVE PTEST matching conditions for any/none (ne/eq)

rsandifo at gcc dot gnu.org via Gcc-bugs Fri, 20 Dec 2024 04:53:47 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118151


            Bug ID: 118151
           Summary: Relax the SVE PTEST matching conditions for any/none
                    (ne/eq)
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: aarch64-sve, missed-optimization
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rsandifo at gcc dot gnu.org
                CC: tnfchris at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64*-*-*

All our current PTEST combiner patterns are for the general CC_NZC case, where
the eventual condition could be first/not-first/last/not-last/any/none.  For
this general case, it's usually only possible to fold a PTEST with a previous
(potential) flag-setting instruction if both instructions have the same
governing predicate.

However, for the simple any/none (ne/eq) case, it's enough for the PTEST gp to
be a superset of the other instruction's gp.  In particular, we can always fold
if the PTEST is predicated on a PTRUE for the same element width or narrower. 
The failure to handle this case is causing us to miss many folds, both in ACLE
code and in early-break tests.
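
To illustrate the superset argument, here is a minimal sketch in plain C (not
ACLE, and not GCC code).  It assumes a toy 128-bit vector with one predicate
bit per byte lane; pred_t, spread_s and count_any_mismatches are hypothetical
names invented for this sketch.  It only models the any/none and first
conditions of PTEST.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model (assumption): a 128-bit vector, so 16 predicate bits,
   one per byte lane.  */
typedef uint16_t pred_t;

/* PTRUE patterns for each element width in this model.  */
#define PTRUE_B 0xFFFFu
#define PTRUE_H 0x5555u
#define PTRUE_S 0x1111u
#define PTRUE_D 0x0101u

/* "any": some element of P that is active under GP is true.  */
static bool ptest_any(pred_t gp, pred_t p) { return (gp & p) != 0; }

/* "first": the first element that is active under GP is true in P.  */
static bool ptest_first(pred_t gp, pred_t p) {
  pred_t first = (pred_t)(gp & -gp);   /* lowest set bit of GP, or 0 */
  return (first & p) != 0;
}

/* Spread 4 bits to the significant bit of each .s lane (bits 0, 4, 8, 12).  */
static pred_t spread_s(unsigned bits4) {
  pred_t r = 0;
  for (int i = 0; i < 4; i++)
    if (bits4 & (1u << i))
      r |= (pred_t)(1u << (4 * i));
  return r;
}

/* Count the (pg, raw compare result) pairs for which testing "any" under
   TEST_GP gives a different answer from testing "any" under the compare's
   own governing predicate.  Only the .s-significant bits of pg are
   modelled, since the others do not affect a .s compare.  */
static int count_any_mismatches(pred_t test_gp) {
  int mismatches = 0;
  for (unsigned pg4 = 0; pg4 < 16; pg4++)
    for (unsigned raw4 = 0; raw4 < 16; raw4++) {
      pred_t pg = spread_s(pg4);
      pred_t res = (pred_t)(pg & spread_s(raw4));  /* zeroing .s compare */
      if (ptest_any(test_gp, res) != ptest_any(pg, res))
        mismatches++;
    }
  return mismatches;
}
```

In this model, count_any_mismatches is zero for PTRUE_B, PTRUE_H and PTRUE_S,
because the compare result is always a subset of pg and pg's significant bits
are a subset of those patterns.  It is nonzero for PTRUE_D, which drops the
odd .s lanes.  The first condition does not have this property: with
pg == 0x0010 and a result of 0x0010, ptest_first gives true under pg but false
under PTRUE_B.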

I think it could be handled by using CC_Z for ne/eq and relaxing
aarch64_sve_same_pred_for_ptest_p for that case.  It might even be a relatively
simple change.

For example:

#include <arm_sve.h>

int
foo (svbool_t pg, svint32_t x, svint32_t y)
{
  return svptest_any(svptrue_b8(), svcmpeq(pg, x, y));
}

currently generates:

        ptrue   p3.b, all
        cmpeq   p0.s, p0/z, z0.s, z1.s
        ptest   p3, p0.b
        cset    w0, any
        ret

where the ptest and ptrue are redundant.  The same is true with svptrue_b8
replaced by svptrue_b16 or svptrue_b32, but not with svptrue_b64.  (LLVM
optimises the svptrue_b32 case, but not the others.)

We should try to make it so that two tests of the same result, such as
svptest_last and svptest_any, both still use the same PTEST, even if they
initially use different CC modes.
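
As a rough illustration of why sharing is possible, the following plain C
sketch models the N, Z and C flags that a single PTEST produces and derives
both the any and the last condition from them.  It uses the same toy model as
before (16 predicate bits, one per byte lane); struct nzc, ptest_flags,
cond_any and cond_last are hypothetical names for this sketch, not GCC
internals.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model (assumption): 16 predicate bits, one per byte lane.  */
typedef uint16_t pred_t;

/* The flags set by PTEST (V is always clear, so it is omitted).  */
struct nzc { bool n, z, c; };

/* Model of PTEST GP, P:
     N = the first active element is true
     Z = no active element is true
     C = the last active element is not true  */
static struct nzc ptest_flags(pred_t gp, pred_t p) {
  struct nzc f;
  pred_t first = (pred_t)(gp & -gp);   /* lowest active lane, or 0 */
  pred_t last = 0;                     /* highest active lane, or 0 */
  for (int i = 15; i >= 0; i--)
    if (gp & (1u << i)) {
      last = (pred_t)(1u << i);
      break;
    }
  f.n = (first & p) != 0;
  f.z = (gp & p) == 0;
  f.c = (last & p) == 0;
  return f;
}

/* Both conditions read the flags of the same PTEST.  */
static bool cond_any(struct nzc f)  { return !f.z; }  /* NE */
static bool cond_last(struct nzc f) { return !f.c; }  /* LO/CC */
```

A CC_Z user only looks at the z field and a CC_NZC user looks at all three,
so one set of flags can serve both when the governing predicate is one that
the stricter (CC_NZC) user accepts.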
