[Bug target/100929] gcc fails to optimize less to min for SIMD code

denis.yaroshevskij at gmail dot com via Gcc-bugs Sun, 06 Jun 2021 15:29:50 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100929


--- Comment #3 from Denis Yaroshevskiy <denis.yaroshevskij at gmail dot com> ---
> Please attach your testcases to the bug report.

Is what @Andrew Pinski copied enough? I can also attach the same code as a file.

> I don't know if there would be issues for comparisons (with -ftrapping-math 
> for instance?).

With -ftrapping-math, clang stops doing this optimisation.

Without that flag clang does perform it, so I assume NaNs are OK in that case.
For ints it is certainly OK.
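
To make the NaN concern concrete, here is a minimal scalar sketch of the "compare greater-than, then select" form. The helper name `select_min` is hypothetical and only for illustration. It shows that the select form is not symmetric when one input is NaN, which is why the float case needs more care than the int case:

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: scalar equivalent of "compare a > b, then blend".
inline float select_min(float a, float b) {
    // Any ordered comparison involving NaN is false, so a is returned
    // whenever either operand is NaN.
    return a > b ? b : a;
}
```

With this form a NaN in the first operand is propagated, while a NaN in the second operand is dropped, so any min instruction used as a replacement has to treat NaN operands the same way, unless NaNs are assumed not to occur.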

> Note the other testcase is using eve which I have no idea what it is coming 
> from.

Using eve was just much easier than writing this with intrinsics.

The point was:

        vpcmpgtd        ymm2, ymm0, ymm1
        vpblendvb       ymm0, ymm0, ymm1, ymm2

should become

        vpminsd ymm0, ymm1, ymm0

And on arm:

        cmgt    v2.4s, v0.4s, v1.4s
        bit     v0.16b, v1.16b, v2.16b

should become

        smin    v0.4s, v1.4s, v0.4s

And

        fcmgt   v2.4s, v0.4s, v1.4s
        bit     v0.16b, v1.16b, v2.16b

should become

        fmin    v0.4s, v1.4s, v0.4s


I don't really know how this is done in `gcc`, but all these examples look like
the same issue. If it would help to have all of them written with intrinsics,
I can do that.
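
For reference, the integer pattern can also be sketched without eve. This is a minimal sketch that assumes the GCC/Clang vector extensions rather than the intrinsics eve uses, and the names `v4si` and `min_by_select` are hypothetical. It uses four 32-bit lanes, matching the `.4s` arm example above. It only illustrates the compare-and-blend shape; it makes no claim about which instructions a given compiler emits for it:

```cpp
#include <cstdint>

// Four 32-bit signed lanes (128 bits), via the GCC/Clang vector extension.
typedef std::int32_t v4si __attribute__((vector_size(16)));

// Compare-and-blend form of a lane-wise signed minimum.
inline v4si min_by_select(v4si a, v4si b) {
    v4si mask = a > b;                // each lane is -1 where a > b, else 0
    return (b & mask) | (a & ~mask);  // take b where a > b, otherwise a
}
```

For signed integers this computes the same result as a lane-wise signed minimum for every input, which is why replacing it with a single min instruction is always valid.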
