[PATCH v3 0/6] rs6000: Support more SSE4 intrinsics

2021-08-23 Thread Paul A. Clarke via Gcc-patches
v3: Add "nmmintrin.h". _mm_cmpgt_epi64 is part of SSE4.2
and users will expect to be able to include "nmmintrin.h",
even though "nmmintrin.h" just includes "smmintrin.h"
where all of the SSE4.2 implementations actually appear.
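
For readers unfamiliar with the intrinsic named above, here is a minimal
scalar sketch of what _mm_cmpgt_epi64 computes: a signed greater-than
compare on each 64-bit lane, yielding all ones where it holds and zero
otherwise. The type v2di_model and the function cmpgt_epi64_model are
hypothetical names used only for illustration; they are not part of the
patch series or of any header, and this is not the implementation in
"smmintrin.h".

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for a 128-bit vector holding two signed
   64-bit lanes.  Illustration only, not the type used by the headers.  */
typedef struct { int64_t lane[2]; } v2di_model;

/* Per-lane signed compare: a lane becomes all ones (-1) when a > b,
   and zero otherwise, mirroring the documented result of
   _mm_cmpgt_epi64.  */
static v2di_model
cmpgt_epi64_model (v2di_model a, v2di_model b)
{
  v2di_model r;
  for (int i = 0; i < 2; i++)
    r.lane[i] = (a.lane[i] > b.lane[i]) ? -1 : 0;
  return r;
}
```

With inputs {5, -7} and {3, -2}, the first lane compares true and the
second compares false, since the compare is signed.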

Only patch 5/6 changed from v2.

Tested ppc64le (POWER9) and ppc64/32 (POWER7).

OK for trunk?

Paul A. Clarke (6):
  rs6000: Support SSE4.1 "round" intrinsics
  rs6000: Support SSE4.1 "min" and "max" intrinsics
  rs6000: Simplify some SSE4.1 "test" intrinsics
  rs6000: Support SSE4.1 "cvt" intrinsics
  rs6000: Support more SSE4 "cmp", "mul", "pack" intrinsics
  rs6000: Guard some x86 intrinsics implementations

 gcc/config/rs6000/emmintrin.h              |  12 +-
 gcc/config/rs6000/nmmintrin.h              |  40 ++
 gcc/config/rs6000/pmmintrin.h              |   4 +
 gcc/config/rs6000/smmintrin.h              | 427 --
 gcc/config/rs6000/tmmintrin.h              |  12 +
 gcc/testsuite/gcc.target/powerpc/pr78102.c |  23 +
 .../gcc.target/powerpc/sse4_1-packusdw.c   |  73 +++
 .../gcc.target/powerpc/sse4_1-pcmpeqq.c    |  46 ++
 .../gcc.target/powerpc/sse4_1-pmaxsb.c     |  46 ++
 .../gcc.target/powerpc/sse4_1-pmaxsd.c     |  46 ++
 .../gcc.target/powerpc/sse4_1-pmaxud.c     |  47 ++
 .../gcc.target/powerpc/sse4_1-pmaxuw.c     |  47 ++
 .../gcc.target/powerpc/sse4_1-pminsb.c     |  46 ++
 .../gcc.target/powerpc/sse4_1-pminsd.c     |  46 ++
 .../gcc.target/powerpc/sse4_1-pminud.c     |  47 ++
 .../gcc.target/powerpc/sse4_1-pminuw.c     |  47 ++
 .../gcc.target/powerpc/sse4_1-pmovsxbd.c   |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxbq.c   |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxbw.c   |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxdq.c   |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxwd.c   |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxwq.c   |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovzxbd.c   |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxbq.c   |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxbw.c   |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxdq.c   |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxwd.c   |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxwq.c   |  43 ++
 .../gcc.target/powerpc/sse4_1-pmuldq.c     |  51 +++
 .../gcc.target/powerpc/sse4_1-pmulld.c     |  46 ++
 .../gcc.target/powerpc/sse4_1-round3.h     |  81
 .../gcc.target/powerpc/sse4_1-roundpd.c    | 143 ++
 .../gcc.target/powerpc/sse4_1-roundps.c    |  98
 .../gcc.target/powerpc/sse4_1-roundsd.c    | 256 +++
 .../gcc.target/powerpc/sse4_1-roundss.c    | 208 +
 .../gcc.target/powerpc/sse4_2-check.h      |  18 +
 .../gcc.target/powerpc/sse4_2-pcmpgtq.c    |  46 ++
 37 files changed, 2407 insertions(+), 59 deletions(-)
 create mode 100644 gcc/config/rs6000/nmmintrin.h
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr78102.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-packusdw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pcmpeqq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxsb.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxsd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxud.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxuw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pminsb.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pminsd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pminud.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pminuw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxbd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxbq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxbw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxdq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxwd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxwq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxbd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxbq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxbw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxdq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxwd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxwq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmuldq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmulld.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-round3.h
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundpd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundps.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundsd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundss.c
 create mode 100644 gcc/testsuite/gcc.target/

Re: [PATCH v3 0/6] rs6000: Support more SSE4 intrinsics

2021-09-16 Thread Paul A. Clarke via Gcc-patches
Ping.


Re: [PATCH v3 0/6] rs6000: Support more SSE4 intrinsics

2021-10-04 Thread Paul A. Clarke via Gcc-patches
Ping.


Re: [PATCH v3 0/6] rs6000: Support more SSE4 intrinsics

2021-10-07 Thread Segher Boessenkool
Hi!

On Mon, Aug 23, 2021 at 02:03:04PM -0500, Paul A. Clarke wrote:
> v3: Add "nmmintrin.h". _mm_cmpgt_epi64 is part of SSE4.2

There should not be a "v3" in the commit message.  The easy way to
achieve this is to put it inside the [] in the subject (as you did), and to
mention the version history after a --- (see --notes for git-format-patch
for example).

> Tested ppc64le (POWER9) and ppc64/32 (POWER7).

Please write the full triples -- well at least enough that they are
usable, like, powerpc64-linux.  I'll assume you tested on Linux :-)


Segher


Re: [PATCH v3 0/6] rs6000: Support more SSE4 intrinsics

2021-10-07 Thread Paul A. Clarke via Gcc-patches
On Thu, Oct 07, 2021 at 05:25:54PM -0500, Segher Boessenkool wrote:
> On Mon, Aug 23, 2021 at 02:03:04PM -0500, Paul A. Clarke wrote:
> > v3: Add "nmmintrin.h". _mm_cmpgt_epi64 is part of SSE4.2
> 
> There should not be a "v3" in the commit message.  The easy way to
> achieve this is to put it inside the [] in the subject (as you did), and to
> mention the version history after a --- (see --notes for git-format-patch
> for example).

This is just a cover letter. Does it matter in that context?
(I have done as described in the patches which followed.)

> > Tested ppc64le (POWER9) and ppc64/32 (POWER7).
> 
> Please write the full triples -- well at least enough that they are
> usable, like, powerpc64-linux.  I'll assume you tested on Linux :-)

Yes, sorry.  All are "-linux", and I'll try to remember that for next time.

PC


Re: [PATCH v3 0/6] rs6000: Support more SSE4 intrinsics

2021-10-11 Thread Segher Boessenkool
On Thu, Oct 07, 2021 at 07:29:26PM -0500, Paul A. Clarke wrote:
> On Thu, Oct 07, 2021 at 05:25:54PM -0500, Segher Boessenkool wrote:
> > On Mon, Aug 23, 2021 at 02:03:04PM -0500, Paul A. Clarke wrote:
> > > v3: Add "nmmintrin.h". _mm_cmpgt_epi64 is part of SSE4.2
> > 
> > There should not be a "v3" in the commit message.  The easy way to
> > achieve this is to put it inside the [] in the subject (as you did), and to
> > mention the version history after a --- (see --notes for git-format-patch
> > for example).
> 
> This is just a cover letter. Does it matter in that context?

Ha no, it just confused me apparently :-)


Segher