Hi,

This patch adds a test case, similar to the one for i386, to add test coverage for the 510.parest_r hotspots.

As evaluated, the emulated gather capability of the vectorizer (r12-2733) can help to speed up SPEC2017 510.parest_r on Power8/9/10 by 5% to 9% with the option sets "Ofast unroll" and "Ofast lto". But since rs6000 lacked unpack support for unsigned int before r12-3134, the hotspots could not be vectorized until that revision.

While checking why r12-2733 did not immediately show its impact on SPEC2017 510.parest_r, even though the associated test case could already be vectorized on rs6000 at that time, I realized that the associated test case uses int as INDEXTYPE while the hotspots actually use unsigned int. So, unlike the one for i386, this patch uses unsigned int as INDEXTYPE, since the unpack support for unsigned int (r12-3134) also matters for vectorizing the hotspots. Not sure if it's worth updating the one for i386 as well?

Tested on powerpc64le-linux-gnu P9 and powerpc64-linux-gnu P8.

Is it ok for trunk?

BR,
Kewen
-----
gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/vect-gather-1.c: New test.

diff --git a/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
new file mode 100644
index 00000000000..bf98045ab03
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* Profitable from Power8 since it supports efficient unaligned load. */
+/* { dg-options "-Ofast -mdejagnu-cpu=power8 -fdump-tree-vect-details -fdump-tree-forwprop4" } */
+
+#ifndef INDEXTYPE
+#define INDEXTYPE unsigned int
+#endif
+double vmul(INDEXTYPE *rowstart, INDEXTYPE *rowend,
+            double *luval, double *dst)
+{
+  double res = 0;
+  for (const INDEXTYPE * col = rowstart; col != rowend; ++col, ++luval)
+    res += *luval * dst[*col];
+  return res;
+}
+
+/* With gather emulation this should be profitable to vectorize from Power8. */
+/* { dg-final { scan-tree-dump "loop vectorized" "vect" } } */
+/* The index vector loads and promotions should be scalar after forwprop. */
+/* { dg-final { scan-tree-dump-not "vec_unpack" "forwprop4" } } */
-- 
2.25.1
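
For reference only (not part of the patch): a minimal hand-written sketch of what "emulated gather" means for this kernel, in case it helps when reading the test. It is an illustration under stated assumptions, not the code the vectorizer actually generates. The names vmul_scalar and vmul_emulated_gather are hypothetical, and the two-lane width is just an assumption chosen to match two doubles in a 128-bit vector. The point it shows is that each unsigned int index is loaded and zero-extended as a scalar, the dst elements are fetched with scalar loads, and only the multiply-accumulate is done per lane.

```c
#include <stddef.h>

#define INDEXTYPE unsigned int

/* Scalar reference: the same loop as vmul in the test case.  */
double vmul_scalar(const INDEXTYPE *rowstart, const INDEXTYPE *rowend,
                   const double *luval, const double *dst)
{
  double res = 0;
  for (const INDEXTYPE *col = rowstart; col != rowend; ++col, ++luval)
    res += *luval * dst[*col];
  return res;
}

/* Hypothetical two-lane illustration of an emulated gather.
   This is a sketch, not the vectorizer's real output.  */
double vmul_emulated_gather(const INDEXTYPE *rowstart,
                            const INDEXTYPE *rowend,
                            const double *luval, const double *dst)
{
  double acc[2] = { 0.0, 0.0 };   /* stands in for a 2 x double vector */
  size_t n = (size_t) (rowend - rowstart);
  size_t i = 0;

  for (; i + 2 <= n; i += 2)
    {
      /* Scalar index loads; unsigned int is zero-extended to size_t.  */
      size_t i0 = rowstart[i];
      size_t i1 = rowstart[i + 1];

      /* The "gather": two scalar loads assembled into one vector.  */
      double gathered[2] = { dst[i0], dst[i1] };

      /* Per-lane multiply-accumulate with contiguous luval elements.  */
      acc[0] += luval[i] * gathered[0];
      acc[1] += luval[i + 1] * gathered[1];
    }

  /* Reduce the lanes, then handle a possible leftover element.  */
  double res = acc[0] + acc[1];
  for (; i < n; ++i)
    res += luval[i] * dst[rowstart[i]];
  return res;
}
```

Note that the lane-wise accumulation reorders the floating-point additions compared with the scalar loop, which is why the test relies on -Ofast; with inputs that are not exactly representable the two functions may differ in the last bits.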