Hi Kelvin,

On Fri, Mar 08, 2019 at 10:59:35AM -0600, Kelvin Nilsen wrote:
> This problem report, though initially motivated by differences in behavior 
> between constant and non-constant selector arguments, uncovered a number of 
> inconsistencies in the implementation of vec_extract.
> 
> This patch provides several fixes to make handling of constant selector 
> expressions the same as the handling of non-constant selector expressions.  
> In the process of testing, it was observed that certain existing regression 
> tests were looking for the wrong instructions to be emitted and those tests 
> have been updated.
> 
> This has bootstrapped and tested without regressions on 
> powerpc64le-unknown-linux (both P8 and P9) and on powerpc-linux (P7 
> big-endian, with both -m32 and -m64 target options).

Thanks for the patch.  I have lots of comments/questions, mostly about the
testcases.  Sorry :-/  The actual compiler code part looks good though.


> --- gcc/testsuite/gcc.target/powerpc/vsx-builtin-10a.c        (revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/vsx-builtin-10a.c        (revision 0)
> @@ -0,0 +1,157 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */

Is there any reason to think this testcase will fail on Darwin?  I mean, it
requires VSX, and that should skip the testcase already on Darwin?

> +/* { dg-require-effective-target dfp_hw } */

Why is that?  I don't see any dfp used here?

> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { } } 
> */

You do not use -mcpu=, so why this dg-skip-if?  And please set
-mdejagnu-cpu= instead.

> --- gcc/testsuite/gcc.target/powerpc/fold-vec-extract-char.p8.c       
> (revision 269370)
> +++ gcc/testsuite/gcc.target/powerpc/fold-vec-extract-char.p8.c       
> (working copy)
> @@ -18,7 +18,7 @@
>  /* { dg-final { scan-assembler-times {\mxxpermdi\M} 3 { target lp64 } } } */
>  /* { dg-final { scan-assembler-times {\mvslo\M} 3 { target lp64 } } } */
>  /* { dg-final { scan-assembler-times {\mmfvsrd\M} 6 { target lp64 } } } */
> -/* { dg-final { scan-assembler-times {\msradi\M} 3 { target lp64 } } } */
> +/* { dg-final { scan-assembler-times {\msrdi\M} 3 { target lp64 } } } */

Maybe allow both?  So {\msra?di\M}?  Or does sradi make no sense here?

> --- gcc/testsuite/gcc.target/powerpc/vsx-builtin-14a.c        (revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/vsx-builtin-14a.c        (revision 0)
> @@ -0,0 +1,128 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */
> +/* { dg-require-effective-target dfp_hw } */
> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { } } 
> */
> +/* { dg-options "-maltivec" } */
> +
> +/* This test should run the same on any target that supports altivec/dfp
> +   instructions.  Intentionally not specifying cpu in order to test
> +   all code generation paths.  */

I don't see where dfp comes in?  For server CPUs it is p6 and up, the same
as AltiVec, but that is coincidental.

> --- gcc/testsuite/gcc.target/powerpc/pr87532-mc.c     (revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/pr87532-mc.c     (revision 0)
> @@ -0,0 +1,260 @@
> +/* { dg-do run { target { powerpc*-*-* && lp64 } } } */

Check for int128, instead?  Or is there another reason to want lp64?

> +  __asm__ (" # %x0" : "+wa" (a));

"wa" requires VSX.

> --- gcc/testsuite/gcc.target/powerpc/vsx-builtin-14b.c        (revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/vsx-builtin-14b.c        (revision 0)
> @@ -0,0 +1,128 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */

Btw, you can just say { dg-do run }: everything in gcc.target/powerpc is
only run for powerpc*-*-* targets.

> --- gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-v2b.c (revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-v2b.c (revision 0)
> @@ -0,0 +1,16 @@
> +/* { dg-do run { target { powerpc*-*-linux* } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */
> +/* { dg-require-effective-target vsx_hw } */
> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { } } 
> */
> +/* { dg-options "-O2 -mvsx -save-temps -dp -g" } */

I think that is debugging code left over?  -dp -g should be harmless, but
-save-temps is littering ;-)

> --- gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-v2.h  (revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-v2.h  (revision 0)

> +static vector TYPE
> +deoptimize (vector TYPE a)
> +{
> +  __asm__ (" # %x0" : "+wa" (a));
> +  return a;
> +}

Is this only ever compiled with VSX?  If not, use "v" instead?

> --- gcc/config/rs6000/rs6000-c.c      (revision 269370)
> +++ gcc/config/rs6000/rs6000-c.c      (working copy)
> @@ -6568,9 +6568,15 @@
>         /* If the second argument is an integer constant, if the value is in
>            the expected range, generate the built-in code if we can.  We need
>            64-bit and direct move to extract the small integer vectors.  */
> -       if (TREE_CODE (arg2) == INTEGER_CST
> -           && wi::ltu_p (wi::to_wide (arg2), nunits))
> +
> +       if (TREE_CODE (arg2) == INTEGER_CST)
>           {
> +           if (!wi::ltu_p (wi::to_wide (arg2), nunits))
> +             {
> +               wide_int selector = wi::to_wide (arg2);
> +               selector = wi::umod_trunc (selector, nunits);
> +               arg2 = wide_int_to_tree (TREE_TYPE (arg2), selector);
> +             }

You can just always do this, no need for the condition?  Makes it easier
to read.

> --- gcc/config/rs6000/rs6000.c        (revision 269370)
> +++ gcc/config/rs6000/rs6000.c        (working copy)
> @@ -6894,7 +6894,6 @@
>       default:
>         break;
>       case E_V1TImode:
> -       gcc_assert (INTVAL (elt) == 0 && inner_mode == TImode);

You could leave the inner_mode part?  Or is the assert here pretty
useless anyway?

> @@ -7173,7 +7190,9 @@
>  
>    else if (REG_P (src) || SUBREG_P (src))
>      {
> -      int bit_shift = byte_shift + 3;
> +      int num_elements = GET_MODE_NUNITS (mode);
> +      int bits_in_element = 128 / num_elements;

Use GET_MODE_INNER?  Or will that not help here.

> @@ -14724,8 +14743,18 @@
>    op1 = expand_normal (arg1);
>  
>    /* Call get_element_number to validate arg1 if it is a constant.  */
> +
>    if (TREE_CODE (arg1) == INTEGER_CST)
> -    (void) get_element_number (TREE_TYPE (arg0), arg1);
> +    {
> +      unsigned HOST_WIDE_INT elt;
> +      unsigned HOST_WIDE_INT size = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0));
> +      unsigned int truncated_selector;
> +      if (!tree_fits_uhwi_p (arg1))
> +     error ("selector must fit in HOST_WIDE_INT");

"error" is a user-visible error, and HOST_WIDE_INT is not something a user
knows.


Segher

Reply via email to