Re: [PATCH v2] rs6000: Test case adjustments for new builtins

2021-11-17 Thread Segher Boessenkool
On Wed, Nov 17, 2021 at 07:52:38AM -0600, Bill Schmidt wrote:
> >>  - For int_128bit-runnable.c, I chose not to do gimple folding on the 
> >> 128-bit
> >>comparison operations in the new implementation, because doing so 
> >> results in
> >>bad code that splits things into two 64-bit values.  That needs separate
> >>attention; but the point here is, when I did that, I started generating
> >>more of the vcmpequq, vcmpgtsq, and vcmpgtuq instructions.
> > And you now get worse code (albeit in some cases no longer invalid)?
> 
> No, sorry that this wasn't more clear.  The "old" builtins code performs
> gimple folding on 128-bit compares.  This results in correct but very
> inefficient code.  The "new" builtins code has removed the gimple folding
> for 128-bit compares.  This results in directly generating vcmpequq and
> friends, which is the efficient code we're looking for.  This test case
> then needs modification to show we're doing better.  I'll submit this
> separately.

Hrm.  Folding should always be a good thing to do; and folding should
never split an operation on a 128-bit datum into two operations on
64-bit things.  That kind of optimisation cannot be sanely done on
Gimple level: the abstractions are not close enough to the hardware for
that, and the instruction stream is not close at all to what the
eventual machine insns will be.  We have an RTL pass that does this
("subreg"), it runs almost immediately after expand (and two more
times, even again after the split pass).

So there is a generic bug that you counteract with a target bug :-(

> >> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
> >> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
> >> @@ -14,7 +14,7 @@ get_exponent (double *p)
> >>  {
> >>double source = *p;
> >>  
> >> -  return scalar_extract_exp (source); /* { dg-error 
> >> "'__builtin_vec_scalar_extract_exp' is not supported in this compiler 
> >> configuration" } */
> >> +  return scalar_extract_exp (source); /* { dg-error 
> >> "'__builtin_vsx_scalar_extract_exp' requires the" } */
> >>  }
> > The testcase uses __builtin_vec_scalar_extract_exp, so this is not okay.
> 
> Sorry, this is a case of my bad eyesight not identifying this had changed.
> As with the test case (cmpb-3.c) in the 32-bit patch, this error message
> isn't all that the user sees.  There is also a "note" diagnostic that ties
> the generic overload name to the specific underlying builtin name so that
> confusion is avoided.  I'll just submit these separately with a full
> explanation.

Can't you go just two inches further and report the actual builtin used
by the user (which even is documented!), and not cause any confusion?

> > It is not okay to blindly adjust the testcases to accept what the new
> > code does.  This is a regression.  It is okay to have it regressed for a
> > while.  It is also okay to xfail things, if there is no expectation it
> > can be fixed before the next release (or some other suitably big time
> > frame, this isn't an exact science).
> 
> This isn't really a regression, as I'll describe with each patch.

Looking forward to it :-)

> >> --- a/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c
> >> +++ b/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c
> >> @@ -10,5 +10,5 @@
> >>  int
> >>  test_byte_in_set (unsigned char b, unsigned long long set_members)
> >>  {
> >> -  return __builtin_byte_in_set (b, set_members); /* { dg-warning 
> >> "implicit declaration of function" } */
> >> +  return __builtin_byte_in_set (b, set_members); /* { dg-error 
> >> "'__builtin_scalar_byte_in_set' requires the" } */
> >>  }
> > Huh.  How can the old warning ever have fired?  Was the builtin not
> > declared on 32-bit before?  Ouch.
> 
> I'll remind myself what changed here, but yes, that's what it looks like --
> an inadvertent problem with the old logic for 32-bit.

In general it is better to always have all builtins (and other
interfaces) declared internally, so that you can give much better error
messages (and so that you get errors if there are conflicts, etc.)

There can be exceptions, but this is not a case like that :-)  (So your
change is great :-) )

> >> --- a/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
> >> +++ b/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
> >> @@ -10,6 +10,6 @@ main ()
> >>int mask;
> >>  
> >>/* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
> >> -  res = __builtin_crypto_vshasigmad (test, 1, 0xff); /* { dg-error 
> >> {argument 3 must be in the range \[0, 15\]} } */
> >> +  res = __builtin_crypto_vshasigmad (test, 1, 0xff); /* { dg-error 
> >> {argument 3 must be a 4-bit unsigned literal} } */
> >>return 0;
> >>  }
> > Hrm, make this say "must be a literal between 0 and 15, inclusive" like
> > the other errors?
> 
> The "n-bit unsigned literal" is the usual case.  I'll provide more explanation
> in the separate patch.

We should use the same formulation always.  I like the 

Re: [PATCH v2] rs6000: Test case adjustments for new builtins

2021-11-17 Thread Bill Schmidt via Gcc-patches


On 11/17/21 6:44 AM, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Nov 16, 2021 at 02:26:22PM -0600, Bill Schmidt wrote:
>> Hi!  I recently submitted [1] to make adjustments to test cases for the new 
>> builtins
>> support, mostly due to error messages changing for consistency.  Thanks for 
>> the
>> previous review.  I've reviewed the reasons for the changes and removed 
>> unrelated
>> changes as requested.
> And the results are?  This is much easier to write up, and to review, if
> you split the patch into pieces with one theme each.  If you do that
> right then most reviews will be rubber-stamping, and some might require
> some thought (and some may even get objections).  The way things are it
> is a puzzle hunt to review this.

Sorry!  I thought I was addressing the issues that came up last time.  I didn't
intend for this to be difficult.  I will break the patch up going forward.

>
>>  - For fold-vect-splat-floatdouble.c and fold-vec-splat-longlong.c, the 
>> existing
>>test cases have some bad tests in them (checking two bits when only one 
>> bit
>>is meaningful).  The new builtin support catches this but the old support 
>> did
>>not.  Removing those bad cases changes some of the scan-assembler-times 
>> expected
>>values.
> Do this is a separate patch then, independent of the series?  With this
> explanation in the commit message.  This is pre-approved.
OK, will do.
>
>>  - For int_128bit-runnable.c, I chose not to do gimple folding on the 128-bit
>>comparison operations in the new implementation, because doing so results 
>> in
>>bad code that splits things into two 64-bit values.  That needs separate
>>attention; but the point here is, when I did that, I started generating
>>more of the vcmpequq, vcmpgtsq, and vcmpgtuq instructions.
> And you now get worse code (albeit in some cases no longer invalid)?

No, sorry that this wasn't more clear.  The "old" builtins code performs
gimple folding on 128-bit compares.  This results in correct but very
inefficient code.  The "new" builtins code has removed the gimple folding
for 128-bit compares.  This results in directly generating vcmpequq and
friends, which is the efficient code we're looking for.  This test case
then needs modification to show we're doing better.  I'll submit this
separately.

>
>
>> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
>> @@ -14,7 +14,7 @@ get_exponent (double *p)
>>  {
>>double source = *p;
>>  
>> -  return scalar_extract_exp (source);   /* { dg-error 
>> "'__builtin_vec_scalar_extract_exp' is not supported in this compiler 
>> configuration" } */
>> +  return scalar_extract_exp (source);   /* { dg-error 
>> "'__builtin_vsx_scalar_extract_exp' requires the" } */
>>  }
> The testcase uses __builtin_vec_scalar_extract_exp, so this is not okay.

Sorry, this is a case of my bad eyesight not identifying this had changed.
As with the test case (cmpb-3.c) in the 32-bit patch, this error message
isn't all that the user sees.  There is also a "note" diagnostic that ties
the generic overload name to the specific underlying builtin name so that
confusion is avoided.  I'll just submit these separately with a full
explanation.

Same applies to the similar cases below.

>
>> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
>> @@ -12,5 +12,5 @@ get_significand (double *p)
>>  {
>>double source = *p;
>>  
>> -  return __builtin_vec_scalar_extract_sig (source); /* { dg-error 
>> "'__builtin_vec_scalar_extract_sig' is not supported in this compiler 
>> configuration" } */
>> +  return __builtin_vec_scalar_extract_sig (source); /* { dg-error 
>> "'__builtin_vsx_scalar_extract_sig' requires the" } */
>>  }
> This not either.
>
>> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c
>> @@ -16,5 +16,5 @@ insert_exponent (unsigned long long int *significand_p,
>>unsigned long long int significand = *significand_p;
>>unsigned long long int exponent = *exponent_p;
>>  
>> -  return scalar_insert_exp (significand, exponent); /* { dg-error 
>> "'__builtin_vec_scalar_insert_exp' is not supported in this compiler 
>> configuration" } */
>> +  return scalar_insert_exp (significand, exponent); /* { dg-error 
>> "'__builtin_vsx_scalar_insert_exp' requires the" } */
> Or this.
>
>> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c
>> @@ -16,5 +16,5 @@ insert_exponent (double *significand_p,
>>double significand = *significand_p;
>>unsigned long long int exponent = *exponent_p;
>>  
>> -  return scalar_insert_exp (significand, exponent); /* { dg-error 
>> "'__builtin_vec_scalar_insert_exp' is not supported in this compiler 
>> configuration" } */
>> +  

Re: [PATCH v2] rs6000: Test case adjustments for new builtins

2021-11-17 Thread Segher Boessenkool
Hi!

On Tue, Nov 16, 2021 at 02:26:22PM -0600, Bill Schmidt wrote:
> Hi!  I recently submitted [1] to make adjustments to test cases for the new 
> builtins
> support, mostly due to error messages changing for consistency.  Thanks for 
> the
> previous review.  I've reviewed the reasons for the changes and removed 
> unrelated
> changes as requested.

And the results are?  This is much easier to write up, and to review, if
you split the patch into pieces with one theme each.  If you do that
right then most reviews will be rubber-stamping, and some might require
some thought (and some may even get objections).  The way things are it
is a puzzle hunt to review this.

>  - For fold-vect-splat-floatdouble.c and fold-vec-splat-longlong.c, the 
> existing
>test cases have some bad tests in them (checking two bits when only one bit
>is meaningful).  The new builtin support catches this but the old support 
> did
>not.  Removing those bad cases changes some of the scan-assembler-times 
> expected
>values.

Do this is a separate patch then, independent of the series?  With this
explanation in the commit message.  This is pre-approved.

>  - For int_128bit-runnable.c, I chose not to do gimple folding on the 128-bit
>comparison operations in the new implementation, because doing so results 
> in
>bad code that splits things into two 64-bit values.  That needs separate
>attention; but the point here is, when I did that, I started generating
>more of the vcmpequq, vcmpgtsq, and vcmpgtuq instructions.

And you now get worse code (albeit in some cases no longer invalid)?


> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
> @@ -14,7 +14,7 @@ get_exponent (double *p)
>  {
>double source = *p;
>  
> -  return scalar_extract_exp (source);/* { dg-error 
> "'__builtin_vec_scalar_extract_exp' is not supported in this compiler 
> configuration" } */
> +  return scalar_extract_exp (source);/* { dg-error 
> "'__builtin_vsx_scalar_extract_exp' requires the" } */
>  }

The testcase uses __builtin_vec_scalar_extract_exp, so this is not okay.

> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
> @@ -12,5 +12,5 @@ get_significand (double *p)
>  {
>double source = *p;
>  
> -  return __builtin_vec_scalar_extract_sig (source); /* { dg-error 
> "'__builtin_vec_scalar_extract_sig' is not supported in this compiler 
> configuration" } */
> +  return __builtin_vec_scalar_extract_sig (source); /* { dg-error 
> "'__builtin_vsx_scalar_extract_sig' requires the" } */
>  }

This not either.

> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c
> @@ -16,5 +16,5 @@ insert_exponent (unsigned long long int *significand_p,
>unsigned long long int significand = *significand_p;
>unsigned long long int exponent = *exponent_p;
>  
> -  return scalar_insert_exp (significand, exponent); /* { dg-error 
> "'__builtin_vec_scalar_insert_exp' is not supported in this compiler 
> configuration" } */
> +  return scalar_insert_exp (significand, exponent); /* { dg-error 
> "'__builtin_vsx_scalar_insert_exp' requires the" } */

Or this.

> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c
> @@ -16,5 +16,5 @@ insert_exponent (double *significand_p,
>double significand = *significand_p;
>unsigned long long int exponent = *exponent_p;
>  
> -  return scalar_insert_exp (significand, exponent); /* { dg-error 
> "'__builtin_vec_scalar_insert_exp' is not supported in this compiler 
> configuration" } */
> +  return scalar_insert_exp (significand, exponent); /* { dg-error 
> "'__builtin_vsx_scalar_insert_exp_dp' requires the" } */
>  }

Etc.

It is not okay to blindly adjust the testcases to accept what the new
code does.  This is a regression.  It is okay to have it regressed for a
while.  It is also okay to xfail things, if there is no expectation it
can be fixed before the next release (or some other suitably big time
frame, this isn't an exact science).

> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-2.c
> @@ -10,5 +10,5 @@ test_neg (float *p)
>  {
>float source = *p;
>  
> -  return __builtin_vec_scalar_test_neg_sp (source); /* { dg-error 
> "'__builtin_vsx_scalar_test_neg_sp' requires" } */
> +  return __builtin_vec_scalar_test_neg (source); /* { dg-error 
> "'__builtin_vsx_scalar_test_neg_sp' requires" } */
>  }

This one is very curious.  You change the test to use a more generic
builtin name, presumably because the (undocumented) more specific name
is no longer allowed, but the error message still uses that name?

> --- a/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c
> +++ 

[PATCH v2] rs6000: Test case adjustments for new builtins

2021-11-16 Thread Bill Schmidt via Gcc-patches
Hi!  I recently submitted [1] to make adjustments to test cases for the new 
builtins
support, mostly due to error messages changing for consistency.  Thanks for the
previous review.  I've reviewed the reasons for the changes and removed 
unrelated
changes as requested.

A couple of comments:

 - For fold-vect-splat-floatdouble.c and fold-vec-splat-longlong.c, the existing
   test cases have some bad tests in them (checking two bits when only one bit
   is meaningful).  The new builtin support catches this but the old support did
   not.  Removing those bad cases changes some of the scan-assembler-times 
expected
   values.
 - For int_128bit-runnable.c, I chose not to do gimple folding on the 128-bit
   comparison operations in the new implementation, because doing so results in
   bad code that splits things into two 64-bit values.  That needs separate
   attention; but the point here is, when I did that, I started generating
   more of the vcmpequq, vcmpgtsq, and vcmpgtuq instructions.

Everything else here is hopefully straightforward, and unchanged from the 
previous
submission.

Bootstrapped and tested on powerpc64le-linux-gnu, and on powerpc64-linux-gnu 
with
-m32 and -m64.  Is this okay for trunk?

Thanks!
Bill

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578615.html


2021-11-15  Bill Schmidt  

gcc/testsuite/
* gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Adjust error
message.
* gcc.target/powerpc/bfp/scalar-extract-sig-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-8.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-neg-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-neg-3.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-neg-5.c: Likewise.
* gcc.target/powerpc/byte-in-set-2.c: Likewise.
* gcc.target/powerpc/cmpb-2.c: Likewise.
* gcc.target/powerpc/cmpb32-2.c: Likewise.
* gcc.target/powerpc/crypto-builtin-2.c: Likewise.
* gcc.target/powerpc/fold-vec-splat-floatdouble.c: Remove invalid
test and adjust xxpermdi count.
* gcc.target/powerpc/fold-vec-splat-longlong.c: Remove invalid
tests and adjust instruction counts.
* gcc.target/powerpc/fold-vec-splat-misc-invalid.c: Adjust error
messages.
* gcc.target/powerpc/int_128bit-runnable.c: Adjust instruction
counts since we do better by not gimple-folding some builtins.
* gcc.target/powerpc/pr80315-1.c: Adjust error message.
* gcc.target/powerpc/pr80315-2.c: Likewise.
* gcc.target/powerpc/pr80315-3.c: Likewise.
* gcc.target/powerpc/pr80315-4.c: Likewise.
* gcc.target/powerpc/pr88100.c: Likewise.
* gcc.target/powerpc/pragma_misc9.c: Likewise.
* gcc.target/powerpc/pragma_power8.c: Undef _RS6000_VECDEFINES_H.
* gcc.target/powerpc/pragma_power9.c: Likewise.
* gcc.target/powerpc/test_fpscr_drn_builtin_error.c: Adjust error
messages.
* gcc.target/powerpc/test_fpscr_rn_builtin_error.c: Likewise.
* gcc.target/powerpc/vec-gnb-2.c: Likewise.
* gcc.target/powerpc/vsu/vec-all-nez-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-any-eqz-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-cmpnez-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c: Likewise.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c: Likewise.
* gcc.target/powerpc/vsu/vec-xl-len-13.c: Likewise.
* gcc.target/powerpc/vsu/vec-xst-len-12.c: Likewise.
---
 .../gcc.target/powerpc/bfp/scalar-extract-exp-2.c  |  2 +-
 .../gcc.target/powerpc/bfp/scalar-extract-sig-2.c  |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-2.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-5.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-8.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-2.c |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-3.c |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-5.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb-2.c  |  2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb32-2.c|  2 +-
 .../gcc.target/powerpc/crypto-builtin-2.c  | 14 +++---
 .../powerpc/fold-vec-splat-floatdouble.c   |  4 ++--
 .../gcc.target/powerpc/fold-vec-splat-longlong.c   | 10 +++---
 .../powerpc/fold-vec-splat-misc-invalid.c  |  8 
 .../gcc.target/powerpc/int_128bit-runnable.c   |  6 +++---
 gcc/testsuite/gcc.target/powerpc/pr80315-1.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-2.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-3.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-4.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr88100.c | 12 ++--