Re: [PATCH 6/13] rs6000, add overloaded vec_sel with int128 arguments

2024-05-24 Thread Carl Love
Kewen:

On 5/21/24 20:05, Kewen.Lin wrote:
> Hi Carl,
> 
> on 2024/5/22 08:13, Carl Love wrote:
>> Kewen:



>>> Why did you place this in a section for ISA 3.1 (Power10)?  It doesn't 
>>> really
>>> require this support.  The used instance VSEL_1TI and VSEL_1TI_UNS are 
>>> placed
>>> in altivec stanza, so it looks that we should put it under the section
>>> "PowerPC AltiVec Built-in Functions on ISA 2.05".  And since it's an 
>>> extension
>>> of @code{vec_sel} documented in the PVIPR, I prefer to just mention it's "an
>>> extension of the @code{vec_sel} built-in documented in the PVIPR" and 
>>> omitting
>>> the description to avoid possible slightly different wording.
>>
>> Honestly, at this point in time I don't remember why I put it there.  It has 
>> been too long since I created the patch.  That said, the test case requires 
>> Power 10 do to the comparison check using built-in vec_all_eq but that is 
>> another issue.  
>> The built-in generates the xxsel instruction that is an ISA 2.06 
>> instruction.  So, I would say it should to into the ISA 2.06 section.  I 
>> moved it to the ISA 2.06 section.
> 
> But the underlying implementation is:
> 
>   const vsq __builtin_altivec_vsel_1ti (vsq, vsq, vuq);
> VSEL_1TI vector_select_v1ti {}
> 
>   const vuq __builtin_altivec_vsel_1ti_uns (vuq, vuq, vuq);
> VSEL_1TI_UNS vector_select_v1ti_uns {}
> 
> , it's under altivec stanza and can result with insn vsel (so not xxsel),
> vsel is ISA 2.03, so I think ISA 2.05 better matches the implementation.

OK, moved to ISA 2.05

> 



>>
>> Sounds like there was some issue that you noticed on 
>> r14-10011-g6e62ede7aaccc6.  The new version of
>> print_i128 should be functionally equivalent but perhaps is "safer"?
> 
> Thanks for checking!  Looking into this more closely, I realized you didn't 
> apply the previously
> adopted way for printing (the way used in 
> gcc.target/powerpc/builtins-6-p9-runnable.c), sorry for
> the false alarm!  So your supposed print_i128 is fine to me.

OK, no problem.  Will go with the original print_i128 function.

Carl 


Re: [PATCH 6/13] rs6000, add overloaded vec_sel with int128 arguments

2024-05-21 Thread Kewen.Lin
Hi Carl,

on 2024/5/22 08:13, Carl Love wrote:
> Kewen:
> 
> On 5/13/24 19:54, Kewen.Lin wrote:
>> Hi,
>>
>> on 2024/4/20 05:17, Carl Love wrote:
>>> rs6000, add overloaded vec_sel with int128 arguments
>>>
>>> Extend the vec_sel built-in to take three signed/unsigned int128 arguments
>>> and return a signed/unsigned int128 result.
>>>
>>> Extending the vec_sel built-in makes the existing buit-ins
>>> __builtin_vsx_xxsel_1ti and __builtin_vsx_xxsel_1ti_uns obsolete.  The
>>> patch removes these built-ins.
>>>
>>> The patch adds documentation and test cases for the new overloaded vec_sel
>>> built-ins.
>>>
>>> gcc/ChangeLog:
>>> * config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_1ti,
>>> __builtin_vsx_xxsel_1ti_uns): Remove built-in definitions.
>>> * config/rs6000/rs6000-overload.def (vec_sel): Add new overloaded
>>> definitions.
>>> * doc/extend.texi: Add documentation for new vec_sel arguments.
>>>
>>> gcc/testsuite/ChangeLog:
>>> * gcc.target/powerpc/vec_sel_runnable-int128.c: New test file.
>>> ---
>>>  gcc/config/rs6000/rs6000-builtins.def |  6 --
>>>  gcc/config/rs6000/rs6000-overload.def |  4 +
>>>  gcc/doc/extend.texi   | 14 
>>>  .../powerpc/vec-sel-runnable-i128.c   | 84 +++
>>>  4 files changed, 102 insertions(+), 6 deletions(-)
>>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
>>>
>>> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
>>> b/gcc/config/rs6000/rs6000-builtins.def
>>> index d09e21a9151..46d2ae7b7cb 100644
>>> --- a/gcc/config/rs6000/rs6000-builtins.def
>>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>>> @@ -1931,12 +1931,6 @@
>>>const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
>>>  XXSEL_16QI_UNS vector_select_v16qi_uns {}
>>>  
>>> -  const vsq __builtin_vsx_xxsel_1ti (vsq, vsq, vsq);
>>> -XXSEL_1TI vector_select_v1ti {}
>>> -
>>> -  const vsq __builtin_vsx_xxsel_1ti_uns (vsq, vsq, vsq);
>>> -XXSEL_1TI_UNS vector_select_v1ti_uns {}
>>> -
>>>const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
>>>  XXSEL_2DF vector_select_v2df {}
>>>  
>>> diff --git a/gcc/config/rs6000/rs6000-overload.def 
>>> b/gcc/config/rs6000/rs6000-overload.def
>>> index 68501c05289..5912c9452f4 100644
>>> --- a/gcc/config/rs6000/rs6000-overload.def
>>> +++ b/gcc/config/rs6000/rs6000-overload.def
>>> @@ -3274,6 +3274,10 @@
>>>  VSEL_2DF  VSEL_2DF_B
>>>vd __builtin_vec_sel (vd, vd, vull);
>>>  VSEL_2DF  VSEL_2DF_U
>>> +  vsq __builtin_vec_sel (vsq, vsq, vsq);
>>> +VSEL_1TI  VSEL_1TI_S
>>> +  vuq __builtin_vec_sel (vuq, vuq, vuq);
>>> +VSEL_1TI_UNS  VSEL_1TI_U
>>>  ; The following variants are deprecated.
>>>vsll __builtin_vec_sel (vsll, vsll, vsll);
>>>  VSEL_2DI_B  VSEL_2DI_S
>>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>>> index 64a43b55e2d..86b8e536dbe 100644
>>> --- a/gcc/doc/extend.texi
>>> +++ b/gcc/doc/extend.texi
>>> @@ -23358,6 +23358,20 @@ The programmer is responsible for understanding 
>>> the endianness issues involved
>>>  with the first argument and the result.
>>>  @findex vec_replace_unaligned
>>>  
>>> +Vector select
>>> +
>>> +@smallexample
>>> +vector signed __int128 vec_sel (vector signed __int128,
>>> +   vector signed __int128, vector signed __int128);
>>> +vector unsigned __int128 vec_sel (vector unsigned __int128,
>>> +   vector unsigned __int128, vector unsigned __int128);
>>> +@end smallexample
>>> +
>>> +The overloaded built-in @code{vec_sel} with vector signed/unsigned __int128
>>> +arguments and returns a vector selecting bits from the two source vectors 
>>> based
>>> +on the values of the third input vector.  This built-in is an extension of 
>>> the
>>> +@code{vec_sel} built-in documented in the PVIPR.
>>> +
>>
>> Why did you place this in a section for ISA 3.1 (Power10)?  It doesn't really
>> require this support.  The used instance VSEL_1TI and VSEL_1TI_UNS are placed
>> in altivec stanza, so it looks that we should put it under the section
>> "PowerPC AltiVec Built-in Functions on ISA 2.05".  And since it's an 
>> extension
>> of @code{vec_sel} documented in the PVIPR, I prefer to just mention it's "an
>> extension of the @code{vec_sel} built-in documented in the PVIPR" and 
>> omitting
>> the description to avoid possible slightly different wording.
> 
> Honestly, at this point in time I don't remember why I put it there.  It has 
> been too long since I created the patch.  That said, the test case requires 
> Power 10 do to the comparison check using built-in vec_all_eq but that is 
> another issue.  
> The built-in generates the xxsel instruction that is an ISA 2.06 instruction. 
>  So, I would say it should to into the ISA 2.06 section.  I moved it to the 
> ISA 2.06 section.

But the underlying implementation is:

  const vsq __builtin_altivec_vsel_1ti (vsq, vsq, vuq);
VSEL_1TI vector_select_v1ti {}

  const vuq __builtin_altivec_vsel_1ti_uns 

Re: [PATCH 6/13] rs6000, add overloaded vec_sel with int128 arguments

2024-05-21 Thread Carl Love
Kewen:

On 5/13/24 19:54, Kewen.Lin wrote:
> Hi,
> 
> on 2024/4/20 05:17, Carl Love wrote:
>> rs6000, add overloaded vec_sel with int128 arguments
>>
>> Extend the vec_sel built-in to take three signed/unsigned int128 arguments
>> and return a signed/unsigned int128 result.
>>
>> Extending the vec_sel built-in makes the existing buit-ins
>> __builtin_vsx_xxsel_1ti and __builtin_vsx_xxsel_1ti_uns obsolete.  The
>> patch removes these built-ins.
>>
>> The patch adds documentation and test cases for the new overloaded vec_sel
>> built-ins.
>>
>> gcc/ChangeLog:
>>  * config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_1ti,
>>  __builtin_vsx_xxsel_1ti_uns): Remove built-in definitions.
>>  * config/rs6000/rs6000-overload.def (vec_sel): Add new overloaded
>>  definitions.
>>  * doc/extend.texi: Add documentation for new vec_sel arguments.
>>
>> gcc/testsuite/ChangeLog:
>>  * gcc.target/powerpc/vec_sel_runnable-int128.c: New test file.
>> ---
>>  gcc/config/rs6000/rs6000-builtins.def |  6 --
>>  gcc/config/rs6000/rs6000-overload.def |  4 +
>>  gcc/doc/extend.texi   | 14 
>>  .../powerpc/vec-sel-runnable-i128.c   | 84 +++
>>  4 files changed, 102 insertions(+), 6 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
>>
>> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
>> b/gcc/config/rs6000/rs6000-builtins.def
>> index d09e21a9151..46d2ae7b7cb 100644
>> --- a/gcc/config/rs6000/rs6000-builtins.def
>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>> @@ -1931,12 +1931,6 @@
>>const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
>>  XXSEL_16QI_UNS vector_select_v16qi_uns {}
>>  
>> -  const vsq __builtin_vsx_xxsel_1ti (vsq, vsq, vsq);
>> -XXSEL_1TI vector_select_v1ti {}
>> -
>> -  const vsq __builtin_vsx_xxsel_1ti_uns (vsq, vsq, vsq);
>> -XXSEL_1TI_UNS vector_select_v1ti_uns {}
>> -
>>const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
>>  XXSEL_2DF vector_select_v2df {}
>>  
>> diff --git a/gcc/config/rs6000/rs6000-overload.def 
>> b/gcc/config/rs6000/rs6000-overload.def
>> index 68501c05289..5912c9452f4 100644
>> --- a/gcc/config/rs6000/rs6000-overload.def
>> +++ b/gcc/config/rs6000/rs6000-overload.def
>> @@ -3274,6 +3274,10 @@
>>  VSEL_2DF  VSEL_2DF_B
>>vd __builtin_vec_sel (vd, vd, vull);
>>  VSEL_2DF  VSEL_2DF_U
>> +  vsq __builtin_vec_sel (vsq, vsq, vsq);
>> +VSEL_1TI  VSEL_1TI_S
>> +  vuq __builtin_vec_sel (vuq, vuq, vuq);
>> +VSEL_1TI_UNS  VSEL_1TI_U
>>  ; The following variants are deprecated.
>>vsll __builtin_vec_sel (vsll, vsll, vsll);
>>  VSEL_2DI_B  VSEL_2DI_S
>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> index 64a43b55e2d..86b8e536dbe 100644
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -23358,6 +23358,20 @@ The programmer is responsible for understanding the 
>> endianness issues involved
>>  with the first argument and the result.
>>  @findex vec_replace_unaligned
>>  
>> +Vector select
>> +
>> +@smallexample
>> +vector signed __int128 vec_sel (vector signed __int128,
>> +   vector signed __int128, vector signed __int128);
>> +vector unsigned __int128 vec_sel (vector unsigned __int128,
>> +   vector unsigned __int128, vector unsigned __int128);
>> +@end smallexample
>> +
>> +The overloaded built-in @code{vec_sel} with vector signed/unsigned __int128
>> +arguments and returns a vector selecting bits from the two source vectors 
>> based
>> +on the values of the third input vector.  This built-in is an extension of 
>> the
>> +@code{vec_sel} built-in documented in the PVIPR.
>> +
> 
> Why did you place this in a section for ISA 3.1 (Power10)?  It doesn't really
> require this support.  The used instance VSEL_1TI and VSEL_1TI_UNS are placed
> in altivec stanza, so it looks that we should put it under the section
> "PowerPC AltiVec Built-in Functions on ISA 2.05".  And since it's an extension
> of @code{vec_sel} documented in the PVIPR, I prefer to just mention it's "an
> extension of the @code{vec_sel} built-in documented in the PVIPR" and omitting
> the description to avoid possible slightly different wording.

Honestly, at this point in time I don't remember why I put it there.  It has 
been too long since I created the patch.  That said, the test case requires 
Power 10 do to the comparison check using built-in vec_all_eq but that is 
another issue.  The built-in generates the xxsel instruction that is an ISA 
2.06 instruction.  So, I would say it should to into the ISA 2.06 section.  I 
moved it to the ISA 2.06 section.

For consistency with the previous patches/feedback, the descriptions are being 
dropped and replaced with the instance being a new extension of the built-in 
that is documented in the PVIPR.
> 
>>  Vector Shift Left Double Bit Immediate
>>  @smallexample
>>  @exdent vector signed char vec_sldb (vector signed char, vector signed char,
>> diff 

Re: [PATCH 6/13] rs6000, add overloaded vec_sel with int128 arguments

2024-05-13 Thread Kewen.Lin
Hi,

on 2024/4/20 05:17, Carl Love wrote:
> rs6000, add overloaded vec_sel with int128 arguments
> 
> Extend the vec_sel built-in to take three signed/unsigned int128 arguments
> and return a signed/unsigned int128 result.
> 
> Extending the vec_sel built-in makes the existing buit-ins
> __builtin_vsx_xxsel_1ti and __builtin_vsx_xxsel_1ti_uns obsolete.  The
> patch removes these built-ins.
> 
> The patch adds documentation and test cases for the new overloaded vec_sel
> built-ins.
> 
> gcc/ChangeLog:
>   * config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_1ti,
>   __builtin_vsx_xxsel_1ti_uns): Remove built-in definitions.
>   * config/rs6000/rs6000-overload.def (vec_sel): Add new overloaded
>   definitions.
>   * doc/extend.texi: Add documentation for new vec_sel arguments.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.target/powerpc/vec_sel_runnable-int128.c: New test file.
> ---
>  gcc/config/rs6000/rs6000-builtins.def |  6 --
>  gcc/config/rs6000/rs6000-overload.def |  4 +
>  gcc/doc/extend.texi   | 14 
>  .../powerpc/vec-sel-runnable-i128.c   | 84 +++
>  4 files changed, 102 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
> b/gcc/config/rs6000/rs6000-builtins.def
> index d09e21a9151..46d2ae7b7cb 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1931,12 +1931,6 @@
>const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
>  XXSEL_16QI_UNS vector_select_v16qi_uns {}
>  
> -  const vsq __builtin_vsx_xxsel_1ti (vsq, vsq, vsq);
> -XXSEL_1TI vector_select_v1ti {}
> -
> -  const vsq __builtin_vsx_xxsel_1ti_uns (vsq, vsq, vsq);
> -XXSEL_1TI_UNS vector_select_v1ti_uns {}
> -
>const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
>  XXSEL_2DF vector_select_v2df {}
>  
> diff --git a/gcc/config/rs6000/rs6000-overload.def 
> b/gcc/config/rs6000/rs6000-overload.def
> index 68501c05289..5912c9452f4 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -3274,6 +3274,10 @@
>  VSEL_2DF  VSEL_2DF_B
>vd __builtin_vec_sel (vd, vd, vull);
>  VSEL_2DF  VSEL_2DF_U
> +  vsq __builtin_vec_sel (vsq, vsq, vsq);
> +VSEL_1TI  VSEL_1TI_S
> +  vuq __builtin_vec_sel (vuq, vuq, vuq);
> +VSEL_1TI_UNS  VSEL_1TI_U
>  ; The following variants are deprecated.
>vsll __builtin_vec_sel (vsll, vsll, vsll);
>  VSEL_2DI_B  VSEL_2DI_S
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 64a43b55e2d..86b8e536dbe 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -23358,6 +23358,20 @@ The programmer is responsible for understanding the 
> endianness issues involved
>  with the first argument and the result.
>  @findex vec_replace_unaligned
>  
> +Vector select
> +
> +@smallexample
> +vector signed __int128 vec_sel (vector signed __int128,
> +   vector signed __int128, vector signed __int128);
> +vector unsigned __int128 vec_sel (vector unsigned __int128,
> +   vector unsigned __int128, vector unsigned __int128);
> +@end smallexample
> +
> +The overloaded built-in @code{vec_sel} with vector signed/unsigned __int128
> +arguments and returns a vector selecting bits from the two source vectors 
> based
> +on the values of the third input vector.  This built-in is an extension of 
> the
> +@code{vec_sel} built-in documented in the PVIPR.
> +

Why did you place this in a section for ISA 3.1 (Power10)?  It doesn't really
require this support.  The used instance VSEL_1TI and VSEL_1TI_UNS are placed
in altivec stanza, so it looks that we should put it under the section
"PowerPC AltiVec Built-in Functions on ISA 2.05".  And since it's an extension
of @code{vec_sel} documented in the PVIPR, I prefer to just mention it's "an
extension of the @code{vec_sel} built-in documented in the PVIPR" and omitting
the description to avoid possible slightly different wording.

>  Vector Shift Left Double Bit Immediate
>  @smallexample
>  @exdent vector signed char vec_sldb (vector signed char, vector signed char,
> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c 
> b/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
> new file mode 100644
> index 000..58eb383e8c3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
> @@ -0,0 +1,84 @@
> +/* { dg-do run  { target power10_hw }} */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-require-effective-target power10_hw } */

As mentioned above, this doesn't require power10, you can specify vmx_hw.
(btw removing { target power10_hw } on dg-do run line).

> +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */

s/-mdejagnu-cpu=power10/-maltivec/
s/-save-temps//

> +
> +
> +#include 
> +
> +
> +#define DEBUG 0
> +
> +#if DEBUG
> +#include 
> +void