Re: [PATCH 13/13] rs6000, remove vector set and vector init built-ins.

2024-05-22 Thread Kewen.Lin
Hi Carl,

on 2024/5/23 08:29, Carl Love wrote:
> Kewen:
> 
> On 5/13/24 22:44, Kewen.Lin wrote:
>>> perform the same operation as setting a specific element in the vector in
>>> C code.  For example:
>>>
>>>   src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index);
>>>   src_v4si[index] = int_val;
>>>
>>> Without optimization the built-in actually generates more instructions
>>> than the inline C code; at -O3 the generated code is identical.
>>>
>>> All of the above built-ins that are removed do not have test cases and
>>> are not documented.
>>>
>>> Built-ins __builtin_vec_set_v1ti, __builtin_vec_set_v2di and
>>> __builtin_vec_set_v2df are not removed as they are used in function
>>> resolve_vec_insert() in file rs6000-c.cc.
>> I think we can replace these calls with the equivalent gimple codes
>> (early expanding it) and then we can get rid of these instances.
> 
> Hmm, going to need a little coaching here.  I am not sure how to do this.
> Looks like I get to learn something new.
> 

We have the rs6000_gimple_fold.*_builtin functions, which fold (expand)
a built-in into equivalent gimple code.  What we want here is similar,
so you can refer to the implementations there.  For the expected gimple
code, you can refer to what is generated for normal C code.  Feel free
to let me know if you run into issues while trying this, or if you
would prefer that I follow up on it.
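
As one possible starting point for the "normal C code" mentioned above, a
sketch like the following could be compiled with -fdump-tree-gimple to look
at the gimple produced for a plain element store.  The typedef and helper
names are made up for illustration, and masking the index to the element
count is an assumption of this sketch, not a statement of what
resolve_vec_insert() does.

```c
#include <assert.h>

/* GCC generic vector extension types; the typedef names are assumptions
   for this sketch.  */
typedef long long v2di __attribute__ ((vector_size (16)));
typedef double v2df __attribute__ ((vector_size (16)));

/* Hypothetical helper: store VAL into one of the two lanes of VEC.
   The index is masked so it always selects lane 0 or lane 1.  */
static v2di
insert_v2di (v2di vec, long long val, unsigned int index)
{
  vec[index & 1] = val;
  return vec;
}

/* Same operation for a vector of two doubles.  */
static v2df
insert_v2df (v2df vec, double val, unsigned int index)
{
  vec[index & 1] = val;
  return vec;
}

/* Hypothetical self-check: only the selected lane changes.  */
static void
check_insert (void)
{
  v2di a = {1, 2};
  v2df b = {1.0, 2.0};

  a = insert_v2di (a, 7, 1);
  b = insert_v2df (b, 0.5, 0);

  assert (a[0] == 1 && a[1] == 7);
  assert (b[0] == 0.5 && b[1] == 2.0);
}
```

The gimple dump for insert_v2di and insert_v2df is then a reference for
what an early fold of the remaining vec_set built-ins might emit.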

BR,
Kewen


Re: [PATCH 13/13] rs6000, remove vector set and vector init built-ins.

2024-05-22 Thread Carl Love
Kewen:

On 5/13/24 22:44, Kewen.Lin wrote:
>> perform the same operation as setting a specific element in the vector in
>> C code.  For example:
>>
>>   src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index);
>>   src_v4si[index] = int_val;
>>
>> Without optimization the built-in actually generates more instructions
>> than the inline C code; at -O3 the generated code is identical.
>>
>> All of the above built-ins that are removed do not have test cases and
>> are not documented.
>>
>> Built-ins __builtin_vec_set_v1ti, __builtin_vec_set_v2di and
>> __builtin_vec_set_v2df are not removed as they are used in function
>> resolve_vec_insert() in file rs6000-c.cc.
> I think we can replace these calls with the equivalent gimple codes
> (early expanding it) and then we can get rid of these instances.

Hmm, going to need a little coaching here.  I am not sure how to do this.
Looks like I get to learn something new.

   Carl 


Re: [PATCH 13/13] rs6000, remove vector set and vector init built-ins.

2024-05-13 Thread Kewen.Lin
Hi,

on 2024/4/20 05:18, Carl Love wrote:
> rs6000, remove vector set and vector init built-ins.
> 
> The vector init built-ins:
> 
>   __builtin_vec_init_v16qi, __builtin_vec_init_v8hi,
>   __builtin_vec_init_v4si, __builtin_vec_init_v4sf,
>   __builtin_vec_init_v2di, __builtin_vec_init_v2df,
>   __builtin_vec_init_v1ti
> 
> perform the same operation as initializing the vector in C code.  For
> example:
> 
>   result_v4si = __builtin_vec_init_v4si (1, 2, 3, 4);
>   result_v4si = {1, 2, 3, 4};
> 
> These two constructs were tested and verified to generate identical
> assembly instructions both with no optimization and with -O3.
> 
> The vector set built-ins:
> 
>   __builtin_vec_set_v16qi, __builtin_vec_set_v8hi,
>   __builtin_vec_set_v4si, __builtin_vec_set_v4sf
> 
> perform the same operation as setting a specific element in the vector in
> C code.  For example:
> 
>   src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index);
>   src_v4si[index] = int_val;
> 
> Without optimization the built-in actually generates more instructions
> than the inline C code; at -O3 the generated code is identical.
> 
> All of the above built-ins that are removed do not have test cases and
> are not documented.
> 
> Built-ins __builtin_vec_set_v1ti, __builtin_vec_set_v2di and
> __builtin_vec_set_v2df are not removed as they are used in function
> resolve_vec_insert() in file rs6000-c.cc.

I think we can replace these calls with the equivalent gimple codes
(early expanding it) and then we can get rid of these instances.

BR,
Kewen

> 
> The built-ins are removed as they don't provide any benefit over just
> using C code.
> 
> gcc/ChangeLog:
>   * config/rs6000/rs6000-builtins.def (__builtin_vec_init_v16qi,
>   __builtin_vec_init_v8hi, __builtin_vec_init_v4si,
>   __builtin_vec_init_v4sf, __builtin_vec_init_v2di,
>   __builtin_vec_init_v2df, __builtin_vec_init_v1ti,
>   __builtin_vec_set_v16qi, __builtin_vec_set_v8hi,
>   __builtin_vec_set_v4si, __builtin_vec_set_v4sf): Remove built-in
>   definitions.
> ---
>  gcc/config/rs6000/rs6000-builtins.def | 42 ++-
>  1 file changed, 2 insertions(+), 40 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 19d05b8043a..d04ad4ce7e5 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1115,37 +1115,6 @@
>    const signed short __builtin_vec_ext_v8hi (vss, signed int);
>  VEC_EXT_V8HI nothing {extract}
>  
> -  const vsc __builtin_vec_init_v16qi (signed char, signed char, signed char, \
> -signed char, signed char, signed char, signed char, signed char, \
> -signed char, signed char, signed char, signed char, signed char, \
> -signed char, signed char, signed char);
> -VEC_INIT_V16QI nothing {init}
> -
> -  const vf __builtin_vec_init_v4sf (float, float, float, float);
> -VEC_INIT_V4SF nothing {init}
> -
> -  const vsi __builtin_vec_init_v4si (signed int, signed int, signed int, \
> - signed int);
> -VEC_INIT_V4SI nothing {init}
> -
> -  const vss __builtin_vec_init_v8hi (signed short, signed short, signed short,\
> - signed short, signed short, signed short, signed short, \
> - signed short);
> -VEC_INIT_V8HI nothing {init}
> -
> -  const vsc __builtin_vec_set_v16qi (vsc, signed char, const int<4>);
> -VEC_SET_V16QI nothing {set}
> -
> -  const vf __builtin_vec_set_v4sf (vf, float, const int<2>);
> -VEC_SET_V4SF nothing {set}
> -
> -  const vsi __builtin_vec_set_v4si (vsi, signed int, const int<2>);
> -VEC_SET_V4SI nothing {set}
> -
> -  const vss __builtin_vec_set_v8hi (vss, signed short, const int<3>);
> -VEC_SET_V8HI nothing {set}
> -
> -
>  ; Cell builtins.
>  [cell]
>    pure vsc __builtin_altivec_lvlx (signed long, const void *);
> @@ -1292,15 +1261,8 @@
>    const signed long long __builtin_vec_ext_v2di (vsll, signed int);
>  VEC_EXT_V2DI nothing {extract}
>  
> -  const vsq __builtin_vec_init_v1ti (signed __int128);
> -VEC_INIT_V1TI nothing {init}
> -
> -  const vd __builtin_vec_init_v2df (double, double);
> -VEC_INIT_V2DF nothing {init}
> -
> -  const vsll __builtin_vec_init_v2di (signed long long, signed long long);
> -VEC_INIT_V2DI nothing {init}
> -
> +;; VEC_SET_V1TI, VEC_SET_V2DF and VEC_SET_V2DI are used in
> +;; resolve_vec_insert(), rs6000-c.cc
>    const vsq __builtin_vec_set_v1ti (vsq, signed __int128, const int<0,0>);
>  VEC_SET_V1TI nothing {set}
>  



[PATCH 13/13] rs6000, remove vector set and vector init built-ins.

2024-04-19 Thread Carl Love
rs6000, remove vector set and vector init built-ins.

The vector init built-ins:

  __builtin_vec_init_v16qi, __builtin_vec_init_v8hi,
  __builtin_vec_init_v4si, __builtin_vec_init_v4sf,
  __builtin_vec_init_v2di, __builtin_vec_init_v2df,
  __builtin_vec_init_v1ti

perform the same operation as initializing the vector in C code.  For
example:

  result_v4si = __builtin_vec_init_v4si (1, 2, 3, 4);
  result_v4si = {1, 2, 3, 4};

These two constructs were tested and verified to generate identical
assembly instructions both with no optimization and with -O3.
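
For anyone who wants to repeat the comparison, the plain-C side can be
sketched with GCC's generic vector extension as below.  The typedef name
v4si and the helpers init_v4si and check_init are made up for this
illustration and are not part of the patch.

```c
#include <assert.h>

/* GCC generic vector extension: four 32-bit ints in one 16-byte vector.
   The typedef name is an assumption for this sketch.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: build a vector with a plain brace initializer,
   the C form described above as equivalent to __builtin_vec_init_v4si.  */
static v4si
init_v4si (int a, int b, int c, int d)
{
  v4si result = {a, b, c, d};
  return result;
}

/* Hypothetical self-check: every lane holds the matching argument.  */
static void
check_init (void)
{
  v4si v = init_v4si (1, 2, 3, 4);
  assert (v[0] == 1 && v[1] == 2 && v[2] == 3 && v[3] == 4);
}
```

Compiling such a file with -S, once without optimization and once at -O3,
is one way to inspect the generated assembly.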

The vector set built-ins:

  __builtin_vec_set_v16qi, __builtin_vec_set_v8hi,
  __builtin_vec_set_v4si, __builtin_vec_set_v4sf

perform the same operation as setting a specific element in the vector in
C code.  For example:

  src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index);
  src_v4si[index] = int_val;

Without optimization the built-in actually generates more instructions
than the inline C code; at -O3 the generated code is identical.
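
The subscript form can be sketched as a small self-contained function.
The names v4si and set_v4si are made up for this illustration, and the
range check on the index is an assumption added for the sketch; the
removed definition declares its index as const int<2>, whereas the C
subscript also accepts a runtime index.

```c
#include <assert.h>

/* GCC generic vector extension: four 32-bit ints in one 16-byte vector.
   The typedef name is an assumption for this sketch.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: return a copy of VEC with lane INDEX set to VAL,
   using ordinary C subscripting instead of __builtin_vec_set_v4si.  */
static v4si
set_v4si (v4si vec, int val, unsigned int index)
{
  /* Guard added for the sketch: a v4si has lanes 0 through 3 only.  */
  assert (index < 4);
  vec[index] = val;
  return vec;
}
```

Because the vector is passed by value, the caller's copy is only updated
when the result is assigned back, as in src_v4si = set_v4si (src_v4si,
int_val, index).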

All of the above built-ins that are removed do not have test cases and
are not documented.

Built-ins __builtin_vec_set_v1ti, __builtin_vec_set_v2di and
__builtin_vec_set_v2df are not removed as they are used in function
resolve_vec_insert() in file rs6000-c.cc.

The built-ins are removed as they don't provide any benefit over just
using C code.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vec_init_v16qi,
__builtin_vec_init_v8hi, __builtin_vec_init_v4si,
__builtin_vec_init_v4sf, __builtin_vec_init_v2di,
__builtin_vec_init_v2df, __builtin_vec_init_v1ti,
__builtin_vec_set_v16qi, __builtin_vec_set_v8hi,
__builtin_vec_set_v4si, __builtin_vec_set_v4sf): Remove built-in
definitions.
---
 gcc/config/rs6000/rs6000-builtins.def | 42 ++-
 1 file changed, 2 insertions(+), 40 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 19d05b8043a..d04ad4ce7e5 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1115,37 +1115,6 @@
   const signed short __builtin_vec_ext_v8hi (vss, signed int);
 VEC_EXT_V8HI nothing {extract}
 
-  const vsc __builtin_vec_init_v16qi (signed char, signed char, signed char, \
-signed char, signed char, signed char, signed char, signed char, \
-signed char, signed char, signed char, signed char, signed char, \
-signed char, signed char, signed char);
-VEC_INIT_V16QI nothing {init}
-
-  const vf __builtin_vec_init_v4sf (float, float, float, float);
-VEC_INIT_V4SF nothing {init}
-
-  const vsi __builtin_vec_init_v4si (signed int, signed int, signed int, \
- signed int);
-VEC_INIT_V4SI nothing {init}
-
-  const vss __builtin_vec_init_v8hi (signed short, signed short, signed short,\
- signed short, signed short, signed short, signed short, \
- signed short);
-VEC_INIT_V8HI nothing {init}
-
-  const vsc __builtin_vec_set_v16qi (vsc, signed char, const int<4>);
-VEC_SET_V16QI nothing {set}
-
-  const vf __builtin_vec_set_v4sf (vf, float, const int<2>);
-VEC_SET_V4SF nothing {set}
-
-  const vsi __builtin_vec_set_v4si (vsi, signed int, const int<2>);
-VEC_SET_V4SI nothing {set}
-
-  const vss __builtin_vec_set_v8hi (vss, signed short, const int<3>);
-VEC_SET_V8HI nothing {set}
-
-
 ; Cell builtins.
 [cell]
   pure vsc __builtin_altivec_lvlx (signed long, const void *);
@@ -1292,15 +1261,8 @@
   const signed long long __builtin_vec_ext_v2di (vsll, signed int);
 VEC_EXT_V2DI nothing {extract}
 
-  const vsq __builtin_vec_init_v1ti (signed __int128);
-VEC_INIT_V1TI nothing {init}
-
-  const vd __builtin_vec_init_v2df (double, double);
-VEC_INIT_V2DF nothing {init}
-
-  const vsll __builtin_vec_init_v2di (signed long long, signed long long);
-VEC_INIT_V2DI nothing {init}
-
+;; VEC_SET_V1TI, VEC_SET_V2DF and VEC_SET_V2DI are used in
+;; resolve_vec_insert(), rs6000-c.cc
   const vsq __builtin_vec_set_v1ti (vsq, signed __int128, const int<0,0>);
 VEC_SET_V1TI nothing {set}
 
-- 
2.44.0