Re: [PATCH 42/55] rs6000: Handle gimple folding of target built-ins

2021-08-02 Thread Segher Boessenkool
On Mon, Aug 02, 2021 at 08:31:43AM -0500, Bill Schmidt wrote:
> Interestingly, when the quadword compares are expanded at GIMPLE time, 
> we generate worse code involving individual 64-bit compares.  For the 
> time being, I will not expand these at GIMPLE time; independently, this 
> bears looking at to see why expressions like (uint128_1 < uint128_2) 
> will generate poor code.

Details like this should probably not be exposed before RTL anyway?
Everything else is at a more abstracted level as well before expand?

It will be interesting to see what causes the worse code though :-)


Segher


Re: [PATCH 42/55] rs6000: Handle gimple folding of target built-ins

2021-08-02 Thread Bill Schmidt via Gcc-patches

Hi Will,

On 7/29/21 7:42 AM, Bill Schmidt wrote:
> On 7/28/21 4:21 PM, will schmidt wrote:
>> On Thu, 2021-06-17 at 10:19 -0500, Bill Schmidt via Gcc-patches wrote:
>>> +    /* Vector compares; EQ, NE, GE, GT, LE.  */
>>> +    case RS6000_BIF_VCMPEQUB:
>>> +    case RS6000_BIF_VCMPEQUH:
>>> +    case RS6000_BIF_VCMPEQUW:
>>> +    case RS6000_BIF_VCMPEQUD:
>>> +      fold_compare_helper (gsi, EQ_EXPR, stmt);
>>> +      return true;
>>> +
>>> +    case RS6000_BIF_VCMPNEB:
>>> +    case RS6000_BIF_VCMPNEH:
>>> +    case RS6000_BIF_VCMPNEW:
>>> +      fold_compare_helper (gsi, NE_EXPR, stmt);
>>> +      return true;
>>> +
>> Noting that entries for _CMPNET, _VCMPEQUT, etc. are missing from this
>> version versus the non-new version of this function.
>> I believe this was/is deliberate and by design.
>> Same with entries for P10V_BUILTIN_CMPLE_1TI, etc. below.
>
> Indeed not!  This is something I missed when new code was added after I
> posted the original patch series.  I'll reinstate the quadword
> compares.  Thanks for spotting this!

Interestingly, when the quadword compares are expanded at GIMPLE time, 
we generate worse code involving individual 64-bit compares.  For the 
time being, I will not expand these at GIMPLE time; independently, this 
bears looking at to see why expressions like (uint128_1 < uint128_2) 
will generate poor code.
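
To make the remark about "individual 64-bit compares" concrete, here is a rough C sketch of an unsigned 128-bit less-than written as two 64-bit compares, next to the same test written directly on the 128-bit type. This is only an illustration, not the code GCC actually emits on rs6000; the names u128, lt_u128_split, lt_u128_direct and check_lt_u128 are made up for the example, and it assumes a 64-bit target where GCC or Clang provides unsigned __int128.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shorthand; assumes the compiler supports __int128.  */
typedef unsigned __int128 u128;

/* Unsigned 128-bit "less than" written as two 64-bit compares: the high
   halves decide unless they are equal, in which case the low halves do.  */
static int
lt_u128_split (u128 a, u128 b)
{
  uint64_t a_hi = (uint64_t) (a >> 64), a_lo = (uint64_t) a;
  uint64_t b_hi = (uint64_t) (b >> 64), b_lo = (uint64_t) b;

  if (a_hi != b_hi)
    return a_hi < b_hi;
  return a_lo < b_lo;
}

/* The same comparison expressed directly on the 128-bit type.  */
static int
lt_u128_direct (u128 a, u128 b)
{
  return a < b;
}

/* Self-check: both forms agree on a few boundary values.  */
static void
check_lt_u128 (void)
{
  u128 one = 1;
  u128 vals[] = { 0, 1, UINT64_MAX, one << 64, (one << 64) + 1, ~(u128) 0 };
  int n = sizeof vals / sizeof vals[0];

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      assert (lt_u128_split (vals[i], vals[j])
              == lt_u128_direct (vals[i], vals[j]));
}
```

Both functions compute the same result; the open question in the thread is only which form the compiler ends up generating and at what cost.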


Bill



Re: [PATCH 42/55] rs6000: Handle gimple folding of target built-ins

2021-07-29 Thread Bill Schmidt via Gcc-patches



On 7/28/21 4:21 PM, will schmidt wrote:
> On Thu, 2021-06-17 at 10:19 -0500, Bill Schmidt via Gcc-patches wrote:
>> +    /* Vector compares; EQ, NE, GE, GT, LE.  */
>> +    case RS6000_BIF_VCMPEQUB:
>> +    case RS6000_BIF_VCMPEQUH:
>> +    case RS6000_BIF_VCMPEQUW:
>> +    case RS6000_BIF_VCMPEQUD:
>> +      fold_compare_helper (gsi, EQ_EXPR, stmt);
>> +      return true;
>> +
>> +    case RS6000_BIF_VCMPNEB:
>> +    case RS6000_BIF_VCMPNEH:
>> +    case RS6000_BIF_VCMPNEW:
>> +      fold_compare_helper (gsi, NE_EXPR, stmt);
>> +      return true;
>> +
> Noting that entries for _CMPNET, _VCMPEQUT, etc. are missing from this
> version versus the non-new version of this function.
> I believe this was/is deliberate and by design.
> Same with entries for P10V_BUILTIN_CMPLE_1TI, etc. below.



Indeed not!  This is something I missed when new code was added after I 
posted the original patch series.  I'll reinstate the quadword 
compares.  Thanks for spotting this!
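
The cases quoted above hand the compare built-ins to fold_compare_helper together with a generic tree code such as EQ_EXPR or NE_EXPR. As a rough source-level picture of what a generic lane-wise comparison means, the sketch below uses the GCC/Clang vector extension rather than the rs6000 built-ins themselves. The type v4si and the function names are chosen for the example only and are not taken from the patch.

```c
#include <assert.h>

/* Four 32-bit lanes, using the GCC/Clang vector extension.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Lane-wise equality written as a plain operator: each result lane is
   -1 (all ones) where the inputs match and 0 where they differ.  */
static v4si
lanes_equal (v4si a, v4si b)
{
  return a == b;
}

/* Lane-wise inequality, the complement of the above.  */
static v4si
lanes_not_equal (v4si a, v4si b)
{
  return a != b;
}

/* Self-check on one pair of inputs.  */
static void
check_lane_compares (void)
{
  v4si a = { 1, 2, 3, 4 };
  v4si b = { 1, 0, 3, 0 };
  v4si eq = lanes_equal (a, b);
  v4si ne = lanes_not_equal (a, b);

  assert (eq[0] == -1 && eq[1] == 0 && eq[2] == -1 && eq[3] == 0);
  assert (ne[0] == 0 && ne[1] == -1 && ne[2] == 0 && ne[3] == -1);
}
```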


Bill





Re: [PATCH 42/55] rs6000: Handle gimple folding of target built-ins

2021-07-28 Thread will schmidt via Gcc-patches
On Thu, 2021-06-17 at 10:19 -0500, Bill Schmidt via Gcc-patches wrote:


Hi,


> This is another patch that looks bigger than it really is.  Because we
> have a new namespace for the builtins, allowing us to have both the old
> and new builtin infrastructure supported at once, we need versions of
> these functions that use the new builtin namespace.  Otherwise the code is
> unchanged.

> 
> 2021-06-17  Bill Schmidt  
> 
> gcc/
>   * config/rs6000/rs6000-call.c (rs6000_gimple_fold_new_builtin):
>   New forward decl.
>   (rs6000_gimple_fold_builtin): Call rs6000_gimple_fold_new_builtin.
>   (rs6000_new_builtin_valid_without_lhs): New function.
>   (rs6000_gimple_fold_new_mma_builtin): Likewise.
>   (rs6000_gimple_fold_new_builtin): Likewise.

ok

> ---
>  gcc/config/rs6000/rs6000-call.c | 1152 +++
>  1 file changed, 1152 insertions(+)
> 
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 269fddcdc7e..52df3d165e1 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -190,6 +190,7 @@ static tree builtin_function_type (machine_mode, machine_mode,
>  static void rs6000_common_init_builtins (void);
>  static void htm_init_builtins (void);
>  static void mma_init_builtins (void);
> +static bool rs6000_gimple_fold_new_builtin (gimple_stmt_iterator *gsi);
> 
> 
>  /* Hash table to keep track of the argument types for builtin functions.  */
> @@ -11992,6 +11993,9 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi)
>  bool
>  rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>  {
> +  if (new_builtins_are_live)
> +    return rs6000_gimple_fold_new_builtin (gsi);
> +
>    gimple *stmt = gsi_stmt (*gsi);
>    tree fndecl = gimple_call_fndecl (stmt);
>    gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD);
> @@ -12939,6 +12943,35 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>    return false;
>  }
> 
> +/*  Helper function to sort out which built-ins may be valid without having
> +    a LHS.  */
> +static bool
> +rs6000_new_builtin_valid_without_lhs (enum rs6000_gen_builtins fn_code,
> +                                      tree fndecl)
> +{
> +  if (TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node)
> +    return true;
> +
> +  switch (fn_code)
> +    {
> +    case RS6000_BIF_STVX_V16QI:
> +    case RS6000_BIF_STVX_V8HI:
> +    case RS6000_BIF_STVX_V4SI:
> +    case RS6000_BIF_STVX_V4SF:
> +    case RS6000_BIF_STVX_V2DI:
> +    case RS6000_BIF_STVX_V2DF:
> +    case RS6000_BIF_STXVW4X_V16QI:
> +    case RS6000_BIF_STXVW4X_V8HI:
> +    case RS6000_BIF_STXVW4X_V4SF:
> +    case RS6000_BIF_STXVW4X_V4SI:
> +    case RS6000_BIF_STXVD2X_V2DF:
> +    case RS6000_BIF_STXVD2X_V2DI:
> +      return true;
> +    default:
> +      return false;
> +    }
> +}

ok

> +
>  /* Check whether a builtin function is supported in this target
>     configuration.  */
>  bool
> @@ -13030,6 +13063,1125 @@ rs6000_new_builtin_is_supported_p (enum rs6000_gen_builtins fncode)
>    return true;
>  }
> 
> +/* Expand the MMA built-ins early, so that we can convert the pass-by-reference
> +   __vector_quad arguments into pass-by-value arguments, leading to more
> +   efficient code generation.  */
> +static bool
> +rs6000_gimple_fold_new_mma_builtin (gimple_stmt_iterator *gsi,
> +                                    rs6000_gen_builtins fn_code)
> +{
> +  gimple *stmt = gsi_stmt (*gsi);
> +  size_t fncode = (size_t) fn_code;
> +
> +  if (!bif_is_mma (rs6000_builtin_info_x[fncode]))
> +    return false;
> +
> +  /* Each call that can be gimple-expanded has an associated built-in
> +     function that it will expand into.  If this one doesn't, we have
> +     already expanded it!  */
> +  if (rs6000_builtin_info_x[fncode].assoc_bif == RS6000_BIF_NONE)
> +    return false;
> +
> +  bifdata *bd = &rs6000_builtin_info_x[fncode];
> +  unsigned nopnds = bd->nargs;
> +  gimple_seq new_seq = NULL;
> +  gimple *new_call;
> +  tree new_decl;
> +
> +  /* Compatibility built-ins; we used to call these
> +     __builtin_mma_{dis,}assemble_pair, but now we call them
> +     __builtin_vsx_{dis,}assemble_pair.  Handle the old verions.  */

versions.
(this snippet appears new to this version, so don't need to search for
an existing typo in current code. :-)

> +  if (fncode == RS6000_BIF_ASSEMBLE_PAIR)
> +    fncode = RS6000_BIF_ASSEMBLE_PAIR_V;
> +  else if (fncode == RS6000_BIF_DISASSEMBLE_PAIR)
> +    fncode = RS6000_BIF_DISASSEMBLE_PAIR_V;
> +
> +  if (fncode == RS6000_BIF_DISASSEMBLE_ACC
> +      || fncode == RS6000_BIF_DISASSEMBLE_PAIR_V)
> +    {
> +      /* This is an MMA disassemble built-in function.  */
> +      push_gimplify_context (true);
> +      unsigned nvec = (fncode == RS6000_BIF_DISASSEMBLE_ACC) ? 4 : 2;
> +      tree dst_ptr = gimple_call_arg (stmt, 0);
> +      tree src_ptr = gimple_call_arg (stmt, 1);
> +      tree src_type = TREE_TYPE (src_ptr);
> +      tree src = 

[PATCH 42/55] rs6000: Handle gimple folding of target built-ins

2021-06-17 Thread Bill Schmidt via Gcc-patches
This is another patch that looks bigger than it really is.  Because we
have a new namespace for the builtins, allowing us to have both the old
and new builtin infrastructure supported at once, we need versions of
these functions that use the new builtin namespace.  Otherwise the code is
unchanged.
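
As a minimal sketch of the idea described above, the C fragment below keeps two numbering schemes for the same operations alive at once and routes a single entry point to one or the other through a flag, in the same spirit as the new_builtins_are_live check in the patch. Everything here (the enums, new_codes_are_live, apply_old, apply_new, apply) is hypothetical and exists only for illustration; it is not code from rs6000-call.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Two hypothetical numbering schemes for the same operations.  The same
   integer means different things depending on which scheme is live.  */
enum old_code { OLD_ADD, OLD_SUB };
enum new_code { NEW_SUB, NEW_ADD };

/* Selects which scheme is in use.  */
static bool new_codes_are_live = false;

/* Implementation keyed on the old numbering.  */
static int
apply_old (unsigned code, int a, int b)
{
  switch ((enum old_code) code)
    {
    case OLD_ADD: return a + b;
    case OLD_SUB: return a - b;
    }
  return 0;
}

/* Same logic, but keyed on the new numbering.  */
static int
apply_new (unsigned code, int a, int b)
{
  switch ((enum new_code) code)
    {
    case NEW_ADD: return a + b;
    case NEW_SUB: return a - b;
    }
  return 0;
}

/* Single entry point: hand off to the new implementation when it is live,
   otherwise fall through to the old one.  */
static int
apply (unsigned code, int a, int b)
{
  if (new_codes_are_live)
    return apply_new (code, a, b);
  return apply_old (code, a, b);
}

/* Self-check: code 0 is "add" in the old scheme and "sub" in the new.  */
static void
check_apply (void)
{
  new_codes_are_live = false;
  assert (apply (0, 5, 3) == 8);
  new_codes_are_live = true;
  assert (apply (0, 5, 3) == 2);
  new_codes_are_live = false;
}
```

Because the two enums number the operations differently, the body of each implementation can stay the same while the case labels change, which is why a near-copy of the function is needed for each namespace.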

2021-06-17  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_new_builtin):
New forward decl.
(rs6000_gimple_fold_builtin): Call rs6000_gimple_fold_new_builtin.
(rs6000_new_builtin_valid_without_lhs): New function.
(rs6000_gimple_fold_new_mma_builtin): Likewise.
(rs6000_gimple_fold_new_builtin): Likewise.
---
 gcc/config/rs6000/rs6000-call.c | 1152 +++
 1 file changed, 1152 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 269fddcdc7e..52df3d165e1 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -190,6 +190,7 @@ static tree builtin_function_type (machine_mode, machine_mode,
 static void rs6000_common_init_builtins (void);
 static void htm_init_builtins (void);
 static void mma_init_builtins (void);
+static bool rs6000_gimple_fold_new_builtin (gimple_stmt_iterator *gsi);
 
 
 /* Hash table to keep track of the argument types for builtin functions.  */
@@ -11992,6 +11993,9 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi)
 bool
 rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 {
+  if (new_builtins_are_live)
+    return rs6000_gimple_fold_new_builtin (gsi);
+
   gimple *stmt = gsi_stmt (*gsi);
   tree fndecl = gimple_call_fndecl (stmt);
   gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD);
@@ -12939,6 +12943,35 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   return false;
 }
 
+/*  Helper function to sort out which built-ins may be valid without having
+    a LHS.  */
+static bool
+rs6000_new_builtin_valid_without_lhs (enum rs6000_gen_builtins fn_code,
+                                      tree fndecl)
+{
+  if (TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node)
+    return true;
+
+  switch (fn_code)
+    {
+    case RS6000_BIF_STVX_V16QI:
+    case RS6000_BIF_STVX_V8HI:
+    case RS6000_BIF_STVX_V4SI:
+    case RS6000_BIF_STVX_V4SF:
+    case RS6000_BIF_STVX_V2DI:
+    case RS6000_BIF_STVX_V2DF:
+    case RS6000_BIF_STXVW4X_V16QI:
+    case RS6000_BIF_STXVW4X_V8HI:
+    case RS6000_BIF_STXVW4X_V4SF:
+    case RS6000_BIF_STXVW4X_V4SI:
+    case RS6000_BIF_STXVD2X_V2DF:
+    case RS6000_BIF_STXVD2X_V2DI:
+      return true;
+    default:
+      return false;
+    }
+}
+
 /* Check whether a builtin function is supported in this target
    configuration.  */
 bool
@@ -13030,6 +13063,1125 @@ rs6000_new_builtin_is_supported_p (enum rs6000_gen_builtins fncode)
   return true;
 }
 
+/* Expand the MMA built-ins early, so that we can convert the pass-by-reference
+   __vector_quad arguments into pass-by-value arguments, leading to more
+   efficient code generation.  */
+static bool
+rs6000_gimple_fold_new_mma_builtin (gimple_stmt_iterator *gsi,
+                                    rs6000_gen_builtins fn_code)
+{
+  gimple *stmt = gsi_stmt (*gsi);
+  size_t fncode = (size_t) fn_code;
+
+  if (!bif_is_mma (rs6000_builtin_info_x[fncode]))
+    return false;
+
+  /* Each call that can be gimple-expanded has an associated built-in
+     function that it will expand into.  If this one doesn't, we have
+     already expanded it!  */
+  if (rs6000_builtin_info_x[fncode].assoc_bif == RS6000_BIF_NONE)
+    return false;
+
+  bifdata *bd = &rs6000_builtin_info_x[fncode];
+  unsigned nopnds = bd->nargs;
+  gimple_seq new_seq = NULL;
+  gimple *new_call;
+  tree new_decl;
+
+  /* Compatibility built-ins; we used to call these
+     __builtin_mma_{dis,}assemble_pair, but now we call them
+     __builtin_vsx_{dis,}assemble_pair.  Handle the old verions.  */
+  if (fncode == RS6000_BIF_ASSEMBLE_PAIR)
+    fncode = RS6000_BIF_ASSEMBLE_PAIR_V;
+  else if (fncode == RS6000_BIF_DISASSEMBLE_PAIR)
+    fncode = RS6000_BIF_DISASSEMBLE_PAIR_V;
+
+  if (fncode == RS6000_BIF_DISASSEMBLE_ACC
+      || fncode == RS6000_BIF_DISASSEMBLE_PAIR_V)
+    {
+      /* This is an MMA disassemble built-in function.  */
+      push_gimplify_context (true);
+      unsigned nvec = (fncode == RS6000_BIF_DISASSEMBLE_ACC) ? 4 : 2;
+      tree dst_ptr = gimple_call_arg (stmt, 0);
+      tree src_ptr = gimple_call_arg (stmt, 1);
+      tree src_type = TREE_TYPE (src_ptr);
+      tree src = create_tmp_reg_or_ssa_name (TREE_TYPE (src_type));
+      gimplify_assign (src, build_simple_mem_ref (src_ptr), &new_seq);
+
+      /* If we are not disassembling an accumulator/pair or our destination is
+         another accumulator/pair, then just copy the entire thing as is.  */
+      if ((fncode == RS6000_BIF_DISASSEMBLE_ACC
+           && TREE_TYPE (TREE_TYPE (dst_ptr)) == vector_quad_type_node)
+          || (fncode == RS6000_BIF_DISASSEMBLE_PAIR_V
[PATCH 42/55] rs6000: Handle gimple folding of target built-ins

2021-06-08 Thread Bill Schmidt via Gcc-patches
This is another patch that looks bigger than it really is.  Because we
have a new namespace for the builtins, allowing us to have both the old
and new builtin infrastructure supported at once, we need versions of
these functions that use the new builtin namespace.  Otherwise the code is
unchanged.

2021-06-07  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_new_builtin):
New forward decl.
(rs6000_gimple_fold_builtin): Call rs6000_gimple_fold_new_builtin.
(rs6000_new_builtin_valid_without_lhs): New function.
(rs6000_gimple_fold_new_mma_builtin): Likewise.
(rs6000_gimple_fold_new_builtin): Likewise.
---
 gcc/config/rs6000/rs6000-call.c | 1152 +++
 1 file changed, 1152 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 8f6b6b462f8..1bb9f1c255d 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -190,6 +190,7 @@ static tree builtin_function_type (machine_mode, machine_mode,
 static void rs6000_common_init_builtins (void);
 static void htm_init_builtins (void);
 static void mma_init_builtins (void);
+static bool rs6000_gimple_fold_new_builtin (gimple_stmt_iterator *gsi);
 
 
 /* Hash table to keep track of the argument types for builtin functions.  */
@@ -11855,6 +11856,9 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi)
 bool
 rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 {
+  if (new_builtins_are_live)
+    return rs6000_gimple_fold_new_builtin (gsi);
+
   gimple *stmt = gsi_stmt (*gsi);
   tree fndecl = gimple_call_fndecl (stmt);
   gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD);
@@ -12794,6 +12798,35 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   return false;
 }
 
+/*  Helper function to sort out which built-ins may be valid without having
+    a LHS.  */
+static bool
+rs6000_new_builtin_valid_without_lhs (enum rs6000_gen_builtins fn_code,
+                                      tree fndecl)
+{
+  if (TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node)
+    return true;
+
+  switch (fn_code)
+    {
+    case RS6000_BIF_STVX_V16QI:
+    case RS6000_BIF_STVX_V8HI:
+    case RS6000_BIF_STVX_V4SI:
+    case RS6000_BIF_STVX_V4SF:
+    case RS6000_BIF_STVX_V2DI:
+    case RS6000_BIF_STVX_V2DF:
+    case RS6000_BIF_STXVW4X_V16QI:
+    case RS6000_BIF_STXVW4X_V8HI:
+    case RS6000_BIF_STXVW4X_V4SF:
+    case RS6000_BIF_STXVW4X_V4SI:
+    case RS6000_BIF_STXVD2X_V2DF:
+    case RS6000_BIF_STXVD2X_V2DI:
+      return true;
+    default:
+      return false;
+    }
+}
+
 /* Check whether a builtin function is supported in this target
    configuration.  */
 bool
@@ -12885,6 +12918,1125 @@ rs6000_new_builtin_is_supported_p (enum rs6000_gen_builtins fncode)
   return true;
 }
 
+/* Expand the MMA built-ins early, so that we can convert the pass-by-reference
+   __vector_quad arguments into pass-by-value arguments, leading to more
+   efficient code generation.  */
+static bool
+rs6000_gimple_fold_new_mma_builtin (gimple_stmt_iterator *gsi,
+                                    rs6000_gen_builtins fn_code)
+{
+  gimple *stmt = gsi_stmt (*gsi);
+  size_t fncode = (size_t) fn_code;
+
+  if (!bif_is_mma (rs6000_builtin_info_x[fncode]))
+    return false;
+
+  /* Each call that can be gimple-expanded has an associated built-in
+     function that it will expand into.  If this one doesn't, we have
+     already expanded it!  */
+  if (rs6000_builtin_info_x[fncode].assoc_bif == RS6000_BIF_NONE)
+    return false;
+
+  bifdata *bd = &rs6000_builtin_info_x[fncode];
+  unsigned nopnds = bd->nargs;
+  gimple_seq new_seq = NULL;
+  gimple *new_call;
+  tree new_decl;
+
+  /* Compatibility built-ins; we used to call these
+     __builtin_mma_{dis,}assemble_pair, but now we call them
+     __builtin_vsx_{dis,}assemble_pair.  Handle the old verions.  */
+  if (fncode == RS6000_BIF_ASSEMBLE_PAIR)
+    fncode = RS6000_BIF_ASSEMBLE_PAIR_V;
+  else if (fncode == RS6000_BIF_DISASSEMBLE_PAIR)
+    fncode = RS6000_BIF_DISASSEMBLE_PAIR_V;
+
+  if (fncode == RS6000_BIF_DISASSEMBLE_ACC
+      || fncode == RS6000_BIF_DISASSEMBLE_PAIR_V)
+    {
+      /* This is an MMA disassemble built-in function.  */
+      push_gimplify_context (true);
+      unsigned nvec = (fncode == RS6000_BIF_DISASSEMBLE_ACC) ? 4 : 2;
+      tree dst_ptr = gimple_call_arg (stmt, 0);
+      tree src_ptr = gimple_call_arg (stmt, 1);
+      tree src_type = TREE_TYPE (src_ptr);
+      tree src = make_ssa_name (TREE_TYPE (src_type));
+      gimplify_assign (src, build_simple_mem_ref (src_ptr), &new_seq);
+
+      /* If we are not disassembling an accumulator/pair or our destination is
+         another accumulator/pair, then just copy the entire thing as is.  */
+      if ((fncode == RS6000_BIF_DISASSEMBLE_ACC
+           && TREE_TYPE (TREE_TYPE (dst_ptr)) == vector_quad_type_node)
+          || (fncode == RS6000_BIF_DISASSEMBLE_PAIR_V
+