RE: [PATCH V2] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

2023-08-24 Thread Li, Pan2 via Gcc-patches
Committed, thanks Richard.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Richard Sandiford via Gcc-patches
Sent: Thursday, August 24, 2023 6:34 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de
Subject: Re: [PATCH V2] gimple_fold: Support 
COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

Juzhe-Zhong  writes:
> Hi, Richard and Richi.
>
> Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math.
> It's supported in tree-ssa-math-opts.cc. However, GCC failed to support 
> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS.
>
> Consider this following case:
> #define TEST_TYPE(TYPE)   
>  \
>   __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst,   
>  \
> TYPE *__restrict a,  \
> TYPE *__restrict b, int n)   \
>   {   
>  \
> for (int i = 0; i < n; i++)   
>  \
>   dst[i] -= a[i] * b[i];   \
>   }
>
> #define TEST_ALL()
>  \
>   TEST_TYPE (float)   
>  \
>
> TEST_ALL ()
>
> Gimple IR for RVV:
>
> ...
> _39 = -vect__8.14_26;
> vect__10.16_21 = .COND_LEN_FMA ({ -1, ... }, vect__6.11_30, _39, 
> vect__4.8_34, vect__4.8_34, _46, 0);
> ...
>
> This is because this following piece of codes in tree-ssa-math-opts.cc:
>
>   if (len)
>   fma_stmt
> = gimple_build_call_internal (IFN_COND_LEN_FMA, 7, cond, mulop1, op2,
>   addop, else_value, len, bias);
>   else if (cond)
>   fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1,
>  op2, addop, else_value);
>   else
>   fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop);
>   gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt));
>   gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun,
>  use_stmt));
>   gsi_replace (&gsi, fma_stmt, true);
>   /* Follow all SSA edges so that we generate FMS, FNMA and FNMS
>regardless of where the negation occurs.  */
>   gimple *orig_stmt = gsi_stmt (gsi);
>   if (fold_stmt (&gsi, follow_all_ssa_edges))
>   {
> if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi)))
>   gcc_unreachable ();
> update_stmt (gsi_stmt (gsi));
>   }
>
> 'fold_stmt' failed to fold NEGATE_EXPR + COND_LEN_FMA > COND_LEN_FNMA.
>
> This patch support STMT fold into:
>
> vect__10.16_21 = .COND_LEN_FNMA ({ -1, ... }, vect__8.14_26, vect__6.11_30, 
> vect__4.8_34, { 0.0, ... }, _46, 0);
>
> Note that COND_LEN_FNMA has 7 arguments and COND_LEN_ADD has 6 arguments.
>
> Extend maximum num ops:
> -  static const unsigned int MAX_NUM_OPS = 5;
> +  static const unsigned int MAX_NUM_OPS = 7;
>
> Bootstrap and Regtest on X86 passed.
> Tested on aarch64 Qemu.
>
> Fully tested COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS on RISC-V backend.
>
>
> gcc/ChangeLog:
>
> * genmatch.cc (decision_tree::gen): Support 
> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold.
> * gimple-match-exports.cc (gimple_simplify): Ditto.
> (gimple_resimplify6): New function.
> (gimple_resimplify7): New function.
> (gimple_match_op::resimplify): Support 
> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold.
> (convert_conditional_op): Ditto.
> (build_call_internal): Ditto.
> (try_conditional_simplification): Ditto.
> (gimple_extract): Ditto.
> * gimple-match.h (gimple_match_cond::gimple_match_cond): Ditto.
> * internal-fn.cc (CASE): Ditto.

OK, thanks.

Richard

>
> ---
>  gcc/genmatch.cc |   2 +-
>  gcc/gimple-match-exports.cc | 123 ++--
>  gcc/gimple-match.h  |  16 -
>  gcc/internal-fn.cc  |   7 +-
>  4 files changed, 138 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
> index f46d2e1520d..a1925a747a7 100644
> --- a/gcc/genmatch.cc
> +++ b/gcc/genmatch.cc
> @@ -4052,7 +4052,7 @@ decision_tree::gen (vec  &files, bool gimple)
>  }
>fprintf (stderr, "removed %u duplicate tails\n", rcnt);
>  
> -  for (unsigned n = 1; n <=

Re: [PATCH V2] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

2023-08-24 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong  writes:
> Hi, Richard and Richi.
>
> Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math.
> It's supported in tree-ssa-math-opts.cc. However, GCC failed to support 
> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS.
>
> Consider this following case:
> #define TEST_TYPE(TYPE)   
>  \
>   __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst,   
>  \
> TYPE *__restrict a,  \
> TYPE *__restrict b, int n)   \
>   {   
>  \
> for (int i = 0; i < n; i++)   
>  \
>   dst[i] -= a[i] * b[i];   \
>   }
>
> #define TEST_ALL()
>  \
>   TEST_TYPE (float)   
>  \
>
> TEST_ALL ()
>
> Gimple IR for RVV:
>
> ...
> _39 = -vect__8.14_26;
> vect__10.16_21 = .COND_LEN_FMA ({ -1, ... }, vect__6.11_30, _39, 
> vect__4.8_34, vect__4.8_34, _46, 0);
> ...
>
> This is because this following piece of codes in tree-ssa-math-opts.cc:
>
>   if (len)
>   fma_stmt
> = gimple_build_call_internal (IFN_COND_LEN_FMA, 7, cond, mulop1, op2,
>   addop, else_value, len, bias);
>   else if (cond)
>   fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1,
>  op2, addop, else_value);
>   else
>   fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop);
>   gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt));
>   gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun,
>  use_stmt));
>   gsi_replace (&gsi, fma_stmt, true);
>   /* Follow all SSA edges so that we generate FMS, FNMA and FNMS
>regardless of where the negation occurs.  */
>   gimple *orig_stmt = gsi_stmt (gsi);
>   if (fold_stmt (&gsi, follow_all_ssa_edges))
>   {
> if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi)))
>   gcc_unreachable ();
> update_stmt (gsi_stmt (gsi));
>   }
>
> 'fold_stmt' failed to fold NEGATE_EXPR + COND_LEN_FMA > COND_LEN_FNMA.
>
> This patch support STMT fold into:
>
> vect__10.16_21 = .COND_LEN_FNMA ({ -1, ... }, vect__8.14_26, vect__6.11_30, 
> vect__4.8_34, { 0.0, ... }, _46, 0);
>
> Note that COND_LEN_FNMA has 7 arguments and COND_LEN_ADD has 6 arguments.
>
> Extend maximum num ops:
> -  static const unsigned int MAX_NUM_OPS = 5;
> +  static const unsigned int MAX_NUM_OPS = 7;
>
> Bootstrap and Regtest on X86 passed.
> Tested on aarch64 Qemu.
>
> Fully tested COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS on RISC-V backend.
>
>
> gcc/ChangeLog:
>
> * genmatch.cc (decision_tree::gen): Support 
> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold.
> * gimple-match-exports.cc (gimple_simplify): Ditto.
> (gimple_resimplify6): New function.
> (gimple_resimplify7): New function.
> (gimple_match_op::resimplify): Support 
> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold.
> (convert_conditional_op): Ditto.
> (build_call_internal): Ditto.
> (try_conditional_simplification): Ditto.
> (gimple_extract): Ditto.
> * gimple-match.h (gimple_match_cond::gimple_match_cond): Ditto.
> * internal-fn.cc (CASE): Ditto.

OK, thanks.

Richard

>
> ---
>  gcc/genmatch.cc |   2 +-
>  gcc/gimple-match-exports.cc | 123 ++--
>  gcc/gimple-match.h  |  16 -
>  gcc/internal-fn.cc  |   7 +-
>  4 files changed, 138 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
> index f46d2e1520d..a1925a747a7 100644
> --- a/gcc/genmatch.cc
> +++ b/gcc/genmatch.cc
> @@ -4052,7 +4052,7 @@ decision_tree::gen (vec  &files, bool gimple)
>  }
>fprintf (stderr, "removed %u duplicate tails\n", rcnt);
>  
> -  for (unsigned n = 1; n <= 5; ++n)
> +  for (unsigned n = 1; n <= 7; ++n)
>  {
>bool has_kids_p = false;
>  
> diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc
> index 7aeb4ddb152..b36027b0bad 100644
> --- a/gcc/gimple-match-exports.cc
> +++ b/gcc/gimple-match-exports.cc
> @@ -60,6 +60,12 @@ extern bool gimple_simplify (gimple_match_op *, gimple_seq 
> *, tree (*)(tree),
>code_helper, tree, tree, tree, tree, tree);
>  extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree),
>code_helper, tree, tree, tree, tree, tree, tree);
> +extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree),
> +  code_helper, tre

[PATCH V2] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

2023-08-21 Thread Juzhe-Zhong
Hi, Richard and Richi.

Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math.
It's supported in tree-ssa-math-opts.cc. However, GCC failed to support 
COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS.

Consider this following case:
#define TEST_TYPE(TYPE)\
  __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst,\
  TYPE *__restrict a,  \
  TYPE *__restrict b, int n)   \
  {\
for (int i = 0; i < n; i++)\
  dst[i] -= a[i] * b[i];   \
  }

#define TEST_ALL() \
  TEST_TYPE (float)\

TEST_ALL ()

Gimple IR for RVV:

...
_39 = -vect__8.14_26;
vect__10.16_21 = .COND_LEN_FMA ({ -1, ... }, vect__6.11_30, _39, vect__4.8_34, 
vect__4.8_34, _46, 0);
...

This is because this following piece of codes in tree-ssa-math-opts.cc:

  if (len)
fma_stmt
  = gimple_build_call_internal (IFN_COND_LEN_FMA, 7, cond, mulop1, op2,
addop, else_value, len, bias);
  else if (cond)
fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1,
   op2, addop, else_value);
  else
fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop);
  gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt));
  gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun,
   use_stmt));
  gsi_replace (&gsi, fma_stmt, true);
  /* Follow all SSA edges so that we generate FMS, FNMA and FNMS
 regardless of where the negation occurs.  */
  gimple *orig_stmt = gsi_stmt (gsi);
  if (fold_stmt (&gsi, follow_all_ssa_edges))
{
  if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi)))
gcc_unreachable ();
  update_stmt (gsi_stmt (gsi));
}

'fold_stmt' failed to fold NEGATE_EXPR + COND_LEN_FMA > COND_LEN_FNMA.

This patch support STMT fold into:

vect__10.16_21 = .COND_LEN_FNMA ({ -1, ... }, vect__8.14_26, vect__6.11_30, 
vect__4.8_34, { 0.0, ... }, _46, 0);

Note that COND_LEN_FNMA has 7 arguments and COND_LEN_ADD has 6 arguments.

Extend maximum num ops:
-  static const unsigned int MAX_NUM_OPS = 5;
+  static const unsigned int MAX_NUM_OPS = 7;

Bootstrap and Regtest on X86 passed.
Tested on aarch64 Qemu.

Fully tested COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS on RISC-V backend.


gcc/ChangeLog:

* genmatch.cc (decision_tree::gen): Support 
COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold.
* gimple-match-exports.cc (gimple_simplify): Ditto.
(gimple_resimplify6): New function.
(gimple_resimplify7): New function.
(gimple_match_op::resimplify): Support 
COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold.
(convert_conditional_op): Ditto.
(build_call_internal): Ditto.
(try_conditional_simplification): Ditto.
(gimple_extract): Ditto.
* gimple-match.h (gimple_match_cond::gimple_match_cond): Ditto.
* internal-fn.cc (CASE): Ditto.

---
 gcc/genmatch.cc |   2 +-
 gcc/gimple-match-exports.cc | 123 ++--
 gcc/gimple-match.h  |  16 -
 gcc/internal-fn.cc  |   7 +-
 4 files changed, 138 insertions(+), 10 deletions(-)

diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
index f46d2e1520d..a1925a747a7 100644
--- a/gcc/genmatch.cc
+++ b/gcc/genmatch.cc
@@ -4052,7 +4052,7 @@ decision_tree::gen (vec  &files, bool gimple)
 }
   fprintf (stderr, "removed %u duplicate tails\n", rcnt);
 
-  for (unsigned n = 1; n <= 5; ++n)
+  for (unsigned n = 1; n <= 7; ++n)
 {
   bool has_kids_p = false;
 
diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc
index 7aeb4ddb152..b36027b0bad 100644
--- a/gcc/gimple-match-exports.cc
+++ b/gcc/gimple-match-exports.cc
@@ -60,6 +60,12 @@ extern bool gimple_simplify (gimple_match_op *, gimple_seq 
*, tree (*)(tree),
 code_helper, tree, tree, tree, tree, tree);
 extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree),
 code_helper, tree, tree, tree, tree, tree, tree);
+extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree),
+code_helper, tree, tree, tree, tree, tree, tree,
+tree);
+extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree),
+code_helper, tree, tree, tree, tree, tree, tree,
+