RE: [PATCH v1] Match: Support forms 7 and 8 for the unsigned .SAT_ADD

2024-06-18 Thread Li, Pan2
Thanks Richard for comments.

> we might want to consider such transform in match.pd, in this case this
> would allow to elide one of the patterns.

That makes much more sense to me, it is not good idea to have many patterns for 
SAT_ADD,
will commit this first and have a try in another PATCH for this.

Pan

-Original Message-
From: Richard Biener  
Sent: Tuesday, June 18, 2024 7:03 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
jeffreya...@gmail.com; rdapp@gmail.com
Subject: Re: [PATCH v1] Match: Support forms 7 and 8 for the unsigned .SAT_ADD

On Mon, Jun 17, 2024 at 3:41 AM  wrote:
>
> From: Pan Li 
>
> When investigate the vectorization of .SAT_ADD,  we notice there
> are additional 2 forms,  aka form 7 and 8 for .SAT_ADD.
>
> Form 7:
>   #define DEF_SAT_U_ADD_FMT_7(T)  \
>   T __attribute__((noinline)) \
>   sat_u_add_##T##_fmt_7 (T x, T y)\
>   {   \
> return x > (T)(x + y) ? -1 : (x + y); \
>   }
>
> Form 8:
>   #define DEF_SAT_U_ADD_FMT_8(T)   \
>   T __attribute__((noinline))  \
>   sat_u_add_##T##_fmt_8 (T x, T y) \
>   {\
> return x <= (T)(x + y) ? (x + y) : -1; \
>   }
>
> Thus,  add above 2 forms to the match gimple_unsigned_integer_sat_add,
> and then the vectorizer can try to recog the pattern like form 7 and
> form 8.
>
> The below test suites are passed for this patch:
> 1. The rv64gcv fully regression test with newlib.
> 2. The rv64gcv build with glibc.
> 3. The x86 bootstrap test.
> 4. The x86 fully regression test.

OK.

Note that fold-const.cc has canonicalization for the minus one to be put last:

  /* If the second operand is simpler than the third, swap them
 since that produces better jump optimization results.  */
  if (truth_value_p (TREE_CODE (arg0))
  && tree_swap_operands_p (op1, op2))
{
  location_t loc0 = expr_location_or (arg0, loc);
  /* See if this can be inverted.  If it can't, possibly because
 it was a floating-point inequality comparison, don't do
 anything.  */
  tem = fold_invert_truthvalue (loc0, arg0);
  if (tem)
return fold_build3_loc (loc, code, type, tem, op2, op1);

we might want to consider such transform in match.pd, in this case this
would allow to elide one of the patterns.

Richard.

> gcc/ChangeLog:
>
> * match.pd: Add form 7 and 8 for the unsigned .SAT_ADD match.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/match.pd | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 99968d316ed..aae6d30a5e4 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3144,6 +3144,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (cond^ (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)
>integer_minus_onep (usadd_left_part_2 @0 @1)))
>
> +/* Unsigned saturation add, case 7 (branch with le):
> +   SAT_ADD = x <= (X + Y) ? (X + Y) : -1.  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (le @0 (usadd_left_part_1@2 @0 @1)) @2 integer_minus_onep))
> +
> +/* Unsigned saturation add, case 8 (branch with gt):
> +   SAT_ADD = x > (X + Y) ? -1 : (X + Y).  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (gt @0 (usadd_left_part_1@2 @0 @1)) integer_minus_onep @2))
> +
>  /* Unsigned saturation sub, case 1 (branch with gt):
> SAT_U_SUB = X > Y ? X - Y : 0  */
>  (match (unsigned_integer_sat_sub @0 @1)
> --
> 2.34.1
>


Re: [PATCH v1] Match: Support forms 7 and 8 for the unsigned .SAT_ADD

2024-06-18 Thread Richard Biener
On Mon, Jun 17, 2024 at 3:41 AM  wrote:
>
> From: Pan Li 
>
> When investigate the vectorization of .SAT_ADD,  we notice there
> are additional 2 forms,  aka form 7 and 8 for .SAT_ADD.
>
> Form 7:
>   #define DEF_SAT_U_ADD_FMT_7(T)  \
>   T __attribute__((noinline)) \
>   sat_u_add_##T##_fmt_7 (T x, T y)\
>   {   \
> return x > (T)(x + y) ? -1 : (x + y); \
>   }
>
> Form 8:
>   #define DEF_SAT_U_ADD_FMT_8(T)   \
>   T __attribute__((noinline))  \
>   sat_u_add_##T##_fmt_8 (T x, T y) \
>   {\
> return x <= (T)(x + y) ? (x + y) : -1; \
>   }
>
> Thus,  add above 2 forms to the match gimple_unsigned_integer_sat_add,
> and then the vectorizer can try to recog the pattern like form 7 and
> form 8.
>
> The below test suites are passed for this patch:
> 1. The rv64gcv fully regression test with newlib.
> 2. The rv64gcv build with glibc.
> 3. The x86 bootstrap test.
> 4. The x86 fully regression test.

OK.

Note that fold-const.cc has canonicalization for the minus one to be put last:

  /* If the second operand is simpler than the third, swap them
 since that produces better jump optimization results.  */
  if (truth_value_p (TREE_CODE (arg0))
  && tree_swap_operands_p (op1, op2))
{
  location_t loc0 = expr_location_or (arg0, loc);
  /* See if this can be inverted.  If it can't, possibly because
 it was a floating-point inequality comparison, don't do
 anything.  */
  tem = fold_invert_truthvalue (loc0, arg0);
  if (tem)
return fold_build3_loc (loc, code, type, tem, op2, op1);

we might want to consider such transform in match.pd, in this case this
would allow to elide one of the patterns.

Richard.

> gcc/ChangeLog:
>
> * match.pd: Add form 7 and 8 for the unsigned .SAT_ADD match.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/match.pd | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 99968d316ed..aae6d30a5e4 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3144,6 +3144,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (cond^ (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)
>integer_minus_onep (usadd_left_part_2 @0 @1)))
>
> +/* Unsigned saturation add, case 7 (branch with le):
> +   SAT_ADD = x <= (X + Y) ? (X + Y) : -1.  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (le @0 (usadd_left_part_1@2 @0 @1)) @2 integer_minus_onep))
> +
> +/* Unsigned saturation add, case 8 (branch with gt):
> +   SAT_ADD = x > (X + Y) ? -1 : (X + Y).  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (gt @0 (usadd_left_part_1@2 @0 @1)) integer_minus_onep @2))
> +
>  /* Unsigned saturation sub, case 1 (branch with gt):
> SAT_U_SUB = X > Y ? X - Y : 0  */
>  (match (unsigned_integer_sat_sub @0 @1)
> --
> 2.34.1
>


[PATCH v1] Match: Support forms 7 and 8 for the unsigned .SAT_ADD

2024-06-16 Thread pan2 . li
From: Pan Li 

When investigate the vectorization of .SAT_ADD,  we notice there
are additional 2 forms,  aka form 7 and 8 for .SAT_ADD.

Form 7:
  #define DEF_SAT_U_ADD_FMT_7(T)  \
  T __attribute__((noinline)) \
  sat_u_add_##T##_fmt_7 (T x, T y)\
  {   \
return x > (T)(x + y) ? -1 : (x + y); \
  }

Form 8:
  #define DEF_SAT_U_ADD_FMT_8(T)   \
  T __attribute__((noinline))  \
  sat_u_add_##T##_fmt_8 (T x, T y) \
  {\
return x <= (T)(x + y) ? (x + y) : -1; \
  }

Thus,  add above 2 forms to the match gimple_unsigned_integer_sat_add,
and then the vectorizer can try to recog the pattern like form 7 and
form 8.

The below test suites are passed for this patch:
1. The rv64gcv fully regression test with newlib.
2. The rv64gcv build with glibc.
3. The x86 bootstrap test.
4. The x86 fully regression test.

gcc/ChangeLog:

* match.pd: Add form 7 and 8 for the unsigned .SAT_ADD match.

Signed-off-by: Pan Li 
---
 gcc/match.pd | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index 99968d316ed..aae6d30a5e4 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3144,6 +3144,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (cond^ (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)
   integer_minus_onep (usadd_left_part_2 @0 @1)))
 
+/* Unsigned saturation add, case 7 (branch with le):
+   SAT_ADD = x <= (X + Y) ? (X + Y) : -1.  */
+(match (unsigned_integer_sat_add @0 @1)
+ (cond^ (le @0 (usadd_left_part_1@2 @0 @1)) @2 integer_minus_onep))
+
+/* Unsigned saturation add, case 8 (branch with gt):
+   SAT_ADD = x > (X + Y) ? -1 : (X + Y).  */
+(match (unsigned_integer_sat_add @0 @1)
+ (cond^ (gt @0 (usadd_left_part_1@2 @0 @1)) integer_minus_onep @2))
+
 /* Unsigned saturation sub, case 1 (branch with gt):
SAT_U_SUB = X > Y ? X - Y : 0  */
 (match (unsigned_integer_sat_sub @0 @1)
-- 
2.34.1