> Since you are using ssmul here as well please drop the fixed point
> limitation on that optab as well.

I see, that makes sense to me.

Thanks Richard, I will commit with that change if the backend is OK.

Pan

-----Original Message-----
From: Richard Biener <richard.guent...@gmail.com>
Sent: Friday, July 4, 2025 5:58 PM
To: Li, Pan2 <pan2...@intel.com>
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; rdapp....@gmail.com; Chen, Ken <ken.c...@intel.com>; Liu, Hongtao <hongtao....@intel.com>
Subject: Re: [PATCH v3 1/4] Internal-fn: Introduce new IFN_SAT_MUL for unsigned int

On Wed, Jul 2, 2025 at 7:31 AM <pan2...@intel.com> wrote:
>
> From: Pan Li <pan2...@intel.com>
>
> This patch would like to add the middle-end presentation for the
> unsigned saturation mul.  Aka set the result of mul to the max
> when overflow.
>
> Take uint8_t as example, we will have:
>
> * SAT_MUL (1, 127) => 127.
> * SAT_MUL (2, 127) => 254.
> * SAT_MUL (3, 127) => 255.
> * SAT_MUL (255, 127) => 255.
>
> Given below example for uint16_t from uint128_t
>
> #define DEF_SAT_U_MUL_FMT_1(NT, WT)             \
> NT __attribute__((noinline))                    \
> sat_u_mul_##NT##_from_##WT##_fmt_1 (NT a, NT b) \
> {                                               \
>   WT x = (WT)a * (WT)b;                         \
>   NT max = -1;                                  \
>   if (x > (WT)(max))                            \
>     return max;                                 \
>   else                                          \
>     return (NT)x;                               \
> }
>
> DEF_SAT_U_MUL_FMT_1(uint16_t, uint128_t)
>
> Before this patch:
>   15 │   <bb 2> [local count: 1073741824]:
>   16 │   _1 = (__int128 unsigned) a_4(D);
>   17 │   _2 = (__int128 unsigned) b_5(D);
>   18 │   _9 = (unsigned long) _1;
>   19 │   _10 = (unsigned long) _2;
>   20 │   x_6 = _9 w* _10;
>   21 │   _7 = MIN_EXPR <x_6, 255>;
>   22 │   _3 = (uint8_t) _7;
>   23 │   return _3;
>
> After this patch:
>    9 │   <bb 2> [local count: 1073741824]:
>   10 │   _3 = .SAT_MUL (a_4(D), b_5(D)); [tail call]
>   11 │   return _3;
>
> gcc/ChangeLog:
>
>         * internal-fn.cc (commutative_binary_fn_p): Add new case
>         for SAT_MUL.
>         * internal-fn.def (SAT_MUL): Add new IFN_SAT_MUL.
>         * optabs.def (OPTAB_NL): Remove fixed point limitation.
>
> Signed-off-by: Pan Li <pan2...@intel.com>
> ---
>  gcc/internal-fn.cc  | 1 +
>  gcc/internal-fn.def | 1 +
>  gcc/optabs.def      | 2 +-
>  3 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 3f4ac937367..184f72132cc 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -4423,6 +4423,7 @@ commutative_binary_fn_p (internal_fn fn)
>      case IFN_ADD_OVERFLOW:
>      case IFN_MUL_OVERFLOW:
>      case IFN_SAT_ADD:
> +    case IFN_SAT_MUL:
>      case IFN_VEC_WIDEN_PLUS:
>      case IFN_VEC_WIDEN_PLUS_LO:
>      case IFN_VEC_WIDEN_PLUS_HI:
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index 8edfa3540f8..914ee9f278c 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -282,6 +282,7 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW, first,
>
>  DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST, first, ssadd, usadd, binary)
>  DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_SUB, ECF_CONST, first, sssub, ussub, binary)
> +DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_MUL, ECF_CONST, first, ssmul, usmul, binary)

Since you are using ssmul here as well please drop the fixed point
limitation on that optab as well.

OK with that change.

Richard.

>
>  DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_TRUNC, ECF_CONST, first, sstrunc, ustrunc, unary_convert)
>
> diff --git a/gcc/optabs.def b/gcc/optabs.def
> index 0c1435d4ecd..ea049378112 100644
> --- a/gcc/optabs.def
> +++ b/gcc/optabs.def
> @@ -135,7 +135,7 @@ OPTAB_NX(smul_optab, "mul$F$a3")
>  OPTAB_VL(smulv_optab, "mulv$I$a3", MULT, "mul", '3', gen_intv_fp_libfunc)
>  OPTAB_VX(smulv_optab, "mul$F$a3")
>  OPTAB_NL(ssmul_optab, "ssmul$Q$a3", SS_MULT, "ssmul", '3', gen_signed_fixed_libfunc)
> -OPTAB_NL(usmul_optab, "usmul$Q$a3", US_MULT, "usmul", '3', gen_unsigned_fixed_libfunc)
> +OPTAB_NL(usmul_optab, "usmul$a3", US_MULT, "usmul", '3', gen_unsigned_fixed_libfunc)
>  OPTAB_NL(sdiv_optab, "div$a3", DIV, "div", '3', gen_int_fp_signed_fixed_libfunc)
>  OPTAB_VL(sdivv_optab, "divv$I$a3", DIV, "divv", '3', gen_int_libfunc)
>  OPTAB_VX(sdivv_optab, "div$F$a3")
> --
> 2.43.0
>
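
For reference, the uint8_t cases listed in the commit message can be
checked with a small standalone sketch of the same widen, multiply and
clamp form.  The names sat_u_mul_u8 and sat_u_mul_u8_selfcheck are
hypothetical and not part of the patch, and uint16_t as the wide type is
only an assumption chosen so the exact product fits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: unsigned saturating multiply
   for uint8_t, written like DEF_SAT_U_MUL_FMT_1 with NT = uint8_t and
   WT = uint16_t (an assumed wide type).  */
static uint8_t
sat_u_mul_u8 (uint8_t a, uint8_t b)
{
  /* 255 * 255 = 65025, so the exact product fits in 16 bits.  */
  uint16_t x = (uint16_t) a * (uint16_t) b;
  uint8_t max = -1; /* All bits set, i.e. 255.  */

  if (x > (uint16_t) max)
    return max; /* Overflow: clamp to the maximum value.  */
  else
    return (uint8_t) x;
}

/* Hypothetical self-check covering the four cases from the commit
   message.  */
static void
sat_u_mul_u8_selfcheck (void)
{
  assert (sat_u_mul_u8 (1, 127) == 127);
  assert (sat_u_mul_u8 (2, 127) == 254);
  assert (sat_u_mul_u8 (3, 127) == 255);   /* 381 saturates.  */
  assert (sat_u_mul_u8 (255, 127) == 255); /* 32385 saturates.  */
}
```

This only illustrates the source-level semantics; whether a given target
expands .SAT_MUL directly depends on the backend providing the usmul
pattern for the mode.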