On Sun, Oct 26, 2025 at 11:58 AM Li, Pan2 <[email protected]> wrote:
>
> Hi Richard,
>
> I had a try for 2 parts those days.
>
> 1. Remove outer convert?, it will make the pattern failed to match as the 
> type is signed
>      without that convert, and the types_match (type, @0, @1) to be false.
> 2.  Merge 2 into one pattern with (convert?@4 @0) but it will also result in
>      pattern match fail(when NT is uint32_t and WT is uint64_t).
>      Because the types_match (type, capture[0], capture[1]) will be false, as 
> type is SI and capture is DI.
>
> So I think it is possible to separate the common part into a helper pattern 
> like unsigned_sat_mul_helper.
> And make the real unsigned_integer_sat_mul(@0, @1) for the first one, and 
> unsigned_integer_sat_mul((convert @0), (convert @1)
> for the second one.
> Then we don't need to duplicate most of the patterns up to a point. How do 
> you think of it? Thanks a lot.

Maybe we can first get one of the patterns to a form I understand (the
one with the many conversions)?
Splitting out parts might only obfuscate things.  It would be maybe
nice to be able to merge different
alternatives within the same (match ..) sharing both the name and the
conditions, like

(match (foo @0 @1)
 (convert (mult @0 @1))
 (widen_mult @0 @1)
 (if (...)))

If you can structure the two patterns in a way that the (if ...) part
is identical that would help.

Some question:

+   (convert?
+    (bit_ior (negate (convert (gt @3 INTEGER_CST@2)))
+            (convert (mult_op:c@3 (convert@4 @0) (convert@5 @1)))))

So I get that we want a widening multiplication but that's either a widen_mult
already or not.  We might want to have a

(match
 (widening_mult @0 @1)
 (widen_mult @0 @1))
(match
 (widening_mult @0 @1)
 (mult (convert @0) (convert @1))
 (if (it is widening))

to help merging.  Now for the question - part of the pattern is

  (convert? (bit_ior (negate (convert from-bool)) (convert (widening_mult ...)))

I wonder whether you've seen the outer conversion being moved inside the IOR?
At least I wonder why there's the (convert ...) around the widening_mult and
why that's not conditional?  The textual description of the pattern mentions
(NT)x | (NT)overflow_p, so the result should already be in NT, no
outer conversion
required?  That said, (NT)((WT)x | (WT)overflow_p) would be equally valid,
but we fold that inside the IOR?  But not always?

Thanks,
Richard.

> Pan
>
> -----Original Message-----
> From: Li, Pan2
> Sent: Friday, October 24, 2025 9:52 AM
> To: 'Richard Biener' <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; 
> [email protected]; [email protected]; Chen, Ken <[email protected]>; 
> Liu, Hongtao <[email protected]>; [email protected]
> Subject: RE: [PATCH v1 1/2] Match: Add unsigned SAT_MUL for form 7
>
> Thanks Richard for comments.
>
> Assume we talk about the form 7 as below:
>
> #define DEF_SAT_U_MUL_FMT_7(NT, WT)             \
> NT __attribute__((noinline))                    \
> sat_u_mul_##NT##_from_##WT##_fmt_7 (NT a, NT b) \
> {                                               \
>   WT x = (WT)a * (WT)b;                         \
>   NT max = -1;                                  \
>   bool overflow_p = x > (WT)(max);              \
>   return -(NT)(overflow_p) | (NT)x;             \
> }
>
> > So why is matching the conversion necessary at all, that is, why is (mult 
> > @0 @1)
> > not sufficient here?
>
> Because there are many different types combinations, unsigned SAT_MUL need 
> the help of a wider type.
> For example:
>
> If NT is uint32_t, WT is uint128_t, we need the convert before widen_mul:
>   21   │   _12 = (unsigned long) a_6(D);
>   22   │   _13 = (unsigned long) b_7(D);
>   23   │   x_8 = _12 w* _13;
>
> If NT is uint64_t, WT is uint128_t, we don't have the convert.
>   19   │   x_8 = a_6(D) w* b_7(D);
>
> > You are not looking at the types of @0 or @1 at
> > all, AFAICS
> > one could be 'char' and one could be 'short'?
>
> The type is restricted by (if (types_match (type, @0, @1)). Aka the result 
> must have the same type as @0 and @1.
>
> > Captures on optional nodes are "merged" to the operand, so for (convert?@4 
> > @0)
> > when there is no conversion @4 == @0.  That is it behaves as if there were 
> > two
> > captures at the place of @0.
>
> I see, sounds like alias here, thanks for the explanation.
>
> > But isn't _4 | _5 still a sat_u_mul-8-u8 but with a conversion to the
> > desired type?
> > That is, I wonder if even when the conversion is missing, the
> > saturated operation
> > can be matched by using (original-type)SAT_UXYZ (...), implying the SAT_UXYZ
> > produces a uint8_t (in this case)?  Not sure I phrased this in an
> > understandable way ;)
>
> > I'll note that the outer conversion is poorly constrained given the 
> > conversion
> > around the multiplication/inside the negate is what determines the
> > semantics of it
> > and that's not constrained either?
>
> For NT is uint8_t and WT is uint128_t, we have the final convert from gimple. 
> The bigger types mentioned in above
> don't have such convert.  The _11 and _6 should be almost the same, maybe 
> nop_convert here is good enough if it failed
> to match that.  I will have a try and keep you posted.
>
>   14   │   signed char _6;
>   15   │   uint8_t _11;
>  ...
>   26   │   _3 = (signed char) overflow_p_10;
>   27   │   _4 = -_3;
>   28   │   _5 = (signed char) x_9;
>   29   │   _6 = _4 | _5;
>   30   │   _11 = (uint8_t) _6;
>   31   │   return _11;
>
> Pan
>
>
> -----Original Message-----
> From: Richard Biener <[email protected]>
> Sent: Thursday, October 23, 2025 7:08 PM
> To: Li, Pan2 <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; 
> [email protected]; [email protected]; Chen, Ken <[email protected]>; 
> Liu, Hongtao <[email protected]>; [email protected]
> Subject: Re: [PATCH v1 1/2] Match: Add unsigned SAT_MUL for form 7
>
> On Wed, Oct 22, 2025 at 2:21 PM Li, Pan2 <[email protected]> wrote:
> >
> > Thanks Richard for comments.
> >
> > > I'm a bit confused as to this for and the explicit widen_mult case in the 
> > > 2nd
> > > pattern below.  In fact, the patterns look similar enough to me and I 
> > > wonder
> > > whether the consumer can handle the outer conversion and the conversion
> > > of the leafs of the multiplication?  The conversion of the leafs doesn't 
> > > have
> > > any constraints, nor is it obvious why there's an outer conversion around 
> > > the
> > > IOR.
> >
> > The two pattern looks similar up to a point. I separate them into 2 mostly 
> > comes from
> > the inner convert about mult, aka (mult (convert@4 @0) (convert@5 @1)). 
> > Some diff
> > type combination need to convert while others are not.
> >
> > I tried to leverage something like (mult (convert?@4 @0) (convert?@5 @1)) 
> > to put
> > them together. But it may has additional sematics because add '?' after 
> > convert will
> > cover (mult (convert@4 @0) @1) which is not expected.
>
> So why is matching the conversion necessary at all, that is, why is (mult @0 
> @1)
> not sufficient here?  You are not looking at the types of @0 or @1 at
> all, AFAICS
> one could be 'char' and one could be 'short'?
>
> > Another problem about add '?' after convert is capture @4/@5, if there is 
> > no need
> > to convert, I am not sure how to take care of the @4 from the “with“” scope 
> > because it
> > is optional. The code acts on capture @4/@5 is not correct when there is no 
> > convert.
>
> Captures on optional nodes are "merged" to the operand, so for (convert?@4 @0)
> when there is no conversion @4 == @0.  That is it behaves as if there were two
> captures at the place of @0.
>
> > I may have another try to merge them into one, but is there any suggestion 
> > here?
> >
> > > outer conversion around the
> > > IOR.
> >
> > Some type(s) may need to do a convert to the final type, like 
> > sat_u_mul-8-u8-from-u32,
> > The seq may look like blow.
> >
> >   25   │   _6 = _4 | _5;
> >   26   │   _11 = (uint8_t) _6;
> >   27   │   return _11;
>
> But isn't _4 | _5 still a sat_u_mul-8-u8 but with a conversion to the
> desired type?
> That is, I wonder if even when the conversion is missing, the
> saturated operation
> can be matched by using (original-type)SAT_UXYZ (...), implying the SAT_UXYZ
> produces a uint8_t (in this case)?  Not sure I phrased this in an
> understandable way ;)
>
> I'll note that the outer conversion is poorly constrained given the conversion
> around the multiplication/inside the negate is what determines the
> semantics of it
> and that's not constrained either?
>
> Richard.
>
> > Pan
> >
> > -----Original Message-----
> > From: Richard Biener <[email protected]>
> > Sent: Wednesday, October 22, 2025 5:00 PM
> > To: Li, Pan2 <[email protected]>
> > Cc: [email protected]; [email protected]; [email protected]; 
> > [email protected]; [email protected]; Chen, Ken <[email protected]>; 
> > Liu, Hongtao <[email protected]>; [email protected]
> > Subject: Re: [PATCH v1 1/2] Match: Add unsigned SAT_MUL for form 7
> >
> > On Mon, Oct 20, 2025 at 3:10 PM <[email protected]> wrote:
> > >
> > > From: Pan Li <[email protected]>
> > >
> > > This patch would like to try to match the the unsigned
> > > SAT_MUL form 7, aka below:
> > >
> > >   #define DEF_SAT_U_MUL_FMT_7(NT, WT)             \
> > >   NT __attribute__((noinline))                    \
> > >   sat_u_mul_##NT##_from_##WT##_fmt_7 (NT a, NT b) \
> > >   {                                               \
> > >     WT x = (WT)a * (WT)b;                         \
> > >     NT max = -1;                                  \
> > >     bool overflow_p = x > (WT)(max);              \
> > >     return -(NT)(overflow_p) | (NT)x;             \
> > >   }
> > >
> > > while WT is uint128_t, uint64_t, uint32_t and uint16_t, and
> > > NT is uint64_t, uint32_t, uint16_t or uint8_t.
> > >
> > > gcc/ChangeLog:
> > >
> > >         * match.pd: Add pattern for SAT_MUL form 5 include
> > >         mul and widen_mul.
> > >
> > > Signed-off-by: Pan Li <[email protected]>
> > > ---
> > >  gcc/match.pd | 35 +++++++++++++++++++++++++++++++++++
> > >  1 file changed, 35 insertions(+)
> > >
> > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > index bfc51e6579a..0f55a82d989 100644
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -3749,6 +3749,41 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >        bool widen_mult_p = prec * 2 == widen_prec;
> > >       }
> > >       (if (c2_is_type_precision_p && widen_mult_p)))))
> > > + (for mult_op (mult widen_mult)
> >
> > I'm a bit confused as to this for and the explicit widen_mult case in the 
> > 2nd
> > pattern below.  In fact, the patterns look similar enough to me and I wonder
> > whether the consumer can handle the outer conversion and the conversion
> > of the leafs of the multiplication?  The conversion of the leafs doesn't 
> > have
> > any constraints, nor is it obvious why there's an outer conversion around 
> > the
> > IOR.
> >
> > > +  (match (unsigned_integer_sat_mul @0 @1)
> > > +   (convert?
> > > +    (bit_ior (negate (convert (gt @3 INTEGER_CST@2)))
> > > +            (convert (mult_op:c@3 (convert@4 @0) (convert@5 @1)))))
> > > +   (if (types_match (type, @0, @1))
> > > +    (with
> > > +     {
> > > +      unsigned prec = TYPE_PRECISION (type);
> > > +      unsigned widen_prec = TYPE_PRECISION (TREE_TYPE (@3));
> > > +      unsigned cvt4_prec = TYPE_PRECISION (TREE_TYPE (@4));
> > > +      unsigned cvt5_prec = TYPE_PRECISION (TREE_TYPE (@5));
> > > +
> > > +      wide_int max = wi::mask (prec, false, widen_prec);
> > > +      bool c2_is_max_p = wi::eq_p (wi::to_wide (@2), max);
> > > +      bool widen_mult_p = mult_op == WIDEN_MULT_EXPR && cvt4_prec == 
> > > cvt5_prec
> > > +       && widen_prec == cvt5_prec * 2;
> > > +      bool mult_p = mult_op == MULT_EXPR && cvt4_prec == cvt5_prec
> > > +       && cvt4_prec == widen_prec && widen_prec > prec;
> > > +     }
> > > +     (if (c2_is_max_p && (mult_p || widen_mult_p)))))))
> > > + (match (unsigned_integer_sat_mul @0 @1)
> > > +  (bit_ior (negate (convert (gt @3 INTEGER_CST@2)))
> > > +          (convert (widen_mult:c@3 @0 @1)))
> > > +  (if (types_match (type, @0, @1))
> > > +   (with
> > > +    {
> > > +     unsigned prec = TYPE_PRECISION (type);
> > > +     unsigned widen_prec = TYPE_PRECISION (TREE_TYPE (@3));
> > > +
> > > +     wide_int max = wi::mask (prec, false, widen_prec);
> > > +     bool c2_is_max_p = wi::eq_p (wi::to_wide (@2), max);
> > > +     bool widen_mult_p = prec * 2 == widen_prec;
> > > +    }
> > > +    (if (c2_is_max_p && widen_mult_p)))))
> > >  )
> > >
> > >  /* The boundary condition for case 10: IMM = 1:
> > > --
> > > 2.43.0
> > >

Reply via email to