Re: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-07-12 Thread Richard Sandiford via Gcc-patches
Tamar Christina  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: Monday, July 12, 2021 11:26 AM
>> To: Tamar Christina 
>> Cc: Richard Biener ; nd ; gcc-
>> patc...@gcc.gnu.org
>> Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
>> where the sign for the multiplicant changes.
>> 
>> Tamar Christina  writes:
>> >> -Original Message-
>> >> From: Richard Sandiford 
>> >> Sent: Monday, July 12, 2021 10:39 AM
>> >> To: Tamar Christina 
>> >> Cc: Richard Biener ; nd ; gcc-
>> >> patc...@gcc.gnu.org
>> >> Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
>> >> where the sign for the multiplicant changes.
>> >>
>> >> Tamar Christina  writes:
>> >> > Hi,
>> >> >
>> >> >> Richard Sandiford  writes:
>> >> >> >> @@ -992,21 +1029,27 @@ vect_recog_dot_prod_pattern (vec_info
>> >> >> *vinfo,
>> >> >> >>/* FORNOW.  Can continue analyzing the def-use chain when
>> >> >> >> this stmt in
>> >> >> a phi
>> >> >> >>   inside the loop (in case we are analyzing an outer-loop).  */
>> >> >> >>vect_unpromoted_value unprom0[2];
>> >> >> >> +  enum optab_subtype subtype = optab_vector;
>> >> >> >>if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR,
>> >> >> WIDEN_MULT_EXPR,
>> >> >> >> -false, 2, unprom0, _type))
>> >> >> >> +false, 2, unprom0, _type, ))
>> >> >> >> +return NULL;
>> >> >> >> +
>> >> >> >> +  if (subtype == optab_vector_mixed_sign
>> >> >> >> +  && TYPE_UNSIGNED (unprom_mult.type)
>> >> >> >> +  && TYPE_PRECISION (half_type) * 4 > TYPE_PRECISION
>> >> >> >> + (unprom_mult.type))
>> >> >> >>  return NULL;
>> >> >> >
>> >> >> > Isn't the final condition here instead that TYPE1 is narrower than
>> TYPE2?
>> >> >> > I.e. we need to reject the case in which we multiply a signed
>> >> >> > and an unsigned value to get a (logically) signed result, but
>> >> >> > then zero-extend it (rather than sign-extend it) to the
>> >> >> > precision of the
>> >> addition.
>> >> >> >
>> >> >> > That would make the test:
>> >> >> >
>> >> >> >   if (subtype == optab_vector_mixed_sign
>> >> >> >   && TYPE_UNSIGNED (unprom_mult.type)
>> >> >> >   && TYPE_PRECISION (unprom_mult.type) < TYPE_PRECISION
>> (type))
>> >> >> > return NULL;
>> >> >> >
>> >> >> > instead.
>> >> >>
>> >> >> And folding that into the existing test gives:
>> >> >>
>> >> >>   /* If there are two widening operations, make sure they agree on
>> >> >> the
>> >> sign
>> >> >>  of the extension.  The result of an optab_vector_mixed_sign
>> operation
>> >> >>  is signed; otherwise, the result has the same sign as the 
>> >> >> operands.
>> */
>> >> >>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
>> >> >>   && (subtype == optab_vector_mixed_sign
>> >> >>  ? TYPE_UNSIGNED (unprom_mult.type)
>> >> >>  : TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
>> >> >> return NULL;
>> >> >>
>> >> >
>> >> > I went with the first one which doesn't add the extra constraints
>> >> > for the normal dotproduct as that makes it too restrictive. It's
>> >> > the type of the multiplication that determines the operation so
>> >> > dotproduct can be used a bit more than where we currently do.
>> >> >
>> >> > This was relaxed in an earlier patch.
>> >>
>> >> I didn't mean that we should add extra constraints to the normal case
>> though.
>> >> The existing test I was referring to above was:
>> >>
>> >>   /* If there are two widening operations, make sure they agree on
>> >>  the sign of the extension.  */
>> >>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
>> >>   && TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type))
>> >> return NULL;
>> >
>> > But as I mentioned, this restriction is unneeded and has been removed
>> hence why it's not in my patchset's diff.
>> > It's removed by
>> > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569851.html which
>> Richi conditioned on the rest of these patches being approved.
>> >
>> > This change needlessly blocks test vect-reduc-dot-[2,3,6,7].c from
>> > being dotproducts for instance
>> >
>> > It's also part of the deficiency between GCC codegen and Clang
>> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88492#c6
>> 
>> Hmm, OK.  Just removing the check regresses:
>> 
>> unsigned long __attribute__ ((noipa))
>> f (signed short *x, signed short *y)
>> {
>>   unsigned long res = 0;
>>   for (int i = 0; i < 100; ++i)
>> res += (unsigned int) x[i] * (unsigned int) y[i];
>>   return res;
>> }
>> 
>> int
>> main (void)
>> {
>>   signed short x[100], y[100];
>>   for (int i = 0; i < 100; ++i)
>> {
>>   x[i] = -1;
>>   y[i] = 1;
>> }
>>   if (f (x, y) != 0x64ULL - 100)
>> __builtin_abort ();
>>   return 0;
>> }
>> 
>> on SVE.  We then use SDOT even though the result of the multiplication is
>> zero- rather than sign-extended to 64 bits.  Does something else in the 
>> 

RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-07-12 Thread Tamar Christina via Gcc-patches


> -Original Message-
> From: Richard Sandiford 
> Sent: Monday, July 12, 2021 11:26 AM
> To: Tamar Christina 
> Cc: Richard Biener ; nd ; gcc-
> patc...@gcc.gnu.org
> Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
> where the sign for the multiplicant changes.
> 
> Tamar Christina  writes:
> >> -Original Message-
> >> From: Richard Sandiford 
> >> Sent: Monday, July 12, 2021 10:39 AM
> >> To: Tamar Christina 
> >> Cc: Richard Biener ; nd ; gcc-
> >> patc...@gcc.gnu.org
> >> Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
> >> where the sign for the multiplicant changes.
> >>
> >> Tamar Christina  writes:
> >> > Hi,
> >> >
> >> >> Richard Sandiford  writes:
> >> >> >> @@ -992,21 +1029,27 @@ vect_recog_dot_prod_pattern (vec_info
> >> >> *vinfo,
> >> >> >>/* FORNOW.  Can continue analyzing the def-use chain when
> >> >> >> this stmt in
> >> >> a phi
> >> >> >>   inside the loop (in case we are analyzing an outer-loop).  */
> >> >> >>vect_unpromoted_value unprom0[2];
> >> >> >> +  enum optab_subtype subtype = optab_vector;
> >> >> >>if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR,
> >> >> WIDEN_MULT_EXPR,
> >> >> >> - false, 2, unprom0, _type))
> >> >> >> + false, 2, unprom0, _type, ))
> >> >> >> +return NULL;
> >> >> >> +
> >> >> >> +  if (subtype == optab_vector_mixed_sign
> >> >> >> +  && TYPE_UNSIGNED (unprom_mult.type)
> >> >> >> +  && TYPE_PRECISION (half_type) * 4 > TYPE_PRECISION
> >> >> >> + (unprom_mult.type))
> >> >> >>  return NULL;
> >> >> >
> >> >> > Isn't the final condition here instead that TYPE1 is narrower than
> TYPE2?
> >> >> > I.e. we need to reject the case in which we multiply a signed
> >> >> > and an unsigned value to get a (logically) signed result, but
> >> >> > then zero-extend it (rather than sign-extend it) to the
> >> >> > precision of the
> >> addition.
> >> >> >
> >> >> > That would make the test:
> >> >> >
> >> >> >   if (subtype == optab_vector_mixed_sign
> >> >> >   && TYPE_UNSIGNED (unprom_mult.type)
> >> >> >   && TYPE_PRECISION (unprom_mult.type) < TYPE_PRECISION
> (type))
> >> >> > return NULL;
> >> >> >
> >> >> > instead.
> >> >>
> >> >> And folding that into the existing test gives:
> >> >>
> >> >>   /* If there are two widening operations, make sure they agree on
> >> >> the
> >> sign
> >> >>  of the extension.  The result of an optab_vector_mixed_sign
> operation
> >> >>  is signed; otherwise, the result has the same sign as the operands.
> */
> >> >>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
> >> >>   && (subtype == optab_vector_mixed_sign
> >> >>   ? TYPE_UNSIGNED (unprom_mult.type)
> >> >>   : TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
> >> >> return NULL;
> >> >>
> >> >
> >> > I went with the first one which doesn't add the extra constraints
> >> > for the normal dotproduct as that makes it too restrictive. It's
> >> > the type of the multiplication that determines the operation so
> >> > dotproduct can be used a bit more than where we currently do.
> >> >
> >> > This was relaxed in an earlier patch.
> >>
> >> I didn't mean that we should add extra constraints to the normal case
> though.
> >> The existing test I was referring to above was:
> >>
> >>   /* If there are two widening operations, make sure they agree on
> >>  the sign of the extension.  */
> >>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
> >>   && TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type))
> >> return NULL;
> >
> > But as I mentioned, this restriction is unneeded and has been removed
> hence why it's not in my patchset's diff.
> > It's removed by
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569851.html which
> Richi conditioned on the rest of these patches being approved.
> >
> > This change needlessly blocks test vect-reduc-dot-[2,3,6,7].c from
> > being dotproducts for instance
> >
> > It's also part of the deficiency between GCC codegen and Clang
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88492#c6
> 
> Hmm, OK.  Just removing the check regresses:
> 
> unsigned long __attribute__ ((noipa))
> f (signed short *x, signed short *y)
> {
>   unsigned long res = 0;
>   for (int i = 0; i < 100; ++i)
> res += (unsigned int) x[i] * (unsigned int) y[i];
>   return res;
> }
> 
> int
> main (void)
> {
>   signed short x[100], y[100];
>   for (int i = 0; i < 100; ++i)
> {
>   x[i] = -1;
>   y[i] = 1;
> }
>   if (f (x, y) != 0x64ULL - 100)
> __builtin_abort ();
>   return 0;
> }
> 
> on SVE.  We then use SDOT even though the result of the multiplication is
> zero- rather than sign-extended to 64 bits.  Does something else in the series
> stop that from that happening?

No, and I hadn't noticed it before because it looks like the mid-end tests that 
are execution test don't turn on dot-product 

Re: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-07-12 Thread Richard Sandiford via Gcc-patches
Tamar Christina  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: Monday, July 12, 2021 10:39 AM
>> To: Tamar Christina 
>> Cc: Richard Biener ; nd ; gcc-
>> patc...@gcc.gnu.org
>> Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
>> where the sign for the multiplicant changes.
>> 
>> Tamar Christina  writes:
>> > Hi,
>> >
>> >> Richard Sandiford  writes:
>> >> >> @@ -992,21 +1029,27 @@ vect_recog_dot_prod_pattern (vec_info
>> >> *vinfo,
>> >> >>/* FORNOW.  Can continue analyzing the def-use chain when this
>> >> >> stmt in
>> >> a phi
>> >> >>   inside the loop (in case we are analyzing an outer-loop).  */
>> >> >>vect_unpromoted_value unprom0[2];
>> >> >> +  enum optab_subtype subtype = optab_vector;
>> >> >>if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR,
>> >> WIDEN_MULT_EXPR,
>> >> >> -   false, 2, unprom0, _type))
>> >> >> +   false, 2, unprom0, _type, ))
>> >> >> +return NULL;
>> >> >> +
>> >> >> +  if (subtype == optab_vector_mixed_sign
>> >> >> +  && TYPE_UNSIGNED (unprom_mult.type)
>> >> >> +  && TYPE_PRECISION (half_type) * 4 > TYPE_PRECISION
>> >> >> + (unprom_mult.type))
>> >> >>  return NULL;
>> >> >
>> >> > Isn't the final condition here instead that TYPE1 is narrower than 
>> >> > TYPE2?
>> >> > I.e. we need to reject the case in which we multiply a signed and
>> >> > an unsigned value to get a (logically) signed result, but then
>> >> > zero-extend it (rather than sign-extend it) to the precision of the
>> addition.
>> >> >
>> >> > That would make the test:
>> >> >
>> >> >   if (subtype == optab_vector_mixed_sign
>> >> >   && TYPE_UNSIGNED (unprom_mult.type)
>> >> >   && TYPE_PRECISION (unprom_mult.type) < TYPE_PRECISION (type))
>> >> > return NULL;
>> >> >
>> >> > instead.
>> >>
>> >> And folding that into the existing test gives:
>> >>
>> >>   /* If there are two widening operations, make sure they agree on the
>> sign
>> >>  of the extension.  The result of an optab_vector_mixed_sign operation
>> >>  is signed; otherwise, the result has the same sign as the operands.  
>> >> */
>> >>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
>> >>   && (subtype == optab_vector_mixed_sign
>> >> ? TYPE_UNSIGNED (unprom_mult.type)
>> >> : TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
>> >> return NULL;
>> >>
>> >
>> > I went with the first one which doesn't add the extra constraints for
>> > the normal dotproduct as that makes it too restrictive. It's the type
>> > of the multiplication that determines the operation so dotproduct can
>> > be used a bit more than where we currently do.
>> >
>> > This was relaxed in an earlier patch.
>> 
>> I didn't mean that we should add extra constraints to the normal case though.
>> The existing test I was referring to above was:
>> 
>>   /* If there are two widening operations, make sure they agree on
>>  the sign of the extension.  */
>>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
>>   && TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type))
>> return NULL;
>
> But as I mentioned, this restriction is unneeded and has been removed hence 
> why it's not in my patchset's diff.
> It's removed by 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569851.html which Richi 
> conditioned on
> the rest of these patches being approved.
>
> This change needlessly blocks test vect-reduc-dot-[2,3,6,7].c from being 
> dotproducts for instance
>
> It's also part of the deficiency between GCC codegen and Clang 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88492#c6

Hmm, OK.  Just removing the check regresses:

unsigned long __attribute__ ((noipa))
f (signed short *x, signed short *y)
{
  unsigned long res = 0;
  for (int i = 0; i < 100; ++i)
res += (unsigned int) x[i] * (unsigned int) y[i];
  return res;
}

int
main (void)
{
  signed short x[100], y[100];
  for (int i = 0; i < 100; ++i)
{
  x[i] = -1;
  y[i] = 1;
}
  if (f (x, y) != 0x64ULL - 100)
__builtin_abort ();
  return 0;
}

on SVE.  We then use SDOT even though the result of the multiplication
is zero- rather than sign-extended to 64 bits.  Does something else
in the series stop that from that happening?

Richard


RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-07-12 Thread Tamar Christina via Gcc-patches



> -Original Message-
> From: Richard Sandiford 
> Sent: Monday, July 12, 2021 10:39 AM
> To: Tamar Christina 
> Cc: Richard Biener ; nd ; gcc-
> patc...@gcc.gnu.org
> Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
> where the sign for the multiplicant changes.
> 
> Tamar Christina  writes:
> > Hi,
> >
> >> Richard Sandiford  writes:
> >> >> @@ -992,21 +1029,27 @@ vect_recog_dot_prod_pattern (vec_info
> >> *vinfo,
> >> >>/* FORNOW.  Can continue analyzing the def-use chain when this
> >> >> stmt in
> >> a phi
> >> >>   inside the loop (in case we are analyzing an outer-loop).  */
> >> >>vect_unpromoted_value unprom0[2];
> >> >> +  enum optab_subtype subtype = optab_vector;
> >> >>if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR,
> >> WIDEN_MULT_EXPR,
> >> >> -false, 2, unprom0, _type))
> >> >> +false, 2, unprom0, _type, ))
> >> >> +return NULL;
> >> >> +
> >> >> +  if (subtype == optab_vector_mixed_sign
> >> >> +  && TYPE_UNSIGNED (unprom_mult.type)
> >> >> +  && TYPE_PRECISION (half_type) * 4 > TYPE_PRECISION
> >> >> + (unprom_mult.type))
> >> >>  return NULL;
> >> >
> >> > Isn't the final condition here instead that TYPE1 is narrower than TYPE2?
> >> > I.e. we need to reject the case in which we multiply a signed and
> >> > an unsigned value to get a (logically) signed result, but then
> >> > zero-extend it (rather than sign-extend it) to the precision of the
> addition.
> >> >
> >> > That would make the test:
> >> >
> >> >   if (subtype == optab_vector_mixed_sign
> >> >   && TYPE_UNSIGNED (unprom_mult.type)
> >> >   && TYPE_PRECISION (unprom_mult.type) < TYPE_PRECISION (type))
> >> > return NULL;
> >> >
> >> > instead.
> >>
> >> And folding that into the existing test gives:
> >>
> >>   /* If there are two widening operations, make sure they agree on the
> sign
> >>  of the extension.  The result of an optab_vector_mixed_sign operation
> >>  is signed; otherwise, the result has the same sign as the operands.  
> >> */
> >>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
> >>   && (subtype == optab_vector_mixed_sign
> >>  ? TYPE_UNSIGNED (unprom_mult.type)
> >>  : TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
> >> return NULL;
> >>
> >
> > I went with the first one which doesn't add the extra constraints for
> > the normal dotproduct as that makes it too restrictive. It's the type
> > of the multiplication that determines the operation so dotproduct can
> > be used a bit more than where we currently do.
> >
> > This was relaxed in an earlier patch.
> 
> I didn't mean that we should add extra constraints to the normal case though.
> The existing test I was referring to above was:
> 
>   /* If there are two widening operations, make sure they agree on
>  the sign of the extension.  */
>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
>   && TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type))
> return NULL;

But as I mentioned, this restriction is unneeded and has been removed hence why 
it's not in my patchset's diff.
It's removed by https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569851.html 
which Richi conditioned on
the rest of these patches being approved.

This change needlessly blocks test vect-reduc-dot-[2,3,6,7].c from being 
dotproducts for instance

It's also part of the deficiency between GCC codegen and Clang 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88492#c6

Regards,
Tamar

> 
> Although this existing test makes sense for the normal case, IMO testing
> TYPE_SIGN (half_type) doesn't make sense for the mixed-sign case.  I think
> we should therefore replace the existing test with:
> 
>   /* If there are two widening operations, make sure they agree on the sign
>  of the extension.  The result of an optab_vector_mixed_sign operation
>  is signed; otherwise, the result has the same sign as the operands.  */
>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
>   && (subtype == optab_vector_mixed_sign
>  ? TYPE_UNSIGNED (unprom_mult.type)
>  : TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
> return NULL;
> 
> rather than add a separate condition for the mixed-sign case.
> The behaviour of the normal case is the same both ways.
> 
> Thanks,
> Richard
> 



Re: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-07-12 Thread Richard Sandiford via Gcc-patches
Tamar Christina  writes:
> Hi,
>
>> Richard Sandiford  writes:
>> >> @@ -992,21 +1029,27 @@ vect_recog_dot_prod_pattern (vec_info
>> *vinfo,
>> >>/* FORNOW.  Can continue analyzing the def-use chain when this stmt in
>> a phi
>> >>   inside the loop (in case we are analyzing an outer-loop).  */
>> >>vect_unpromoted_value unprom0[2];
>> >> +  enum optab_subtype subtype = optab_vector;
>> >>if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR,
>> WIDEN_MULT_EXPR,
>> >> -  false, 2, unprom0, _type))
>> >> +  false, 2, unprom0, _type, ))
>> >> +return NULL;
>> >> +
>> >> +  if (subtype == optab_vector_mixed_sign
>> >> +  && TYPE_UNSIGNED (unprom_mult.type)
>> >> +  && TYPE_PRECISION (half_type) * 4 > TYPE_PRECISION
>> >> + (unprom_mult.type))
>> >>  return NULL;
>> >
>> > Isn't the final condition here instead that TYPE1 is narrower than TYPE2?
>> > I.e. we need to reject the case in which we multiply a signed and an
>> > unsigned value to get a (logically) signed result, but then
>> > zero-extend it (rather than sign-extend it) to the precision of the 
>> > addition.
>> >
>> > That would make the test:
>> >
>> >   if (subtype == optab_vector_mixed_sign
>> >   && TYPE_UNSIGNED (unprom_mult.type)
>> >   && TYPE_PRECISION (unprom_mult.type) < TYPE_PRECISION (type))
>> > return NULL;
>> >
>> > instead.
>> 
>> And folding that into the existing test gives:
>> 
>>   /* If there are two widening operations, make sure they agree on the sign
>>  of the extension.  The result of an optab_vector_mixed_sign operation
>>  is signed; otherwise, the result has the same sign as the operands.  */
>>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
>>   && (subtype == optab_vector_mixed_sign
>>? TYPE_UNSIGNED (unprom_mult.type)
>>: TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
>> return NULL;
>> 
>
> I went with the first one which doesn't add the extra constraints for the
> normal dotproduct as that makes it too restrictive. It's the type of the
> multiplication that determines the operation so dotproduct can be used
> a bit more than where we currently do.
>
> This was relaxed in an earlier patch.

I didn't mean that we should add extra constraints to the normal case
though.  The existing test I was referring to above was:

  /* If there are two widening operations, make sure they agree on
 the sign of the extension.  */
  if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
  && TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type))
return NULL;

Although this existing test makes sense for the normal case, IMO testing
TYPE_SIGN (half_type) doesn't make sense for the mixed-sign case.  I think
we should therefore replace the existing test with:

  /* If there are two widening operations, make sure they agree on the sign
 of the extension.  The result of an optab_vector_mixed_sign operation
 is signed; otherwise, the result has the same sign as the operands.  */
  if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
  && (subtype == optab_vector_mixed_sign
 ? TYPE_UNSIGNED (unprom_mult.type)
 : TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
return NULL;

rather than add a separate condition for the mixed-sign case.
The behaviour of the normal case is the same both ways.

Thanks,
Richard




RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-07-12 Thread Tamar Christina via Gcc-patches
Hi,

> Richard Sandiford  writes:
> >> @@ -992,21 +1029,27 @@ vect_recog_dot_prod_pattern (vec_info
> *vinfo,
> >>/* FORNOW.  Can continue analyzing the def-use chain when this stmt in
> a phi
> >>   inside the loop (in case we are analyzing an outer-loop).  */
> >>vect_unpromoted_value unprom0[2];
> >> +  enum optab_subtype subtype = optab_vector;
> >>if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR,
> WIDEN_MULT_EXPR,
> >> -   false, 2, unprom0, _type))
> >> +   false, 2, unprom0, _type, ))
> >> +return NULL;
> >> +
> >> +  if (subtype == optab_vector_mixed_sign
> >> +  && TYPE_UNSIGNED (unprom_mult.type)
> >> +  && TYPE_PRECISION (half_type) * 4 > TYPE_PRECISION
> >> + (unprom_mult.type))
> >>  return NULL;
> >
> > Isn't the final condition here instead that TYPE1 is narrower than TYPE2?
> > I.e. we need to reject the case in which we multiply a signed and an
> > unsigned value to get a (logically) signed result, but then
> > zero-extend it (rather than sign-extend it) to the precision of the 
> > addition.
> >
> > That would make the test:
> >
> >   if (subtype == optab_vector_mixed_sign
> >   && TYPE_UNSIGNED (unprom_mult.type)
> >   && TYPE_PRECISION (unprom_mult.type) < TYPE_PRECISION (type))
> > return NULL;
> >
> > instead.
> 
> And folding that into the existing test gives:
> 
>   /* If there are two widening operations, make sure they agree on the sign
>  of the extension.  The result of an optab_vector_mixed_sign operation
>  is signed; otherwise, the result has the same sign as the operands.  */
>   if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
>   && (subtype == optab_vector_mixed_sign
> ? TYPE_UNSIGNED (unprom_mult.type)
> : TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
> return NULL;
> 

I went with the first one which doesn't add the extra constraints for the
normal dotproduct as that makes it too restrictive. It's the type of the
multiplication that determines the operation so dotproduct can be used
a bit more than where we currently do.

This was relaxed in an earlier patch.

Updated patch attached.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* optabs.def (usdot_prod_optab): New.
* doc/md.texi: Document it and clarify other dot prod optabs.
* optabs-tree.h (enum optab_subtype): Add optab_vector_mixed_sign.
* optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
* optabs.c (expand_widen_pattern_expr): Likewise.
* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
* tree-vect-loop.c (vectorizable_reduction): Query dot-product kind.
* tree-vect-patterns.c (vect_supportable_direct_optab_p): Take optional
optab subtype.
(vect_widened_op_tree): Optionally ignore
mismatch types.
(vect_recog_dot_prod_pattern): Support usdot_prod_optab.

 Inline copy of patch 

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 
1b91814433057b1b377283fd1f40cb970dc3d243..323ba8eab78e2b2e582fa0633752930182e83ee5
 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5446,13 +5446,55 @@ Like @samp{fold_left_plus_@var{m}}, but takes an 
additional mask operand
 
 @cindex @code{sdot_prod@var{m}} instruction pattern
 @item @samp{sdot_prod@var{m}}
+
+Compute the sum of the products of two signed elements.
+Operand 1 and operand 2 are of the same mode. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+sdot ==
+   res = sign-ext (a) * sign-ext (b) + c
+@dots{}
+@end smallexample
+
 @cindex @code{udot_prod@var{m}} instruction pattern
-@itemx @samp{udot_prod@var{m}}
-Compute the sum of the products of two signed/unsigned elements.
-Operand 1 and operand 2 are of the same mode. Their product, which is of a
-wider mode, is computed and added to operand 3. Operand 3 is of a mode equal or
-wider than the mode of the product. The result is placed in operand 0, which
-is of the same mode as operand 3.
+@item @samp{udot_prod@var{m}}
+
+Compute the sum of the products of two unsigned elements.
+Operand 1 and operand 2 are of the same mode. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+udot ==
+   res = zero-ext (a) * zero-ext (b) + c
+@dots{}
+@end smallexample
+
+
+
+@cindex @code{usdot_prod@var{m}} instruction pattern
+@item @samp{usdot_prod@var{m}}
+Compute the sum of the products of elements of different signs.
+Operand 

Re: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-06-22 Thread Richard Sandiford via Gcc-patches
Richard Sandiford  writes:
>> @@ -992,21 +1029,27 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
>>/* FORNOW.  Can continue analyzing the def-use chain when this stmt in a 
>> phi
>>   inside the loop (in case we are analyzing an outer-loop).  */
>>vect_unpromoted_value unprom0[2];
>> +  enum optab_subtype subtype = optab_vector;
>>if (!vect_widened_op_tree (vinfo, mult_vinfo, MULT_EXPR, WIDEN_MULT_EXPR,
>> - false, 2, unprom0, _type))
>> + false, 2, unprom0, _type, ))
>> +return NULL;
>> +
>> +  if (subtype == optab_vector_mixed_sign
>> +  && TYPE_UNSIGNED (unprom_mult.type)
>> +  && TYPE_PRECISION (half_type) * 4 > TYPE_PRECISION (unprom_mult.type))
>>  return NULL;
>
> Isn't the final condition here instead that TYPE1 is narrower than TYPE2?
> I.e. we need to reject the case in which we multiply a signed and an
> unsigned value to get a (logically) signed result, but then zero-extend
> it (rather than sign-extend it) to the precision of the addition.
>
> That would make the test:
>
>   if (subtype == optab_vector_mixed_sign
>   && TYPE_UNSIGNED (unprom_mult.type)
>   && TYPE_PRECISION (unprom_mult.type) < TYPE_PRECISION (type))
> return NULL;
>   
> instead.

And folding that into the existing test gives:

  /* If there are two widening operations, make sure they agree on the sign
 of the extension.  The result of an optab_vector_mixed_sign operation
 is signed; otherwise, the result has the same sign as the operands.  */
  if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
  && (subtype == optab_vector_mixed_sign
  ? TYPE_UNSIGNED (unprom_mult.type)
  : TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type)))
return NULL;

Thanks,
Richard


Re: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-06-22 Thread Richard Sandiford via Gcc-patches
Sorry for the slow review.

Just concentrating on tree-vect-patterns.c, as before:

Tamar Christina  writes:
> @@ -521,6 +522,9 @@ vect_joust_widened_type (tree type, tree new_type, tree 
> *common_type)
>unsigned int precision = MAX (TYPE_PRECISION (*common_type),
>   TYPE_PRECISION (new_type));
>precision *= 2;
> +
> +  /* The resulting application is unsigned, check if we have enough
> + precision to perform the operation.  */
>if (precision * 2 > TYPE_PRECISION (type))
>  return false;
>  

Not sure what the comment means by “application” here, but the common
type we pick is signed rather than unsigned.

> @@ -546,7 +554,8 @@ static unsigned int
>  vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code 
> code,
> tree_code widened_code, bool shift_p,
> unsigned int max_nops,
> -   vect_unpromoted_value *unprom, tree *common_type)
> +   vect_unpromoted_value *unprom, tree *common_type,
> +   enum optab_subtype *subtype = NULL)
>  {
>/* Check for an integer operation with the right code.  */
>gassign *assign = dyn_cast  (stmt_info->stmt);
> @@ -607,7 +616,8 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info 
> stmt_info, tree_code code,
>   = vinfo->lookup_def (this_unprom->op);
> nops = vect_widened_op_tree (vinfo, def_stmt_info, code,
>  widened_code, shift_p, max_nops,
> -this_unprom, common_type);
> +this_unprom, common_type,
> +subtype);
> if (nops == 0)
>   return 0;
>  
> @@ -625,7 +635,24 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info 
> stmt_info, tree_code code,
>   *common_type = this_unprom->type;
> else if (!vect_joust_widened_type (type, this_unprom->type,
>common_type))
> - return 0;
> + {
> +   if (subtype)
> + {

AIUI, if we get here then:

- there must be one unsigned operand (A) of precision P
- there must be one signed operand (B) with precision <= P
- we can't extend to precision 2*P 

A conversion is needed if B's precision is < P.
That conversion should be to a signed type with precision P.

So…

> +   tree new_type = *common_type;
> +   /* See if we can sign extend the smaller type.  */
> +   if (TYPE_PRECISION (this_unprom->type) > TYPE_PRECISION 
> (new_type)
> +   && (TYPE_UNSIGNED (this_unprom->type) && 
> !TYPE_UNSIGNED (new_type)))

…I think this second line could be an assert and

> + new_type = build_nonstandard_integer_type 
> (TYPE_PRECISION (this_unprom->type), true);

…picking an unsigned type here looks wrong.  The net effect would
be to convert B (the previous signed operand) to an unsigned type.

> +
> +   if (tree_nop_conversion_p (this_unprom->type, new_type))
> + {
> +   *subtype = optab_vector_mixed_sign;
> +   *common_type = new_type;
> + }

IMO the sign of the common type shouldn't matter for optab_vector_mixed_sign:
if we need to convert operands later, it should be to the precision of
the common type but retaining the sign of the original type.
So I think it would be simpler to do:

  if (TYPE_PRECISION (this_unprom->type)
  > TYPE_PRECISION (*common_type)
*common_type = this_unprom->type;
  *subtype = optab_vector_mixed_sign;

here and adjust the conversion code as described below.

This also has the advantage of coping with > 2 operands, in case that
ever becomes important in future.

> @@ -806,12 +833,15 @@ vect_convert_input (vec_info *vinfo, stmt_vec_info 
> stmt_info, tree type,
>  }
>  
>  /* Invoke vect_convert_input for N elements of UNPROM and store the
> -   result in the corresponding elements of RESULT.  */
> +   result in the corresponding elements of RESULT.
> +
> +   If SUBTYPE then don't convert the types if they only
> +   differ by sign.  */
>  
>  static void
>  vect_convert_inputs (vec_info *vinfo, stmt_vec_info stmt_info, unsigned int 
> n,
>tree *result, tree type, vect_unpromoted_value *unprom,
> -  tree vectype)
> +  tree vectype, enum optab_subtype subtype = optab_default)
>  {
>for (unsigned int i = 0; i < n; ++i)
>  {
> @@ -819,8 +849,12 @@ vect_convert_inputs (vec_info *vinfo, stmt_vec_info 
> stmt_info, unsigned int n,
>for (j = 0; j < i; ++j)
>   if (unprom[j].op == unprom[i].op)
> break;
> +
>if (j < i)
>   result[i] = result[j];
> +  else if (subtype == 

RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-06-21 Thread Tamar Christina via Gcc-patches
Ping

> -Original Message-
> From: Gcc-patches  bounces+tamar.christina=arm@gcc.gnu.org> On Behalf Of Tamar
> Christina via Gcc-patches
> Sent: Monday, June 14, 2021 1:06 PM
> To: Richard Sandiford 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Biener
> 
> Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-product
> where the sign for the multiplicant changes.
> 
> Hi Richard,
> 
> I've attached a new version of the patch with the changes.
> I have also added 7 new tests in the testsuite to check the cases you
> mentioned.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * optabs.def (usdot_prod_optab): New.
>   * doc/md.texi: Document it and clarify other dot prod optabs.
>   * optabs-tree.h (enum optab_subtype): Add
> optab_vector_mixed_sign.
>   * optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
>   * optabs.c (expand_widen_pattern_expr): Likewise.
>   * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
>   * tree-vect-loop.c (vectorizable_reduction): Query dot-product kind.
>   * tree-vect-patterns.c (vect_supportable_direct_optab_p): Take
> optional
>   optab subtype.
>   (vect_widened_op_tree): Optionally ignore
>   mismatch types.
>   (vect_recog_dot_prod_pattern): Support usdot_prod_optab.
> 
> --- inline copy of patch ---
> 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index
> 00caf3844ccf8ea289d581839766502d51b9e8d7..1356afb7f903f17c198103562b
> 5cd145ecb9966f 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5446,13 +5446,55 @@ Like @samp{fold_left_plus_@var{m}}, but takes
> an additional mask operand
> 
>  @cindex @code{sdot_prod@var{m}} instruction pattern  @item
> @samp{sdot_prod@var{m}}
> +
> +Compute the sum of the products of two signed elements.
> +Operand 1 and operand 2 are of the same mode. Their product, which is
> +of a wider mode, is computed and added to operand 3.
> +Operand 3 is of a mode equal or wider than the mode of the product. The
> +result is placed in operand 0, which is of the same mode as operand 3.
> +
> +Semantically the expressions perform the multiplication in the
> +following signs
> +
> +@smallexample
> +sdot ==
> +   res = sign-ext (a) * sign-ext (b) + c @dots{} @end smallexample
> +
>  @cindex @code{udot_prod@var{m}} instruction pattern -@itemx
> @samp{udot_prod@var{m}} -Compute the sum of the products of two
> signed/unsigned elements.
> -Operand 1 and operand 2 are of the same mode. Their product, which is of a
> -wider mode, is computed and added to operand 3. Operand 3 is of a mode
> equal or -wider than the mode of the product. The result is placed in operand
> 0, which -is of the same mode as operand 3.
> +@item @samp{udot_prod@var{m}}
> +
> +Compute the sum of the products of two unsigned elements.
> +Operand 1 and operand 2 are of the same mode. Their product, which is
> +of a wider mode, is computed and added to operand 3.
> +Operand 3 is of a mode equal or wider than the mode of the product. The
> +result is placed in operand 0, which is of the same mode as operand 3.
> +
> +Semantically the expressions perform the multiplication in the
> +following signs
> +
> +@smallexample
> +udot ==
> +   res = zero-ext (a) * zero-ext (b) + c @dots{} @end smallexample
> +
> +
> +
> +@cindex @code{usdot_prod@var{m}} instruction pattern
> +@item @samp{usdot_prod@var{m}}
> +Compute the sum of the products of elements of different signs.
> +Operand 1 must be unsigned and operand 2 signed. Their
> +product, which is of a wider mode, is computed and added to operand 3.
> +Operand 3 is of a mode equal or wider than the mode of the product. The
> +result is placed in operand 0, which is of the same mode as operand 3.
> +
> +Semantically the expressions perform the multiplication in the following
> signs
> +
> +@smallexample
> +usdot ==
> +   res = ((unsigned-conv) sign-ext (a)) * zero-ext (b) + c
> +@dots{}
> +@end smallexample
> 
>  @cindex @code{ssad@var{m}} instruction pattern
>  @item @samp{ssad@var{m}}
> diff --git a/gcc/optabs-tree.h b/gcc/optabs-tree.h
> index
> c3aaa1a416991e856d3e24da45968a92ebada82c..fbd2b06b8dbfd560dfb66b31
> 4830e6b564b37abb 100644
> --- a/gcc/optabs-tree.h
> +++ b/gcc/optabs-tree.h
> @@ -29,7 +29,8 @@ enum optab_subtype
>  {
>optab_default,
>optab_scalar,
> -  optab_vector
> +  optab_vector,
> +  optab_vector_mixed_sign
>  };
> 
>  /* Return the optab used for computing the given operation on the type
> given by
> diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
> index
> 95ffe397c23e80c105afea52e9d47216bf52f55a..eeb5aeed3202cc6971b6447994
> bc5311e9c010bb 100644
> --- a/gcc/optabs-tree.c
> +++ b/gcc/optabs-tree.c
> @@ -127,7 +127,12 @@ optab_for_tree_code (enum tree_code code,
> const_tree type,
>return TYPE_UNSIGNED (type) ? usum_widen_optab :
> ssum_widen_optab;
> 
>  case DOT_PROD_EXPR:
> -  return TYPE_UNSIGNED (type) ? 

RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-06-14 Thread Tamar Christina via Gcc-patches
Hi Richard,

I've attached a new version of the patch with the changes.
I have also added 7 new tests in the testsuite to check the cases you mentioned.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* optabs.def (usdot_prod_optab): New.
* doc/md.texi: Document it and clarify other dot prod optabs.
* optabs-tree.h (enum optab_subtype): Add optab_vector_mixed_sign.
* optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
* optabs.c (expand_widen_pattern_expr): Likewise.
* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
* tree-vect-loop.c (vectorizable_reduction): Query dot-product kind.
* tree-vect-patterns.c (vect_supportable_direct_optab_p): Take optional
optab subtype.
(vect_widened_op_tree): Optionally ignore
mismatch types.
(vect_recog_dot_prod_pattern): Support usdot_prod_optab.

--- inline copy of patch ---

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 
00caf3844ccf8ea289d581839766502d51b9e8d7..1356afb7f903f17c198103562b5cd145ecb9966f
 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5446,13 +5446,55 @@ Like @samp{fold_left_plus_@var{m}}, but takes an 
additional mask operand
 
 @cindex @code{sdot_prod@var{m}} instruction pattern
 @item @samp{sdot_prod@var{m}}
+
+Compute the sum of the products of two signed elements.
+Operand 1 and operand 2 are of the same mode. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+sdot ==
+   res = sign-ext (a) * sign-ext (b) + c
+@dots{}
+@end smallexample
+
 @cindex @code{udot_prod@var{m}} instruction pattern
-@itemx @samp{udot_prod@var{m}}
-Compute the sum of the products of two signed/unsigned elements.
-Operand 1 and operand 2 are of the same mode. Their product, which is of a
-wider mode, is computed and added to operand 3. Operand 3 is of a mode equal or
-wider than the mode of the product. The result is placed in operand 0, which
-is of the same mode as operand 3.
+@item @samp{udot_prod@var{m}}
+
+Compute the sum of the products of two unsigned elements.
+Operand 1 and operand 2 are of the same mode. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+udot ==
+   res = zero-ext (a) * zero-ext (b) + c
+@dots{}
+@end smallexample
+
+
+
+@cindex @code{usdot_prod@var{m}} instruction pattern
+@item @samp{usdot_prod@var{m}}
+Compute the sum of the products of elements of different signs.
+Operand 1 must be unsigned and operand 2 signed. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+usdot ==
+   res = ((unsigned-conv) sign-ext (a)) * zero-ext (b) + c
+@dots{}
+@end smallexample
 
 @cindex @code{ssad@var{m}} instruction pattern
 @item @samp{ssad@var{m}}
diff --git a/gcc/optabs-tree.h b/gcc/optabs-tree.h
index 
c3aaa1a416991e856d3e24da45968a92ebada82c..fbd2b06b8dbfd560dfb66b314830e6b564b37abb
 100644
--- a/gcc/optabs-tree.h
+++ b/gcc/optabs-tree.h
@@ -29,7 +29,8 @@ enum optab_subtype
 {
   optab_default,
   optab_scalar,
-  optab_vector
+  optab_vector,
+  optab_vector_mixed_sign
 };
 
 /* Return the optab used for computing the given operation on the type given by
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index 
95ffe397c23e80c105afea52e9d47216bf52f55a..eeb5aeed3202cc6971b6447994bc5311e9c010bb
 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -127,7 +127,12 @@ optab_for_tree_code (enum tree_code code, const_tree type,
   return TYPE_UNSIGNED (type) ? usum_widen_optab : ssum_widen_optab;
 
 case DOT_PROD_EXPR:
-  return TYPE_UNSIGNED (type) ? udot_prod_optab : sdot_prod_optab;
+  {
+   if (subtype == optab_vector_mixed_sign)
+ return usdot_prod_optab;
+
+   return (TYPE_UNSIGNED (type) ? udot_prod_optab : sdot_prod_optab);
+  }
 
 case SAD_EXPR:
   return TYPE_UNSIGNED (type) ? usad_optab : ssad_optab;
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 
62a6bdb4c59bf8263c499245795576199606d372..14d8ad2f33fd75388435fe912380e177f8f3c54b
 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -262,6 +262,11 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, 
rtx wide_op,
   bool sbool = false;
 
   oprnd0 = ops->op0;
+  

Re: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-06-07 Thread Richard Sandiford via Gcc-patches
Sorry for the slow response.

Tamar Christina  writes:
> […]
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index 
> 441d6cd28c4eaded7abd756164890dbcffd2f3b8..82123b96313e6783ea214b9259805d65c07d8858
>  100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -201,7 +201,8 @@ vect_get_external_def_edge (vec_info *vinfo, tree var)
>  static bool
>  vect_supportable_direct_optab_p (vec_info *vinfo, tree otype, tree_code code,
>  tree itype, tree *vecotype_out,
> -tree *vecitype_out = NULL)
> +tree *vecitype_out = NULL,
> +enum optab_subtype subtype = optab_default)
>  {
>tree vecitype = get_vectype_for_scalar_type (vinfo, itype);
>if (!vecitype)
> @@ -211,7 +212,7 @@ vect_supportable_direct_optab_p (vec_info *vinfo, tree 
> otype, tree_code code,
>if (!vecotype)
>  return false;
>
> -  optab optab = optab_for_tree_code (code, vecitype, optab_default);
> +  optab optab = optab_for_tree_code (code, vecitype, subtype);
>if (!optab)
>  return false;
>
> @@ -487,10 +488,14 @@ vect_joust_widened_integer (tree type, bool shift_p, 
> tree op,
>  }
>
>  /* Return true if the common supertype of NEW_TYPE and *COMMON_TYPE
> -   is narrower than type, storing the supertype in *COMMON_TYPE if so.  */
> +   is narrower than type, storing the supertype in *COMMON_TYPE if so.
> +   If UNPROM_TYPE then accept that *COMMON_TYPE and NEW_TYPE may be of
> +   different signs but equal precision and that the resulting
> +   multiplication of them be compatible with UNPROM_TYPE.   */
>
>  static bool
> -vect_joust_widened_type (tree type, tree new_type, tree *common_type)
> +vect_joust_widened_type (tree type, tree new_type, tree *common_type,
> +tree unprom_type = NULL)
>  {
>if (types_compatible_p (*common_type, new_type))
>  return true;
> @@ -514,7 +519,18 @@ vect_joust_widened_type (tree type, tree new_type, tree 
> *common_type)
>unsigned int precision = MAX (TYPE_PRECISION (*common_type),
> TYPE_PRECISION (new_type));
>precision *= 2;
> -  if (precision * 2 > TYPE_PRECISION (type))
> +
> +  /* Check if the mismatch is only in the sign and if we have
> + UNPROM_TYPE then allow it if there is enough precision to
> + not lose any information during the conversion.  */
> +  if (unprom_type
> +  && TYPE_SIGN (unprom_type) == SIGNED
> +  && tree_nop_conversion_p (*common_type, new_type))
> +   return true;
> +
> +  /* The resulting application is unsigned, check if we have enough
> + precision to perform the operation.  */
> +  if (precision * 2 > TYPE_PRECISION (unprom_type ? unprom_type : type))
>  return false;
>
>*common_type = build_nonstandard_integer_type (precision, false);
> @@ -532,6 +548,10 @@ vect_joust_widened_type (tree type, tree new_type, tree 
> *common_type)
> to a type that (a) is narrower than the result of STMT_INFO and
> (b) can hold all leaf operand values.
>
> +   If UNPROM_TYPE then allow that the signs of the operands
> +   may differ in signs but not in precision and that the resulting type
> +   of the operation on the operands is compatible with UNPROM_TYPE.
> +
> Return 0 if STMT_INFO isn't such a tree, or if no such COMMON_TYPE
> exists.  */
>
> @@ -539,7 +559,8 @@ static unsigned int
>  vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code 
> code,
>   tree_code widened_code, bool shift_p,
>   unsigned int max_nops,
> - vect_unpromoted_value *unprom, tree *common_type)
> + vect_unpromoted_value *unprom, tree *common_type,
> + tree unprom_type = NULL)
>  {
>/* Check for an integer operation with the right code.  */
>gassign *assign = dyn_cast  (stmt_info->stmt);
> @@ -600,7 +621,8 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info 
> stmt_info, tree_code code,
> = vinfo->lookup_def (this_unprom->op);
>   nops = vect_widened_op_tree (vinfo, def_stmt_info, code,
>widened_code, shift_p, max_nops,
> -  this_unprom, common_type);
> +  this_unprom, common_type,
> +  unprom_type);
>   if (nops == 0)
> return 0;
>
> @@ -617,7 +639,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info 
> stmt_info, tree_code code,
>   if (i == 0)
> *common_type = this_unprom->type;
>   else if (!vect_joust_widened_type (type, this_unprom->type,
> -common_type))
> +common_type, unprom_type))
> return 0;
> 

RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-06-04 Thread Tamar Christina via Gcc-patches
Hi Richi,

Attached is re-spun patch.  tree_nop_conversion_p was very handy in cleaning up 
the patch, Thanks!

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master if Richard S has no comments?

Thanks,
Tamar

gcc/ChangeLog:

* optabs.def (usdot_prod_optab): New.
* doc/md.texi: Document it and clarify other dot prod optabs.
* optabs-tree.h (enum optab_subtype): Add optab_vector_mixed_sign.
* optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
* optabs.c (expand_widen_pattern_expr): Likewise.
* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
* tree-vect-loop.c (vectorizable_reduction): Query dot-product kind.
* tree-vect-patterns.c (vect_supportable_direct_optab_p): Take optional
optab subtype.
(vect_joust_widened_type, vect_widened_op_tree): Optionally ignore
mismatch types.
(vect_recog_dot_prod_pattern): Support usdot_prod_optab.


--- inline copy of patch ---

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 
d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..9fad3322b3f1eb2a836833bb390df78f0cd9734b
 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5438,13 +5438,55 @@ Like @samp{fold_left_plus_@var{m}}, but takes an 
additional mask operand
 
 @cindex @code{sdot_prod@var{m}} instruction pattern
 @item @samp{sdot_prod@var{m}}
+
+Compute the sum of the products of two signed elements.
+Operand 1 and operand 2 are of the same mode. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+sdot ==
+   res = sign-ext (a) * sign-ext (b) + c
+@dots{}
+@end smallexample
+
 @cindex @code{udot_prod@var{m}} instruction pattern
-@itemx @samp{udot_prod@var{m}}
-Compute the sum of the products of two signed/unsigned elements.
-Operand 1 and operand 2 are of the same mode. Their product, which is of a
-wider mode, is computed and added to operand 3. Operand 3 is of a mode equal or
-wider than the mode of the product. The result is placed in operand 0, which
-is of the same mode as operand 3.
+@item @samp{udot_prod@var{m}}
+
+Compute the sum of the products of two unsigned elements.
+Operand 1 and operand 2 are of the same mode. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+udot ==
+   res = zero-ext (a) * zero-ext (b) + c
+@dots{}
+@end smallexample
+
+
+
+@cindex @code{usdot_prod@var{m}} instruction pattern
+@item @samp{usdot_prod@var{m}}
+Compute the sum of the products of elements of different signs.
+Operand 1 must be unsigned and operand 2 signed. Their
+product, which is of a wider mode, is computed and added to operand 3.
+Operand 3 is of a mode equal or wider than the mode of the product. The
+result is placed in operand 0, which is of the same mode as operand 3.
+
+Semantically the expressions perform the multiplication in the following signs
+
+@smallexample
+usdot ==
+   res = ((unsigned-conv) sign-ext (a)) * zero-ext (b) + c
+@dots{}
+@end smallexample
 
 @cindex @code{ssad@var{m}} instruction pattern
 @item @samp{ssad@var{m}}
diff --git a/gcc/optabs-tree.h b/gcc/optabs-tree.h
index 
c3aaa1a416991e856d3e24da45968a92ebada82c..fbd2b06b8dbfd560dfb66b314830e6b564b37abb
 100644
--- a/gcc/optabs-tree.h
+++ b/gcc/optabs-tree.h
@@ -29,7 +29,8 @@ enum optab_subtype
 {
   optab_default,
   optab_scalar,
-  optab_vector
+  optab_vector,
+  optab_vector_mixed_sign
 };
 
 /* Return the optab used for computing the given operation on the type given by
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index 
95ffe397c23e80c105afea52e9d47216bf52f55a..eeb5aeed3202cc6971b6447994bc5311e9c010bb
 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -127,7 +127,12 @@ optab_for_tree_code (enum tree_code code, const_tree type,
   return TYPE_UNSIGNED (type) ? usum_widen_optab : ssum_widen_optab;
 
 case DOT_PROD_EXPR:
-  return TYPE_UNSIGNED (type) ? udot_prod_optab : sdot_prod_optab;
+  {
+   if (subtype == optab_vector_mixed_sign)
+ return usdot_prod_optab;
+
+   return (TYPE_UNSIGNED (type) ? udot_prod_optab : sdot_prod_optab);
+  }
 
 case SAD_EXPR:
   return TYPE_UNSIGNED (type) ? usad_optab : ssad_optab;
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 
f4614a394587787293dc8b680a38901f7906f61c..d9b64441d0e0726afee89dc9c937350451e7670d
 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -262,6 +262,11 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, 
rtx wide_op,
   bool sbool = false;
 
   oprnd0 = 

RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-06-02 Thread Tamar Christina via Gcc-patches
Ping,

Did you have any comments Richard S?

Otherwise I'll proceed with respining according to Richi's comments.

Regards,
Tamar

> -Original Message-
> From: Richard Biener 
> Sent: Wednesday, May 26, 2021 9:57 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Sandiford
> 
> Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-product
> where the sign for the multiplicant changes.
> 
> On Tue, 25 May 2021, Tamar Christina wrote:
> 
> > Hi Richi,
> >
> > Here's a respun version of the patch.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> 
> index
> 7e3aae5f9c28a49feedc7cc66e8ac0d476b9f28a..13e405edd765dde704c64348d
> 2d0b3cd88f0af7c
> 100644
> --- a/gcc/tree-cfg.c
> +++ b/gcc/tree-cfg.c
> @@ -4421,7 +4421,9 @@ verify_gimple_assign_ternary (gassign *stmt)
>   && !SCALAR_FLOAT_TYPE_P (rhs1_type))
>  || (!INTEGRAL_TYPE_P (lhs_type)
>  && !SCALAR_FLOAT_TYPE_P (lhs_type
> -   || !types_compatible_p (rhs1_type, rhs2_type)
> +   || (!types_compatible_p (rhs1_type, rhs2_type)
> +   && TYPE_SIGN (rhs1_type) == TYPE_SIGN (rhs2_type)
> +   && TYPE_PRECISION (rhs1_type) != TYPE_PRECISION
> (rhs2_type))
> 
> I think this doesn't capture the constraints - instead please do
> 
> -   || !types_compatible_p (rhs1_type, rhs2_type)
> +   /* rhs1_type and rhs2_type may differ in sign.  */
> +   || !tree_nop_conversion_p (rhs1_type, rhs2_type)
> 
> 
> +/* Determine the optab_subtype to use for the given CODE and STMT.  For
> +   most CODE this will be optab_vector, however for certain operations
> such as
> +   DOT_PROD_EXPR where the operation can different signs for the
> operands
> we
> +   need to be able to pick the right optabs.  */
> +
> +static enum optab_subtype
> +vect_determine_dot_kind (tree_code code, stmt_vec_info stmt_vinfo)
> 
> vect_determine_optab_subkind would be a better name.  'code' is
> redundant (or should better match stmt_vinfo->stmts code).  I wonder
> if it might be clearer to compute the subtype where we compute 'code'
> and the relation to stmt_info is obvious, I mean here:
> 
>   /* 3. Check the operands of the operation.  The first operands are
> defined
> inside the loop body. The last operand is the reduction variable,
> which is defined by the loop-header-phi.  */
> 
>   tree vectype_out = STMT_VINFO_VECTYPE (stmt_info);
>   STMT_VINFO_REDUC_VECTYPE (reduc_info) = vectype_out;
>   gassign *stmt = as_a  (stmt_info->stmt);
>   enum tree_code code = gimple_assign_rhs_code (stmt);
>   bool lane_reduc_code_p
> = (code == DOT_PROD_EXPR || code == WIDEN_SUM_EXPR || code ==
> SAD_EXPR);
> 
> so just add
> 
>   enum optab_subtype optab_query_kind = optab_vector;
>   if (code == DOT_PROD_EXPR
>   && )
> optab_query_kind = optab_vector_mixed_sign;
> 
> in this place and avoid adding the new function?
> 
> I'm not too familiar with the pattern recog code, a 2nd eye would be
> prefered (Richard?), but
> 
> +  /* Check if the mismatch is only in the sign and if we have
> + allow_short_sign_mismatch then allow it.  */
> +  if (unprom_type
> +  && TYPE_SIGN (unprom_type) == SIGNED
> +  && TYPE_SIGN (*common_type) != TYPE_SIGN (new_type))
> +{
> +  bool sign = TYPE_SIGN (*common_type) == UNSIGNED;
> +  tree eq_type
> +   = build_nonstandard_integer_type (TYPE_PRECISION (new_type),
> + sign);
> +
> +  if (types_compatible_p (*common_type, eq_type))
> +   return true;
> +}
> 
> looks somewhat complicated - is that equal to
> 
>   if (unprom_type
>   && tree_nop_conversion_p (*common_type, new_type))
> return true;
> 
> ?  That is, *common_type and new_type only differ in sign?
> 
> @@ -812,8 +844,13 @@ vect_convert_inputs (vec_info *vinfo,
> stmt_vec_info
> stmt_info, unsigned int n,
>for (j = 0; j < i; ++j)
> if (unprom[j].op == unprom[i].op)
>   break;
> +  bool only_sign = allow_short_sign_mismatch
> +  && TYPE_SIGN (type) != TYPE_SIGN (unprom[i].type)
> +  && TYPE_PRECISION (type) == TYPE_PRECISION
> (unprom[i].type);
> 
> this could use the same tree_nop_conversion_p predicate.
> 
> Otherwise the patch looks good.
> 
> Thanks,
> Richard.
> 
> 
> 
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * optabs.def (usdot_prod_optab): New.
> > * doc/md.texi: Document it and clarify other dot prod optabs.
> > * optabs-tree.h (enum optab_subtype): Add
> optab_vector_mixed_sign.
> > * optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
> > * optabs.c (expand_widen_pattern_expr): Likewise.
> > * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
> > * tree-vect-loop.c (vect_determine_dot_kind): New.
> > (vectorizable_reduction): Query dot-product kind.
> > * tree-vect-patterns.c 

RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-05-26 Thread Richard Biener
On Tue, 25 May 2021, Tamar Christina wrote:

> Hi Richi,
> 
> Here's a respun version of the patch.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

index 
7e3aae5f9c28a49feedc7cc66e8ac0d476b9f28a..13e405edd765dde704c64348d2d0b3cd88f0af7c
 
100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -4421,7 +4421,9 @@ verify_gimple_assign_ternary (gassign *stmt)
  && !SCALAR_FLOAT_TYPE_P (rhs1_type))
 || (!INTEGRAL_TYPE_P (lhs_type)
 && !SCALAR_FLOAT_TYPE_P (lhs_type
-   || !types_compatible_p (rhs1_type, rhs2_type)
+   || (!types_compatible_p (rhs1_type, rhs2_type)
+   && TYPE_SIGN (rhs1_type) == TYPE_SIGN (rhs2_type)
+   && TYPE_PRECISION (rhs1_type) != TYPE_PRECISION 
(rhs2_type))

I think this doesn't capture the constraints - instead please do

-   || !types_compatible_p (rhs1_type, rhs2_type)
+   /* rhs1_type and rhs2_type may differ in sign.  */
+   || !tree_nop_conversion_p (rhs1_type, rhs2_type)


+/* Determine the optab_subtype to use for the given CODE and STMT.  For
+   most CODE this will be optab_vector, however for certain operations 
such as
+   DOT_PROD_EXPR where the operation can different signs for the operands 
we
+   need to be able to pick the right optabs.  */
+
+static enum optab_subtype
+vect_determine_dot_kind (tree_code code, stmt_vec_info stmt_vinfo)

vect_determine_optab_subkind would be a better name.  'code' is
redundant (or should better match stmt_vinfo->stmts code).  I wonder
if it might be clearer to compute the subtype where we compute 'code'
and the relation to stmt_info is obvious, I mean here:

  /* 3. Check the operands of the operation.  The first operands are 
defined
inside the loop body. The last operand is the reduction variable,
which is defined by the loop-header-phi.  */

  tree vectype_out = STMT_VINFO_VECTYPE (stmt_info);
  STMT_VINFO_REDUC_VECTYPE (reduc_info) = vectype_out;
  gassign *stmt = as_a  (stmt_info->stmt);
  enum tree_code code = gimple_assign_rhs_code (stmt);
  bool lane_reduc_code_p
= (code == DOT_PROD_EXPR || code == WIDEN_SUM_EXPR || code == 
SAD_EXPR);

so just add

  enum optab_subtype optab_query_kind = optab_vector;
  if (code == DOT_PROD_EXPR
  && )
optab_query_kind = optab_vector_mixed_sign;

in this place and avoid adding the new function?

I'm not too familiar with the pattern recog code, a 2nd eye would be
prefered (Richard?), but

+  /* Check if the mismatch is only in the sign and if we have
+ allow_short_sign_mismatch then allow it.  */
+  if (unprom_type
+  && TYPE_SIGN (unprom_type) == SIGNED
+  && TYPE_SIGN (*common_type) != TYPE_SIGN (new_type))
+{
+  bool sign = TYPE_SIGN (*common_type) == UNSIGNED;
+  tree eq_type
+   = build_nonstandard_integer_type (TYPE_PRECISION (new_type),
+ sign);
+
+  if (types_compatible_p (*common_type, eq_type))
+   return true;
+}

looks somewhat complicated - is that equal to

  if (unprom_type
  && tree_nop_conversion_p (*common_type, new_type))
return true;

?  That is, *common_type and new_type only differ in sign?

@@ -812,8 +844,13 @@ vect_convert_inputs (vec_info *vinfo, stmt_vec_info 
stmt_info, unsigned int n,
   for (j = 0; j < i; ++j)
if (unprom[j].op == unprom[i].op)
  break;
+  bool only_sign = allow_short_sign_mismatch
+  && TYPE_SIGN (type) != TYPE_SIGN (unprom[i].type)
+  && TYPE_PRECISION (type) == TYPE_PRECISION 
(unprom[i].type);

this could use the same tree_nop_conversion_p predicate.

Otherwise the patch looks good.

Thanks,
Richard.



> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * optabs.def (usdot_prod_optab): New.
>   * doc/md.texi: Document it and clarify other dot prod optabs.
>   * optabs-tree.h (enum optab_subtype): Add optab_vector_mixed_sign.
>   * optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
>   * optabs.c (expand_widen_pattern_expr): Likewise.
>   * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
>   * tree-vect-loop.c (vect_determine_dot_kind): New.
>   (vectorizable_reduction): Query dot-product kind.
>   * tree-vect-patterns.c (vect_supportable_direct_optab_p): Take optional
>   optab subtype.
>   (vect_joust_widened_type, vect_widened_op_tree): Optionally ignore
>   mismatch types.
>   (vect_recog_dot_prod_pattern): Support usdot_prod_optab.
> 
> 
> > -Original Message-
> > From: Richard Biener 
> > Sent: Monday, May 10, 2021 2:29 PM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd 
> > Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-product
> > where the sign for the multiplicant changes.
> > 
> > On Mon, 10 May 2021, Tamar Christina wrote:
> > 
> > >
> > >
> > > > -Original Message-
> > > > From: Richard Biener 
> 

RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-05-25 Thread Tamar Christina via Gcc-patches
Hi Richi,

Here's a respun version of the patch.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* optabs.def (usdot_prod_optab): New.
* doc/md.texi: Document it and clarify other dot prod optabs.
* optabs-tree.h (enum optab_subtype): Add optab_vector_mixed_sign.
* optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
* optabs.c (expand_widen_pattern_expr): Likewise.
* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
* tree-vect-loop.c (vect_determine_dot_kind): New.
(vectorizable_reduction): Query dot-product kind.
* tree-vect-patterns.c (vect_supportable_direct_optab_p): Take optional
optab subtype.
(vect_joust_widened_type, vect_widened_op_tree): Optionally ignore
mismatch types.
(vect_recog_dot_prod_pattern): Support usdot_prod_optab.


> -Original Message-
> From: Richard Biener 
> Sent: Monday, May 10, 2021 2:29 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd 
> Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-product
> where the sign for the multiplicant changes.
> 
> On Mon, 10 May 2021, Tamar Christina wrote:
> 
> >
> >
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Monday, May 10, 2021 12:40 PM
> > > To: Tamar Christina 
> > > Cc: gcc-patches@gcc.gnu.org; nd 
> > > Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-product
> > > where the sign for the multiplicant changes.
> > >
> > > On Fri, 7 May 2021, Tamar Christina wrote:
> > >
> > > > Hi Richi,
> > > >
> > > > > -Original Message-
> > > > > From: Richard Biener 
> > > > > Sent: Friday, May 7, 2021 12:46 PM
> > > > > To: Tamar Christina 
> > > > > Cc: gcc-patches@gcc.gnu.org; nd 
> > > > > Subject: Re: [PATCH 1/4]middle-end Vect: Add support for
> > > > > dot-product where the sign for the multiplicant changes.
> > > > >
> > > > > On Wed, 5 May 2021, Tamar Christina wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > This patch adds support for a dot product where the sign of
> > > > > > the multiplication arguments differ. i.e. one is signed and
> > > > > > one is unsigned but the precisions are the same.
> > > > > >
> > > > > > #define N 480
> > > > > > #define SIGNEDNESS_1 unsigned
> > > > > > #define SIGNEDNESS_2 signed
> > > > > > #define SIGNEDNESS_3 signed
> > > > > > #define SIGNEDNESS_4 unsigned
> > > > > >
> > > > > > SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int
> > > > > > res,
> > > > > > SIGNEDNESS_3 char *restrict a,
> > > > > >SIGNEDNESS_4 char *restrict b) {
> > > > > >   for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > > > > > {
> > > > > >   int av = a[i];
> > > > > >   int bv = b[i];
> > > > > >   SIGNEDNESS_2 short mult = av * bv;
> > > > > >   res += mult;
> > > > > > }
> > > > > >   return res;
> > > > > > }
> > > > > >
> > > > > > The operations are performed as if the operands were extended
> > > > > > to a 32-bit
> > > > > value.
> > > > > > As such this operation isn't valid if there is an intermediate
> > > > > > conversion to an unsigned value. i.e.  if SIGNEDNESS_2 is unsigned.
> > > > > >
> > > > > > more over if the signs of SIGNEDNESS_3 and SIGNEDNESS_4 are
> > > > > > flipped the same optab is used but the operands are flipped in
> > > > > > the optab
> > > > > expansion.
> > > > > >
> > > > > > To support this the patch extends the dot-product detection to
> > > > > > optionally ignore operands with different signs and stores
> > > > > > this information in the optab subtype which is now made a bitfield.
> > > > > >
> > > > > > The subtype can now additionally controls which optab an EXPR
> > > > > > can expand
> > > > > to.
> > > > > >
> > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > > > >
> > > > > > Ok for master?
> > > > > >
> > > > > > Thanks,
> > > > > > Tamar
> > > > > >
> > > > > > gcc/ChangeLog:
> > > > > >
> > > > > > * optabs.def (usdot_prod_optab): New.
> > > > > > * doc/md.texi: Document it.
> > > > > > * optabs-tree.c (optab_for_tree_code): Support
> usdot_prod_optab.
> > > > > > * optabs-tree.h (enum optab_subtype): Likewise.
> > > > > > * optabs.c (expand_widen_pattern_expr): Likewise.
> > > > > > * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
> > > > > > * tree-vect-loop.c (vect_determine_dot_kind): New.
> > > > > > (vectorizable_reduction): Query dot-product kind.
> > > > > > * tree-vect-patterns.c (vect_supportable_direct_optab_p):
> > > > > > Take
> > > > > optional
> > > > > > optab subtype.
> > > > > > (vect_joust_widened_type, vect_widened_op_tree):
> Optionally
> > > > > ignore
> > > > > > mismatch types.
> > > > > > (vect_recog_dot_prod_pattern): Support usdot_prod_optab.
> > > > > >
> > > > > > --- inline copy of patch --
> > > > > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index
> > 

RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-05-10 Thread Richard Biener
On Mon, 10 May 2021, Tamar Christina wrote:

> 
> 
> > -Original Message-
> > From: Richard Biener 
> > Sent: Monday, May 10, 2021 12:40 PM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd 
> > Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-product
> > where the sign for the multiplicant changes.
> > 
> > On Fri, 7 May 2021, Tamar Christina wrote:
> > 
> > > Hi Richi,
> > >
> > > > -Original Message-
> > > > From: Richard Biener 
> > > > Sent: Friday, May 7, 2021 12:46 PM
> > > > To: Tamar Christina 
> > > > Cc: gcc-patches@gcc.gnu.org; nd 
> > > > Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
> > > > where the sign for the multiplicant changes.
> > > >
> > > > On Wed, 5 May 2021, Tamar Christina wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > This patch adds support for a dot product where the sign of the
> > > > > multiplication arguments differ. i.e. one is signed and one is
> > > > > unsigned but the precisions are the same.
> > > > >
> > > > > #define N 480
> > > > > #define SIGNEDNESS_1 unsigned
> > > > > #define SIGNEDNESS_2 signed
> > > > > #define SIGNEDNESS_3 signed
> > > > > #define SIGNEDNESS_4 unsigned
> > > > >
> > > > > SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int res,
> > > > > SIGNEDNESS_3 char *restrict a,
> > > > >SIGNEDNESS_4 char *restrict b)
> > > > > {
> > > > >   for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > > > > {
> > > > >   int av = a[i];
> > > > >   int bv = b[i];
> > > > >   SIGNEDNESS_2 short mult = av * bv;
> > > > >   res += mult;
> > > > > }
> > > > >   return res;
> > > > > }
> > > > >
> > > > > The operations are performed as if the operands were extended to a
> > > > > 32-bit
> > > > value.
> > > > > As such this operation isn't valid if there is an intermediate
> > > > > conversion to an unsigned value. i.e.  if SIGNEDNESS_2 is unsigned.
> > > > >
> > > > > more over if the signs of SIGNEDNESS_3 and SIGNEDNESS_4 are
> > > > > flipped the same optab is used but the operands are flipped in the
> > > > > optab
> > > > expansion.
> > > > >
> > > > > To support this the patch extends the dot-product detection to
> > > > > optionally ignore operands with different signs and stores this
> > > > > information in the optab subtype which is now made a bitfield.
> > > > >
> > > > > The subtype can now additionally controls which optab an EXPR can
> > > > > expand
> > > > to.
> > > > >
> > > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > > >
> > > > > Ok for master?
> > > > >
> > > > > Thanks,
> > > > > Tamar
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >   * optabs.def (usdot_prod_optab): New.
> > > > >   * doc/md.texi: Document it.
> > > > >   * optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
> > > > >   * optabs-tree.h (enum optab_subtype): Likewise.
> > > > >   * optabs.c (expand_widen_pattern_expr): Likewise.
> > > > >   * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
> > > > >   * tree-vect-loop.c (vect_determine_dot_kind): New.
> > > > >   (vectorizable_reduction): Query dot-product kind.
> > > > >   * tree-vect-patterns.c (vect_supportable_direct_optab_p): Take
> > > > optional
> > > > >   optab subtype.
> > > > >   (vect_joust_widened_type, vect_widened_op_tree): Optionally
> > > > ignore
> > > > >   mismatch types.
> > > > >   (vect_recog_dot_prod_pattern): Support usdot_prod_optab.
> > > > >
> > > > > --- inline copy of patch --
> > > > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index
> > > > >
> > > >
> > d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..baf20416e63745097825fc30fd
> > > > f2
> > > > > e66bc80d7d23 100644
> > > > > --- a/gcc/doc/md.texi
> > > > > +++ b/gcc/doc/md.texi
> > > > > @@ -5440,11 +5440,13 @@ Like @samp{fold_left_plus_@var{m}}, but
> > > > takes
> > > > > an additional mask operand  @item @samp{sdot_prod@var{m}}
> > @cindex
> > > > > @code{udot_prod@var{m}} instruction pattern  @itemx
> > > > > @samp{udot_prod@var{m}}
> > > > > +@cindex @code{usdot_prod@var{m}} instruction pattern @itemx
> > > > > +@samp{usdot_prod@var{m}}
> > > > >  Compute the sum of the products of two signed/unsigned elements.
> > > > > -Operand 1 and operand 2 are of the same mode. Their product,
> > > > > which is of a -wider mode, is computed and added to operand 3.
> > > > > Operand 3 is of a mode equal or -wider than the mode of the
> > > > > product. The result is placed in operand 0, which -is of the same mode
> > as operand 3.
> > > > > +Operand 1 and operand 2 are of the same mode but may differ in
> > signs.
> > > > > +Their product, which is of a wider mode, is computed and added to
> > > > operand 3.
> > > > > +Operand 3 is of a mode equal or wider than the mode of the product.
> > > > > +The result is placed in operand 0, which is of the same mode as
> > operand 3.
> > > >
> > > > This doesn't really say what the 's', 'u' and 'us' specify.  

RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-05-10 Thread Tamar Christina via Gcc-patches


> -Original Message-
> From: Richard Biener 
> Sent: Monday, May 10, 2021 12:40 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd 
> Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-product
> where the sign for the multiplicant changes.
> 
> On Fri, 7 May 2021, Tamar Christina wrote:
> 
> > Hi Richi,
> >
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Friday, May 7, 2021 12:46 PM
> > > To: Tamar Christina 
> > > Cc: gcc-patches@gcc.gnu.org; nd 
> > > Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
> > > where the sign for the multiplicant changes.
> > >
> > > On Wed, 5 May 2021, Tamar Christina wrote:
> > >
> > > > Hi All,
> > > >
> > > > This patch adds support for a dot product where the sign of the
> > > > multiplication arguments differ. i.e. one is signed and one is
> > > > unsigned but the precisions are the same.
> > > >
> > > > #define N 480
> > > > #define SIGNEDNESS_1 unsigned
> > > > #define SIGNEDNESS_2 signed
> > > > #define SIGNEDNESS_3 signed
> > > > #define SIGNEDNESS_4 unsigned
> > > >
> > > > SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int res,
> > > > SIGNEDNESS_3 char *restrict a,
> > > >SIGNEDNESS_4 char *restrict b)
> > > > {
> > > >   for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > > > {
> > > >   int av = a[i];
> > > >   int bv = b[i];
> > > >   SIGNEDNESS_2 short mult = av * bv;
> > > >   res += mult;
> > > > }
> > > >   return res;
> > > > }
> > > >
> > > > The operations are performed as if the operands were extended to a
> > > > 32-bit
> > > value.
> > > > As such this operation isn't valid if there is an intermediate
> > > > conversion to an unsigned value. i.e.  if SIGNEDNESS_2 is unsigned.
> > > >
> > > > more over if the signs of SIGNEDNESS_3 and SIGNEDNESS_4 are
> > > > flipped the same optab is used but the operands are flipped in the
> > > > optab
> > > expansion.
> > > >
> > > > To support this the patch extends the dot-product detection to
> > > > optionally ignore operands with different signs and stores this
> > > > information in the optab subtype which is now made a bitfield.
> > > >
> > > > The subtype can now additionally controls which optab an EXPR can
> > > > expand
> > > to.
> > > >
> > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > >
> > > > Ok for master?
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * optabs.def (usdot_prod_optab): New.
> > > > * doc/md.texi: Document it.
> > > > * optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
> > > > * optabs-tree.h (enum optab_subtype): Likewise.
> > > > * optabs.c (expand_widen_pattern_expr): Likewise.
> > > > * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
> > > > * tree-vect-loop.c (vect_determine_dot_kind): New.
> > > > (vectorizable_reduction): Query dot-product kind.
> > > > * tree-vect-patterns.c (vect_supportable_direct_optab_p): Take
> > > optional
> > > > optab subtype.
> > > > (vect_joust_widened_type, vect_widened_op_tree): Optionally
> > > ignore
> > > > mismatch types.
> > > > (vect_recog_dot_prod_pattern): Support usdot_prod_optab.
> > > >
> > > > --- inline copy of patch --
> > > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index
> > > >
> > >
> d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..baf20416e63745097825fc30fd
> > > f2
> > > > e66bc80d7d23 100644
> > > > --- a/gcc/doc/md.texi
> > > > +++ b/gcc/doc/md.texi
> > > > @@ -5440,11 +5440,13 @@ Like @samp{fold_left_plus_@var{m}}, but
> > > takes
> > > > an additional mask operand  @item @samp{sdot_prod@var{m}}
> @cindex
> > > > @code{udot_prod@var{m}} instruction pattern  @itemx
> > > > @samp{udot_prod@var{m}}
> > > > +@cindex @code{usdot_prod@var{m}} instruction pattern @itemx
> > > > +@samp{usdot_prod@var{m}}
> > > >  Compute the sum of the products of two signed/unsigned elements.
> > > > -Operand 1 and operand 2 are of the same mode. Their product,
> > > > which is of a -wider mode, is computed and added to operand 3.
> > > > Operand 3 is of a mode equal or -wider than the mode of the
> > > > product. The result is placed in operand 0, which -is of the same mode
> as operand 3.
> > > > +Operand 1 and operand 2 are of the same mode but may differ in
> signs.
> > > > +Their product, which is of a wider mode, is computed and added to
> > > operand 3.
> > > > +Operand 3 is of a mode equal or wider than the mode of the product.
> > > > +The result is placed in operand 0, which is of the same mode as
> operand 3.
> > >
> > > This doesn't really say what the 's', 'u' and 'us' specify.  Since
> > > we're doing a widen multiplication and then a non-widening addition
> > > we only need to know the effective sign of the multiplication so I think
> the existing 's' and 'u'
> > > are enough to cover all cases?
> >
> > The existing 's' and 'u' enforce that both 

RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-05-10 Thread Richard Biener
On Fri, 7 May 2021, Tamar Christina wrote:

> Hi Richi,
> 
> > -Original Message-
> > From: Richard Biener 
> > Sent: Friday, May 7, 2021 12:46 PM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd 
> > Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
> > where the sign for the multiplicant changes.
> > 
> > On Wed, 5 May 2021, Tamar Christina wrote:
> > 
> > > Hi All,
> > >
> > > This patch adds support for a dot product where the sign of the
> > > multiplication arguments differ. i.e. one is signed and one is
> > > unsigned but the precisions are the same.
> > >
> > > #define N 480
> > > #define SIGNEDNESS_1 unsigned
> > > #define SIGNEDNESS_2 signed
> > > #define SIGNEDNESS_3 signed
> > > #define SIGNEDNESS_4 unsigned
> > >
> > > SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int res,
> > > SIGNEDNESS_3 char *restrict a,
> > >SIGNEDNESS_4 char *restrict b)
> > > {
> > >   for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > > {
> > >   int av = a[i];
> > >   int bv = b[i];
> > >   SIGNEDNESS_2 short mult = av * bv;
> > >   res += mult;
> > > }
> > >   return res;
> > > }
> > >
> > > The operations are performed as if the operands were extended to a 32-bit
> > value.
> > > As such this operation isn't valid if there is an intermediate
> > > conversion to an unsigned value. i.e.  if SIGNEDNESS_2 is unsigned.
> > >
> > > more over if the signs of SIGNEDNESS_3 and SIGNEDNESS_4 are flipped
> > > the same optab is used but the operands are flipped in the optab
> > expansion.
> > >
> > > To support this the patch extends the dot-product detection to
> > > optionally ignore operands with different signs and stores this
> > > information in the optab subtype which is now made a bitfield.
> > >
> > > The subtype can now additionally controls which optab an EXPR can expand
> > to.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >   * optabs.def (usdot_prod_optab): New.
> > >   * doc/md.texi: Document it.
> > >   * optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
> > >   * optabs-tree.h (enum optab_subtype): Likewise.
> > >   * optabs.c (expand_widen_pattern_expr): Likewise.
> > >   * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
> > >   * tree-vect-loop.c (vect_determine_dot_kind): New.
> > >   (vectorizable_reduction): Query dot-product kind.
> > >   * tree-vect-patterns.c (vect_supportable_direct_optab_p): Take
> > optional
> > >   optab subtype.
> > >   (vect_joust_widened_type, vect_widened_op_tree): Optionally
> > ignore
> > >   mismatch types.
> > >   (vect_recog_dot_prod_pattern): Support usdot_prod_optab.
> > >
> > > --- inline copy of patch --
> > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index
> > >
> > d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..baf20416e63745097825fc30fd
> > f2
> > > e66bc80d7d23 100644
> > > --- a/gcc/doc/md.texi
> > > +++ b/gcc/doc/md.texi
> > > @@ -5440,11 +5440,13 @@ Like @samp{fold_left_plus_@var{m}}, but
> > takes
> > > an additional mask operand  @item @samp{sdot_prod@var{m}}  @cindex
> > > @code{udot_prod@var{m}} instruction pattern  @itemx
> > > @samp{udot_prod@var{m}}
> > > +@cindex @code{usdot_prod@var{m}} instruction pattern @itemx
> > > +@samp{usdot_prod@var{m}}
> > >  Compute the sum of the products of two signed/unsigned elements.
> > > -Operand 1 and operand 2 are of the same mode. Their product, which is
> > > of a -wider mode, is computed and added to operand 3. Operand 3 is of
> > > a mode equal or -wider than the mode of the product. The result is
> > > placed in operand 0, which -is of the same mode as operand 3.
> > > +Operand 1 and operand 2 are of the same mode but may differ in signs.
> > > +Their product, which is of a wider mode, is computed and added to
> > operand 3.
> > > +Operand 3 is of a mode equal or wider than the mode of the product.
> > > +The result is placed in operand 0, which is of the same mode as operand 
> > > 3.
> > 
> > This doesn't really say what the 's', 'u' and 'us' specify.  Since we're 
> > doing a
> > widen multiplication and then a non-widening addition we only need to
> > know the effective sign of the multiplication so I think the existing 's' 
> > and 'u'
> > are enough to cover all cases?
> 
> The existing 's' and 'u' enforce that both operands of the multiplication are 
> of the
> same sign.  So for e.g. 'u' both operand must be unsigned.
> 
> In the `us` case one can be signed and one unsigned. Operationally this does 
> a sign
> extension to the wider type for the signed value, and the unsigned value gets 
> zero extended
> first, and then converts it to unsigned to perform the
> unsigned multiplication, conforming to the C promotion rules.
> 
> TL;DR; Without a new optab I can't tell during expansion which semantic the 
> operation
> had at the gimple/C level as modes don't carry signs.
> 
> Long version:
> 
> 

RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-05-07 Thread Tamar Christina via Gcc-patches
Hi Richi,

> -Original Message-
> From: Richard Biener 
> Sent: Friday, May 7, 2021 12:46 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd 
> Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
> where the sign for the multiplicant changes.
> 
> On Wed, 5 May 2021, Tamar Christina wrote:
> 
> > Hi All,
> >
> > This patch adds support for a dot product where the sign of the
> > multiplication arguments differ. i.e. one is signed and one is
> > unsigned but the precisions are the same.
> >
> > #define N 480
> > #define SIGNEDNESS_1 unsigned
> > #define SIGNEDNESS_2 signed
> > #define SIGNEDNESS_3 signed
> > #define SIGNEDNESS_4 unsigned
> >
> > SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int res,
> > SIGNEDNESS_3 char *restrict a,
> >SIGNEDNESS_4 char *restrict b)
> > {
> >   for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > {
> >   int av = a[i];
> >   int bv = b[i];
> >   SIGNEDNESS_2 short mult = av * bv;
> >   res += mult;
> > }
> >   return res;
> > }
> >
> > The operations are performed as if the operands were extended to a 32-bit
> value.
> > As such this operation isn't valid if there is an intermediate
> > conversion to an unsigned value. i.e.  if SIGNEDNESS_2 is unsigned.
> >
> > more over if the signs of SIGNEDNESS_3 and SIGNEDNESS_4 are flipped
> > the same optab is used but the operands are flipped in the optab
> expansion.
> >
> > To support this the patch extends the dot-product detection to
> > optionally ignore operands with different signs and stores this
> > information in the optab subtype which is now made a bitfield.
> >
> > The subtype can now additionally controls which optab an EXPR can expand
> to.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * optabs.def (usdot_prod_optab): New.
> > * doc/md.texi: Document it.
> > * optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
> > * optabs-tree.h (enum optab_subtype): Likewise.
> > * optabs.c (expand_widen_pattern_expr): Likewise.
> > * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
> > * tree-vect-loop.c (vect_determine_dot_kind): New.
> > (vectorizable_reduction): Query dot-product kind.
> > * tree-vect-patterns.c (vect_supportable_direct_optab_p): Take
> optional
> > optab subtype.
> > (vect_joust_widened_type, vect_widened_op_tree): Optionally
> ignore
> > mismatch types.
> > (vect_recog_dot_prod_pattern): Support usdot_prod_optab.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index
> >
> d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..baf20416e63745097825fc30fd
> f2
> > e66bc80d7d23 100644
> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -5440,11 +5440,13 @@ Like @samp{fold_left_plus_@var{m}}, but
> takes
> > an additional mask operand  @item @samp{sdot_prod@var{m}}  @cindex
> > @code{udot_prod@var{m}} instruction pattern  @itemx
> > @samp{udot_prod@var{m}}
> > +@cindex @code{usdot_prod@var{m}} instruction pattern @itemx
> > +@samp{usdot_prod@var{m}}
> >  Compute the sum of the products of two signed/unsigned elements.
> > -Operand 1 and operand 2 are of the same mode. Their product, which is
> > of a -wider mode, is computed and added to operand 3. Operand 3 is of
> > a mode equal or -wider than the mode of the product. The result is
> > placed in operand 0, which -is of the same mode as operand 3.
> > +Operand 1 and operand 2 are of the same mode but may differ in signs.
> > +Their product, which is of a wider mode, is computed and added to
> operand 3.
> > +Operand 3 is of a mode equal or wider than the mode of the product.
> > +The result is placed in operand 0, which is of the same mode as operand 3.
> 
> This doesn't really say what the 's', 'u' and 'us' specify.  Since we're 
> doing a
> widen multiplication and then a non-widening addition we only need to
> know the effective sign of the multiplication so I think the existing 's' and 
> 'u'
> are enough to cover all cases?

The existing 's' and 'u' enforce that both operands of the multiplication are 
of the
same sign.  So for e.g. 'u' both operand must be unsigned.

In the `us` case one can be signed and one unsigned. Operationally this does a 
sign
extension to the wider type for the signed value, and the unsigned value gets 
zero extended
first, and then converts it to unsigned to perform the
unsigned multiplication, conforming to the C promotion rules.

TL;DR; Without a new optab I can't tell during expansion which semantic the 
operation
had at the gimple/C level as modes don't carry signs.

Long version:

The problem with using the existing patterns, because of their enforcement of 
`av` and `bv` being
the same sign is that we can't remove the explicit sign extensions, but the 
multiplication must be done
on the sign/zero extended char input in the same sign.

Which means (unless I am 

Re: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-05-07 Thread Richard Biener
On Wed, 5 May 2021, Tamar Christina wrote:

> Hi All,
> 
> This patch adds support for a dot product where the sign of the multiplication
> arguments differ. i.e. one is signed and one is unsigned but the precisions 
> are
> the same.
> 
> #define N 480
> #define SIGNEDNESS_1 unsigned
> #define SIGNEDNESS_2 signed
> #define SIGNEDNESS_3 signed
> #define SIGNEDNESS_4 unsigned
> 
> SIGNEDNESS_1 int __attribute__ ((noipa))
> f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict a,
>SIGNEDNESS_4 char *restrict b)
> {
>   for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> {
>   int av = a[i];
>   int bv = b[i];
>   SIGNEDNESS_2 short mult = av * bv;
>   res += mult;
> }
>   return res;
> }
> 
> The operations are performed as if the operands were extended to a 32-bit 
> value.
> As such this operation isn't valid if there is an intermediate conversion to 
> an
> unsigned value. i.e.  if SIGNEDNESS_2 is unsigned.
> 
> more over if the signs of SIGNEDNESS_3 and SIGNEDNESS_4 are flipped the same
> optab is used but the operands are flipped in the optab expansion.
> 
> To support this the patch extends the dot-product detection to optionally
> ignore operands with different signs and stores this information in the optab
> subtype which is now made a bitfield.
> 
> The subtype can now additionally controls which optab an EXPR can expand to.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * optabs.def (usdot_prod_optab): New.
>   * doc/md.texi: Document it.
>   * optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
>   * optabs-tree.h (enum optab_subtype): Likewise.
>   * optabs.c (expand_widen_pattern_expr): Likewise.
>   * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
>   * tree-vect-loop.c (vect_determine_dot_kind): New.
>   (vectorizable_reduction): Query dot-product kind.
>   * tree-vect-patterns.c (vect_supportable_direct_optab_p): Take optional
>   optab subtype.
>   (vect_joust_widened_type, vect_widened_op_tree): Optionally ignore
>   mismatch types.
>   (vect_recog_dot_prod_pattern): Support usdot_prod_optab.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 
> d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..baf20416e63745097825fc30fdf2e66bc80d7d23
>  100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5440,11 +5440,13 @@ Like @samp{fold_left_plus_@var{m}}, but takes an 
> additional mask operand
>  @item @samp{sdot_prod@var{m}}
>  @cindex @code{udot_prod@var{m}} instruction pattern
>  @itemx @samp{udot_prod@var{m}}
> +@cindex @code{usdot_prod@var{m}} instruction pattern
> +@itemx @samp{usdot_prod@var{m}}
>  Compute the sum of the products of two signed/unsigned elements.
> -Operand 1 and operand 2 are of the same mode. Their product, which is of a
> -wider mode, is computed and added to operand 3. Operand 3 is of a mode equal 
> or
> -wider than the mode of the product. The result is placed in operand 0, which
> -is of the same mode as operand 3.
> +Operand 1 and operand 2 are of the same mode but may differ in signs. Their
> +product, which is of a wider mode, is computed and added to operand 3.
> +Operand 3 is of a mode equal or wider than the mode of the product. The
> +result is placed in operand 0, which is of the same mode as operand 3.

This doesn't really say what the 's', 'u' and 'us' specify.  Since
we're doing a widen multiplication and then a non-widening addition
we only need to know the effective sign of the multiplication so
I think the existing 's' and 'u' are enough to cover all cases?

The tree.def docs say the sum is also possibly widening but I don't see
this covered by the optab so we should eventually remove this
feature from the tree side.  In fact the tree-cfg.c verifier requires
the addition to be not widening - thus only tree.def needs adjustment.

>  @cindex @code{ssad@var{m}} instruction pattern
>  @item @samp{ssad@var{m}}
> diff --git a/gcc/optabs-tree.h b/gcc/optabs-tree.h
> index 
> c3aaa1a416991e856d3e24da45968a92ebada82c..ebc23ac86fe99057f375781c2f1990e0548ba08d
>  100644
> --- a/gcc/optabs-tree.h
> +++ b/gcc/optabs-tree.h
> @@ -27,11 +27,29 @@ along with GCC; see the file COPYING3.  If not see
> shift amount vs. machines that take a vector for the shift amount.  */
>  enum optab_subtype
>  {
> -  optab_default,
> -  optab_scalar,
> -  optab_vector
> +  optab_default = 1 << 0,
> +  optab_scalar = 1 << 1,
> +  optab_vector = 1 << 2,
> +  optab_signed_to_unsigned = 1 << 3,
> +  optab_unsigned_to_signed = 1 << 4
>  };
>  
> +/* Override the OrEqual-operator so we can use optab_subtype as a bit flag.  
> */
> +inline enum optab_subtype&
> +operator |= (enum optab_subtype& a, enum optab_subtype b)
> +{
> +return a = static_cast(static_cast(a)
> +   | static_cast(b));
> +}
> +
> +/* Override the Or-operator so we can