https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118727
--- Comment #17 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Xi Ruoyao from comment #16)
> (In reply to Tamar Christina from comment #15)
> > (In reply to Xi Ruoyao from comment #13)
> > > For example for the original gcc.dg/pr108692.c:
> > >
> > > a.0_4 = (unsigned char) a_14;
> > > _5 = (int) a.0_4;
> > > b.1_6 = (unsigned char) b_16;
> > > _7 = (int) b.1_6;
> > > c_17 = _5 - _7;
> > > _8 = ABS_EXPR <c_17>;
> > > r_18 = _8 + r_23;
> > >
> > > becomes:
> > >
> > > patt_31 = .ABD (a.0_4, b.1_6);
> > >
> > > so when we get the ABD we already stripped the redundant (int) conversion
> > > here, note that the (unsigned char) conversion is really needed. And for
> > > my
> > > twisted test case:
> > >
> > > a.0_4 = (unsigned char) a_14;
> > > _5 = (unsigned int) a.0_4;
> > > _6 = (unsigned int) b_16;
> > > _7 = _5 - _6;
> > > c_17 = (int) _7;
> > > _8 = ABS_EXPR <c_17>;
> > >
> > > becomes:
> > >
> > > patt_38 = (unsigned short) a.0_4;
> > > patt_35 = (signed short) b_16;
> > > patt_36 = (signed short) patt_38;
> > > patt_33 = .ABD (patt_36, patt_35);
> > >
> > > To me this is already the "optimal" way. Correct me if I'm wrong, but
> > > anyway even if there's a redundant conversion it should be stripped in
> > > vect_recog_abd_pattern instead of vect_recog_sad_pattern.
> >
> > No, that's not correct. SAD_EXPR is the most optimal version here as
> > SAD_EXPR performs double widening. It removes the need for the intermediate
> > vector widening.
> >
> > a.0_4 = (unsigned char) a_14;
> > patt_b_16 = (unsigned char) b_16
> > patt_41 = SAD_EXPR <a.0_4, patt_b_16, r_23>;
> >
> > is the most optimal form, as it does the operation on bytes rather than
> > short.
>
> I don't think it's correct... In the twisted test case we have something
> like
>
> r += abs((unsigned int)(unsigned char) a - (signed int)(signed char) b)
>
> So if a is 255 and b is 128, we should add r with abs(255 - (-128)) = 383,
> but with this form we'll add it with 127.
The operation you posted above is an unsigned operation, in the original
expression the subtract is done on zero extended bytes.
> > > so when we get the ABD we already stripped the redundant (int) conversion
> > > here, note that the (unsigned char) conversion is really needed. And for
> > > my
> > > twisted test case:
> > >
> > > a.0_4 = (unsigned char) a_14;
> > > _5 = (unsigned int) a.0_4;
> > > _6 = (unsigned int) b_16;
> > > _7 = _5 - _6;
> > > c_17 = (int) _7;
(In reply to Xi Ruoyao from comment #16)
> (In reply to Tamar Christina from comment #15)
> > (In reply to Xi Ruoyao from comment #13)
> > > For example for the original gcc.dg/pr108692.c:
> > >
> > > a.0_4 = (unsigned char) a_14;
> > > _5 = (int) a.0_4;
> > > b.1_6 = (unsigned char) b_16;
> > > _7 = (int) b.1_6;
> > > c_17 = _5 - _7;
> > > _8 = ABS_EXPR <c_17>;
> > > r_18 = _8 + r_23;
> > >
> > > becomes:
> > >
> > > patt_31 = .ABD (a.0_4, b.1_6);
> > >
> > > so when we get the ABD we already stripped the redundant (int) conversion
> > > here, note that the (unsigned char) conversion is really needed. And for
> > > my
> > > twisted test case:
> > >
> > > a.0_4 = (unsigned char) a_14;
> > > _5 = (unsigned int) a.0_4;
> > > _6 = (unsigned int) b_16;
> > > _7 = _5 - _6;
> > > c_17 = (int) _7;
> > > _8 = ABS_EXPR <c_17>;
> > >
> > > becomes:
> > >
> > > patt_38 = (unsigned short) a.0_4;
> > > patt_35 = (signed short) b_16;
> > > patt_36 = (signed short) patt_38;
> > > patt_33 = .ABD (patt_36, patt_35);
> > >
> > > To me this is already the "optimal" way. Correct me if I'm wrong, but
> > > anyway even if there's a redundant conversion it should be stripped in
> > > vect_recog_abd_pattern instead of vect_recog_sad_pattern.
> >
> > No, that's not correct. SAD_EXPR is the most optimal version here as
> > SAD_EXPR performs double widening. It removes the need for the intermediate
> > vector widening.
> >
> > a.0_4 = (unsigned char) a_14;
> > patt_b_16 = (unsigned char) b_16
> > patt_41 = SAD_EXPR <a.0_4, patt_b_16, r_23>;
> >
> > is the most optimal form, as it does the operation on bytes rather than
> > short.
>
> I don't think it's correct... In the twisted test case we have something
> like
>
> r += abs((unsigned int)(unsigned char) a - (signed int)(signed char) b)
>
> So if a is 255 and b is 128, we should add r with abs(255 - (-128)) = 383,
> but with this form we'll add it with 127.
Ah right, because b didn't have an explicit cast to unsigned char first and
it's unsigned in this case.
I was however arguing about:
int
foo (unsigned char *x, unsigned char *y, int n)
{
int i, r = 0;
unsigned char a, b;
for (i = 0; i < n; i++)
{
a = x[i];
b = y[i];
int c = (unsigned char) a - (unsigned char) b;
r = r + (c < 0 ? -c : c);
}
return r;
}
where vect_recog_over_widening_pattern will split the zero extension into:
/app/example.c:6:17: note: Splitting statement: _4 = (int) a_12;
/app/example.c:6:17: note: into pattern statements: patt_43 = (unsigned
short) a_12;
/app/example.c:6:17: note: and: patt_42 = (int) patt_43;
and so SAD_EXPR needs to recover from the intermediate split.
However you're right that ABD already does this correctly because
vect_recog_absolute_difference handles it through vect_widened_op_tree.
So indeed I agree that the calls are redundant for vect_recog_sad_pattern