https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118217
Bug ID: 118217
Summary: Dot-product for square on difference of two small type integers
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: fxue at os dot amperecomputing.com
Target Milestone: ---
Consider a case:
int foo(const signed char *a, const signed char *b, int n)
{
  int sum = 0;
  for (int i = 0; i < n; ++i) {
    int diff = a[i] - b[i];
    sum += diff * diff;
  }
  return sum;
}
Here, "diff" is only used in a square expression. On an architecture that has
an absolute-difference instruction (IFN_ABD), such as aarch64, we can treat
"diff" as if it were wrapped in a hidden abs(). This gives the same result as
the original, because abs(diff) * abs(diff) == diff * diff. The advantage of
this transformation is that abs(diff) can be computed with an ABD instruction,
whose result has the same width as its two operands (unsigned, since the
absolute difference of two signed chars is at most 255). This exposes an
opportunity to generate a more compact dot-product that avoids the type
conversions. The generated code could then look like:
int foo(const signed char *a, const signed char *b, int n)
{
  vector(4) int v_sum = { 0, 0, 0, 0 };

  /* Main vector loop only; the epilogue for n % 16 != 0 is omitted.  */
  for (int i = 0; i < n; i += 16) {
    vector(16) signed char v_a = *(vector(16) signed char *)(&a[i]);
    vector(16) signed char v_b = *(vector(16) signed char *)(&b[i]);
    vector(16) unsigned char v_diff = IFN_ABD(v_a, v_b);
    /* DOT_PROD_EXPR already accumulates into its third operand.  */
    v_sum = DOT_PROD_EXPR(v_diff, v_diff, v_sum);
  }
  return .REDUC_PLUS(v_sum);
}
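
As a sanity check of the equivalence the transformation relies on, here is a
minimal scalar C sketch. It is not GCC code: abd_s8, sum_sq_diff_ref,
sum_sq_diff_abd and count_mismatches are hypothetical names introduced only
for illustration, and abd_s8 is a plain-C stand-in for what IFN_ABD would
compute per element.

```c
#include <assert.h>
#include <limits.h>

/* Reference: the loop from the report, widening each element to int. */
int sum_sq_diff_ref(const signed char *a, const signed char *b, int n)
{
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        int diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

/* Scalar stand-in (assumption) for IFN_ABD on signed char operands.
   |x - y| is at most 127 - (-128) = 255, so it fits in unsigned char. */
static unsigned char abd_s8(signed char x, signed char y)
{
    int d = x > y ? x - y : y - x;
    assert(d >= 0 && d <= UCHAR_MAX); /* range check behind the narrowing */
    return (unsigned char)d;
}

/* Same sum, computed from the 8-bit absolute difference. */
int sum_sq_diff_abd(const signed char *a, const signed char *b, int n)
{
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        unsigned char d = abd_s8(a[i], b[i]);
        sum += d * d; /* d is promoted to int; 255 * 255 fits easily */
    }
    return sum;
}

/* Returns the number of (x, y) pairs where the two forms disagree,
   checked exhaustively over every pair of signed char values. */
int count_mismatches(void)
{
    int bad = 0;
    for (int x = -128; x <= 127; ++x)
        for (int y = -128; y <= 127; ++y) {
            signed char a = (signed char)x, b = (signed char)y;
            if (sum_sq_diff_ref(&a, &b, 1) != sum_sq_diff_abd(&a, &b, 1))
                bad++;
        }
    return bad;
}
```

Calling count_mismatches() compares both forms over all 65536 input pairs, so
the unsigned 8-bit intermediate loses no information in the scalar model. It
says nothing about whether the vectorizer can match the pattern, only that
the rewrite is value-preserving.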