Hi,
Thank you both for the reviewing. I've updated the patch and also added
a test (to the gcc.dg to avoid duplication). I'll commit the patch shortly.
Thanks,
Yufeng
gcc/
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Call
has_single_use () and not do the conversion if has_single_use ()
returns false for the multiplication result.
gcc/testsuite/
* gcc.dg/wmul-1.c: New test.
On 10/23/13 10:42, Richard Biener wrote:
On Tue, Oct 22, 2013 at 12:01 AM, Yufeng Zhang<yufeng.zh...@arm.com> wrote:
Hi,
This patch changes the widening_mul pass to fuse the widening multiply with
accumulate only when the multiply has single use. The widening_mul pass
currently does the conversion regardless of the number of the uses, which
can cause poor code-gen in cases like the following:
typedef int ArrT [10][10];
void
foo (ArrT Arr, int Idx)
{
Arr[Idx][Idx] = 1;
Arr[Idx + 10][Idx] = 2;
}
On AArch64, after widening_mul, the IR is like
_2 = (long unsigned int) Idx_1(D);
_3 = Idx_1(D) w* 40;<----
_5 = Arr_4(D) + _3;
*_5[Idx_1(D)] = 1;
_8 = WIDEN_MULT_PLUS_EXPR<Idx_1(D), 40, 400>;<----
_9 = Arr_4(D) + _8;
*_9[Idx_1(D)] = 2;
Where the arrows point, there are redundant widening multiplies.
Bootstrap successfully on x86_64.
The patch passes the regtest on aarch64, arm and x86_64.
OK for the trunk?
if (!is_widening_mult_p (rhs1_stmt,&type1,&mult_rhs1,
-&type2,&mult_rhs2))
+&type2,&mult_rhs2)
+ || !has_single_use (rhs1))
please check has_single_use first, it's the cheaper check.
Ok with that change (and possibly a testcase).
Thanks,
Richard.
Thanks,
Yufeng
p.s. Note that x86_64 doesn't suffer from this issue as the corresponding
widening multiply accumulate op is not available on the target.
gcc/
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Call
has_single_use () and not do the conversion if has_single_use ()
returns false for the multiplication result.
diff --git a/gcc/testsuite/gcc.dg/wmul-1.c b/gcc/testsuite/gcc.dg/wmul-1.c
new file mode 100644
index 0000000..3e762f4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/wmul-1.c
@@ -0,0 +1,19 @@
+/* Not to fuse widening multiply with accumulate if the multiply has more than
+ one uses.
+ Note that for targets where pointer and int are of the same size or
+ widening multiply-and-accumulate is not available, this test just passes.
*/
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-widening_mul" } */
+
+typedef int ArrT [10][10];
+
+void
+foo (ArrT Arr, int Idx)
+{
+ Arr[Idx][Idx] = 1;
+ Arr[Idx + 10][Idx] = 2;
+}
+
+/* { dg-final { scan-tree-dump-not "WIDEN_MULT_PLUS_EXPR" "widening_mul" } } */
+/* { dg-final { cleanup-tree-dump "widening_mul" } } */
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index f7f8ec9..77701ae 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2425,20 +2425,25 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi,
gimple stmt,
It might also appear that it would be sufficient to use the existing
operands of the widening multiply, but that would limit the choice of
- multiply-and-accumulate instructions. */
+ multiply-and-accumulate instructions.
+
+ If the widened-multiplication result has more than one uses, it is
+ probably wiser not to do the conversion. */
if (code == PLUS_EXPR
&& (rhs1_code == MULT_EXPR || rhs1_code == WIDEN_MULT_EXPR))
{
- if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
- &type2, &mult_rhs2))
+ if (!has_single_use (rhs1)
+ || !is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
+ &type2, &mult_rhs2))
return false;
add_rhs = rhs2;
conv_stmt = conv1_stmt;
}
else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
{
- if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1,
- &type2, &mult_rhs2))
+ if (!has_single_use (rhs2)
+ || !is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1,
+ &type2, &mult_rhs2))
return false;
add_rhs = rhs1;
conv_stmt = conv2_stmt;