On Thu, Sep 17, 2015 at 5:58 PM, Simon Dardis <simon.dar...@imgtec.com> wrote:
I've since taken another look at this recently and I've tracked the issue down
to
tree-predcom.c, specifically ref_at_iteration almost always generating MEM_REFs.
With MEM_REFs, GCC's RTL GCSE cannot compare them as equal and hence
remove them. A previous version of the code did generate ARRAY_REFs
(pre 204458), but that was changed to generate MEM_REFs for pr/58653.
Ok, so the issue is the full redundancy in
<bb 5>:
# count_28 = PHI <0(4), count_23(7)>
if (pretmp_42 > 1.0e+0)
goto <bb 6>;
else
goto <bb 9>;
<bb 6>:
_8 = 1.0e+0 / pretmp_42;
_12 = _8 * _8;
poly[1] = _12;
<bb 7>:
# prephitmp_30 = PHI <_12(6), _36(9)>
# T_lsm.8_22 = PHI <_8(6), pretmp_42(9)>
poly_I_lsm0.10_38 = MEM[(double *)&poly + 8B];
...
if (count_23 != iterations_6(D))
goto <bb 5>;
else
goto <bb 8>;
<bb 9>:
_36 = pretmp_42 * pretmp_42;
poly[1] = _36;
goto <bb 7>;
?
On x86_64 I have only one load from poly[1] left in cprop1, the one
using the MEM_REF. It's
(insn 31 30 32 6 (set (reg:DF 93 [ poly_I_lsm0.10 ])
(mem/c:DF (const:DI (plus:DI (symbol_ref:DI ("poly") [flags
0x2] <var_decl 0x7f6eab532e10 poly>)
(const_int 8 [0x8]))) [1 MEM[(double *)&poly +
8B]+0 S8 A64])) 126 {*movdf_internal}
(nil))
vs.
(insn 67 65 5 8 (set (mem/c:DF (const:DI (plus:DI (symbol_ref:DI
("poly") [flags 0x2] <var_decl 0x7f6eab532e10 poly>)
(const_int 8 [0x8]))) [1 poly+8 S8 A64])
(reg:DF 90)) t.c:22 126 {*movdf_internal}
for the partial redundancy across the backedge. I don't see why GCSE
should even care for the MEM_EXPR
here!
Would something like:
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -1409,7 +1409,21 @@ ref_at_iteration (data_reference_p dr, int iter,
gimple_seq *stmts)
addr, alias_ptr),
DECL_SIZE (field), bitsize_zero_node);
}
- else
+ /* Generate an ARRAY_REF for array references when all details are
INTEGER_CST
+ rather than a MEM_REF so that CSE passes can potientially optimize them.
*/
+ else if (TREE_CODE (DR_REF (dr)) == ARRAY_REF
+ && TREE_CODE (DR_STEP (dr)) == INTEGER_CST
+ && TREE_CODE (DR_INIT (dr)) == INTEGER_CST
+ && TREE_CODE (DR_OFFSET (dr)) == INTEGER_CST)
+ {
+ /* Reverse engineer the element from memory offset. */
+ tree offset = size_binop (MINUS_EXPR, coff, off);
+ tree sizdiv = TYPE_SIZE (TREE_TYPE (TREE_TYPE (DR_BASE_OBJECT (dr))));
+ sizdiv = div_if_zero_remainder (EXACT_DIV_EXPR, sizdiv, ssize_int
(BITS_PER_UNIT));
+ tree element = div_if_zero_remainder (EXACT_DIV_EXPR, offset, sizdiv);
+ if (element != NULL_TREE)
+ return build4 (ARRAY_REF, TREE_TYPE (DR_REF (dr)), DR_BASE_OBJECT
(dr),
+ element, NULL_TREE, NULL_TREE);
+ }
return fold_build2 (MEM_REF, TREE_TYPE (DR_REF (dr)), addr, alias_ptr);
be an appropriate start to fixing this? That fix appears to work in in my
testing.
No, that seems to be the wrong fix and in the light of wrong-code
issues we had with the middle-end
generating ARRAY_REFs even more so.
What we can do and what would maybe help this case is in
ref_at_iteration detect the DR_STEP == 0 case
and simply return unshare_expr (DR_REF (dr)) in that case. Well, iff
you don't want to explore the
GCSE issue...
Index: gcc/tree-predcom.c
===================================================================
--- gcc/tree-predcom.c (revision 227899)
+++ gcc/tree-predcom.c (working copy)
@@ -1386,6 +1386,10 @@ replace_ref_with (gimple stmt, tree new_
static tree
ref_at_iteration (data_reference_p dr, int iter, gimple_seq *stmts)
{
+ /* If the reference doesn't change just return it. */
+ if (integer_zerop (DR_STEP (dr)))
+ return unshare_expr (DR_REF (dr));
+
tree off = DR_OFFSET (dr);
tree coff = DR_INIT (dr);
if (iter == 0)
hmm, no, then we end up with
# prephitmp_30 = PHI <_12(6), _36(9)>
# T_lsm.8_22 = PHI <_8(6), pretmp_42(9)>
poly_I_lsm0.10_38 = poly[1];
_24 = MEM[(double *)&poly + 8B];
beause DOM has a similar issue preventing it from CSEing those two.