> I don't think we handle
>
> mem = foo ();
Hmm, indeed we can't handle that. For example:
struct val {int a, b;};

[[gnu::noinline]]
struct val
dup (int *a)
{
  a[0] = a[1];
  struct val ret;
  ret.a = a[2];
  ret.b = a[3];
  return ret;
}

int
test (int *b)
{
  struct val ret = dup (b);
  struct val ret2 = dup (b);
  return ret.a + ret.b + ret2.a + ret2.b;
}
Also for
struct val {int a, b;} v, v1, v2;

void
test ()
{
  v1 = v;
  v2 = v;
  if (v1.a != v2.a)
    __builtin_abort ();
}
We still have in the optimized dump:
void test ()
{
  int _1;
  int _2;

  <bb 2> [local count: 1073741824]:
  v1 = v;
  v2 = v;
  _1 = v1.a;
  _2 = v2.a;
  if (_1 != _2)
    goto <bb 3>; [0.00%]
  else
    goto <bb 4>; [100.00%]

  <bb 3> [count: 0]:
  __builtin_abort ();

  <bb 4> [local count: 1073741824]:
  return;
}
We eventually get rid of the abort in combine, but that is truly late.
I think it only gets optimized because val is a small structure. For a
bigger structure we would resort to memcpy and the RTL optimizers would
give up too.
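Something like this (a sketch; the size at which the aggregate copy is
expanded as a block copy or a memcpy call is target dependent) should
show it:

struct big {int a[32];} w, w1, w2;

void
test_big ()
{
  w1 = w;  /* for a structure this size the copy is typically
              expanded as a block copy or a memcpy call */
  w2 = w;
  if (w1.a[0] != w2.a[0])
    /* redundant, but hidden from the RTL optimizers */
    __builtin_abort ();
}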
I thought VN was able to handle this by assigning value numbers to store
destinations, but I see it only works because most stores have a VN for
their RHS.
> correctly in VN, the && lhs condition also guards this. Maybe
> instead refactor this and the check a few lines above to check
> (!lhs || TREE_CODE (lhs) == SSA_NAME)
OK, I thought this would be way too easy :)
So we need an SSA lhs, and we need to check that the memory defined by
the first call is not modified in between, as in the sketch below.
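I.e. something like this (a sketch reusing dup from the first example;
the intervening store is hypothetical, any store that may alias the
memory dup reads would do):

int
test2 (int *b)
{
  struct val ret = dup (b);
  b[2] = 1;  /* changes memory dup reads, so the second call may
                return a different value and must be kept */
  struct val ret2 = dup (b);
  return ret.a + ret.b + ret2.a + ret2.b;
}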
>
> ? The VUSE->VDEF chain walking also doesn't consider the call
> having memory side-effects since it effectively skips intermittent
> uses. So I believe you have to adjust (or guard) that as well,
> alternatively visit all uses for each def walked.
>
> > && gimple_vuse (stmt)
> > && (((summary = get_modref_function_summary (stmt, NULL))
> > && !summary->global_memory_read
> > @@ -6354,19 +6352,18 @@ visit_stmt (gimple *stmt, bool backedges_varying_p
> > = false)
> >
> > /* Pick up flags from a devirtualization target. */
> > tree fn = gimple_call_fn (stmt);
> > - int extra_fnflags = 0;
> > if (fn && TREE_CODE (fn) == SSA_NAME)
> > {
> > fn = SSA_VAL (fn);
> > if (TREE_CODE (fn) == ADDR_EXPR
> > && TREE_CODE (TREE_OPERAND (fn, 0)) == FUNCTION_DECL)
> > - extra_fnflags = flags_from_decl_or_type (TREE_OPERAND (fn, 0));
> > + fn = TREE_OPERAND (fn, 0);
> > + else
> > + fn = NULL;
> > }
> > - if ((/* Calls to the same function with the same vuse
> > - and the same operands do not necessarily return the same
> > - value, unless they're pure or const. */
> > - ((gimple_call_flags (call_stmt) | extra_fnflags)
> > - & (ECF_PURE | ECF_CONST))
> > + else
> > + fn = NULL;
> > + if ((ipa_modref_can_remove_redundat_calls (call_stmt, fn)
> > /* If calls have a vdef, subsequent calls won't have
> > the same incoming vuse. So, if 2 calls with vdef have the
> > same vuse, we know they're not subsequent.
>
> With disregarding VDEF this comment is now wrong (it's directed at
> tail-merging btw).
>
> I'll note that elimination will only be able to "DCE" calls with a
> LHS since "DCE" happens by replacing the RHS. That's also what the
> && lhs check is about - we don't do anything useful during elimination
> when there's no LHS but the call itself is present in the expression
> hash.
It would be nice to handle this, since a lot of C code "returns"
aggregates via an output parameter, as in the sketch below.
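E.g. (a sketch; get_val is a made-up function standing in for the idiom):

void get_val (struct val *out);  /* fills *out from global state */

int
test3 (void)
{
  struct val r;
  get_val (&r);
  int x = r.a;
  get_val (&r);  /* same argument, no intervening stores: redundant,
                    but there is no LHS for elimination to replace */
  return x + r.a;
}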
Note that Jens proposed to file a defect report against C23 to allow
removal of reproducible/unsequenced calls, which would make it possible
to DCE/DSE them as well. This looks like a good idea to me.
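For reference, a sketch of what the annotations look like (the C23
attributes from Jens' papers; unlike pure/const they can describe a
function that writes through its output parameter):

/* Effectless and idempotent: two calls with the same arguments and
   unchanged memory can be merged even though *out is written.  */
[[reproducible]] void get_val (struct val *out);

/* Stronger, roughly const-like: the result depends only on the
   arguments.  */
[[unsequenced]] int my_hash (int);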
Honza
>
> Richard.