On Mon, 18 Nov 2024, Jan Hubicka wrote:
> > I don't think we handle
> >
> > mem = foo ();
>
> Hmm, we can't handle e.g.:
> struct val {int a,b;};
>
> [[gnu::noinline]]
> struct val dup(int *a)
> {
>   a[0] = a[1];
>   struct val ret;
>   ret.a = a[2];
>   ret.b = a[3];
>   return ret;
> }
> int
> test (int *b)
> {
>   struct val ret = dup (b);
>   struct val ret2 = dup (b);
>   return ret.a + ret.b + ret2.a + ret2.b;
> }
>
> Also for
>
> struct val {int a,b;} v,v1,v2;
>
> void
> test ()
> {
>   v1 = v;
>   v2 = v;
>   if (v1.a != v2.a)
>     __builtin_abort ();
> }
>
> We still have in the optimized dump:
>
> void test ()
> {
>   int _1;
>   int _2;
>
>   <bb 2> [local count: 1073741824]:
>   v1 = v;
>   v2 = v;
>   _1 = v1.a;
>   _2 = v2.a;
>   if (_1 != _2)
>     goto <bb 3>; [0.00%]
>   else
>     goto <bb 4>; [100.00%]
>
>   <bb 3> [count: 0]:
>   __builtin_abort ();
>
>   <bb 4> [local count: 1073741824]:
>   return;
>
> }
>
> We eventually get rid of the abort in combine, but that is truly late.
> I think it only gets optimized because val is a small structure. For a
> bigger structure we would resort to memcpy, and the RTL optimizers would
> give up too.
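>
> For example, something like this (a sketch; names are made up and the
> exact size at which the copy becomes a memcpy depends on the target):
>
> struct big {int a[64];} w, w1, w2;
>
> void
> test_big ()
> {
>   w1 = w;                  /* large aggregate copy, may become a memcpy call */
>   w2 = w;
>   if (w1.a[0] != w2.a[0])
>     __builtin_abort ();    /* stays around once the copies are opaque calls */
> }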
>
> I thought VN was able to handle this by assigning value numbers to store
> destinations, but I see that only happens because most stores have a VN
> for their RHS.
>
> > correctly in VN, the && lhs condition also guards this. Maybe
> > instead refactor this and the check a few lines above to check
> > (!lhs || TREE_CODE (lhs) == SSA_NAME)
>
> OK, I thought this would be way too easy :)
> So we need an SSA lhs, and we need to check that the memory defined by
> the first call is not modified in between.
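>
> For example (a hypothetical variant of the testcase above, reusing dup,
> which stores to a[0]):
>
> int
> test2 (int *b)
> {
>   struct val r1 = dup (b);
>   b[0] = 99;                    /* overwrites the store done by the first call */
>   struct val r2 = dup (b);      /* must redo b[0] = b[1]; not removable */
>   return r1.a + r2.a + b[0];
> }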
> >
> > ? The VUSE->VDEF chain walking also doesn't consider the call
> > having memory side-effects since it effectively skips intermittent
> > uses. So I believe you have to adjust (or guard) that as well,
> > alternatively visit all uses for each def walked.
> >
> > > && gimple_vuse (stmt)
> > > && (((summary = get_modref_function_summary (stmt, NULL))
> > > && !summary->global_memory_read
> > > @@ -6354,19 +6352,18 @@ visit_stmt (gimple *stmt, bool backedges_varying_p = false)
> > >
> > > /* Pick up flags from a devirtualization target. */
> > > tree fn = gimple_call_fn (stmt);
> > > - int extra_fnflags = 0;
> > > if (fn && TREE_CODE (fn) == SSA_NAME)
> > > {
> > > fn = SSA_VAL (fn);
> > > if (TREE_CODE (fn) == ADDR_EXPR
> > > && TREE_CODE (TREE_OPERAND (fn, 0)) == FUNCTION_DECL)
> > > - extra_fnflags = flags_from_decl_or_type (TREE_OPERAND (fn, 0));
> > > + fn = TREE_OPERAND (fn, 0);
> > > + else
> > > + fn = NULL;
> > > }
> > > - if ((/* Calls to the same function with the same vuse
> > > - and the same operands do not necessarily return the same
> > > - value, unless they're pure or const. */
> > > - ((gimple_call_flags (call_stmt) | extra_fnflags)
> > > - & (ECF_PURE | ECF_CONST))
> > > + else
> > > + fn = NULL;
> > > + if ((ipa_modref_can_remove_redundat_calls (call_stmt, fn)
> > > /* If calls have a vdef, subsequent calls won't have
> > > the same incoming vuse. So, if 2 calls with vdef have the
> > > same vuse, we know they're not subsequent.
> >
> > With VDEF disregarded this comment is now wrong (it's directed at
> > tail-merging, btw).
> >
> > I'll note that elimination will only be able to "DCE" calls with a
> > LHS since "DCE" happens by replacing the RHS. That's also what the
> > && lhs check is about - we don't do anything useful during elimination
> > when there's no LHS but the call itself is present in the expression
> > hash.
>
> It would be nice to handle this, since a lot of code in C "returns"
> aggregates via an output parameter.
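>
> E.g. (a sketch; the helper is hypothetical):
>
> void dup_out (int *a, struct val *out);    /* "returns" via *out */
>
> int
> test3 (int *b)
> {
>   struct val r1, r2;
>   dup_out (b, &r1);
>   dup_out (b, &r2);    /* no SSA lhs, so elimination has nothing to replace */
>   return r1.a + r2.a;
> }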
I don't disagree - I just wanted to note your patch seems to have
(wrong-code) problems.
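
For example, I'd expect a testcase along these lines (hypothetical) to be
miscompiled if the two calls were treated as equal, since the first call's
store feeds the second call's result:

struct val { int a, b; };

[[gnu::noinline]]
struct val bump (int *p)
{
  p[0]++;                       /* side effect observed by the next call */
  struct val r;
  r.a = p[0];
  r.b = p[1];
  return r;
}

int
test (int *q)
{
  struct val r1 = bump (q);
  struct val r2 = bump (q);     /* r2.a == r1.a + 1, so merging is wrong */
  return r2.a - r1.a;           /* must evaluate to 1, not 0 */
}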
Richard.
> Note that Jens proposed to file a defect report against C23 to allow removal
> of reproducible/unsequenced calls, which would make it possible to DCE/DSE
> them as well. This looks like a good idea to me.
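>
> Roughly (a sketch assuming the C23 [[reproducible]] attribute; names are
> made up):
>
> struct point { int x, y; };
>
> [[reproducible]]
> void get_origin (struct point *out);
>
> void
> use (void)
> {
>   struct point p;
>   get_origin (&p);   /* result never used; the proposal would allow DCEing this */
> }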
>
> Honza
> >
> > Richard.
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)