https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89567

--- Comment #5 from Martin Jambor <jamborm at gcc dot gnu.org> ---
(In reply to Eyal Rozenberg from comment #4)
> > In the first example, the interprocedural constant propagation pass
> > (IPA-CP) found that foo1 is so small that copying all of it might be
> > worth not passing the unused argument and so it does, that is why
> > you'll find function foo1 twice in the assembly. 
> 
> Why does this have anything to do with constant propagation? I also don't
> understand the sense in two identical copies.

The transformation literally removes a parameter from a function.  All
(direct) callers in the same compilation unit then call the new clone,
while all indirect calls and callers from other compilation units call
the old one, with the old calling convention.  I understand that in your
simple testcase that does not matter but in others it might (and
IPA-CP is a high-level pass that does not know about physical
registers, calling conventions etc.).
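
To illustrate at the source level, here is a minimal sketch of what the
result amounts to.  This is a made-up stand-in, not the testcase from
the report: the bodies and the name foo1_without_unused are
hypothetical, and the real clone is generated internally by the
compiler rather than written by hand.

```c
/* Hypothetical stand-in for the testcase; not the code from the report. */

/* Externally visible: indirect calls and callers in other compilation
   units must keep using this signature, so this copy has to stay. */
int foo1(int used, int unused)
{
    (void)unused;
    return used + 1;
}

/* Hand-written equivalent of the clone: same body, dead parameter gone.
   The name is invented for this sketch. */
static int foo1_without_unused(int used)
{
    return used + 1;
}

/* A direct caller in the same unit, as it looks after being redirected
   to the clone. */
int bar1(int x)
{
    return foo1_without_unused(x);
}
```

Both copies of the body end up in the output, which is why foo1 shows
up twice in the assembly.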

> 
> It also sounds like "the wrong optimization" is being used if it's not about
> noticing unused parameters.

You can call it that if you like.  It was just easy to add there, it
does a good job and it has no practical disadvantages.

> 
> > This functionality
> > in the pass is there just "on the side" and it is not easy to make it
> > also work with aggregates, not even desirable (that is the job of a
> > different pass, see below).
> >
> > Both examples are compiled better if you make foo1 and foo2 static.
> 
> This really makes no sense to me! bar() is not affected by other TUs at
> all...

IPA-SRA primarily changes foo.  

> 
> > In the latter case, you get exactly what you want, the structure is
> > split and only the used part survives.  In the first example, you
> > don't get a clone emitted which you probably don't need.  Both of
> > these transformations are done by a pass called interprocedural scalar
> > replacement of aggregates (IPA-SRA), which specifically also aims to
> > remove unused arguments, but it never creates multiple clones.
> 
> I like this pass :-) ... so, why does it work for the static case with
> bar2() but doesn't work with bar1() ?

I don't understand your question; just make foo1 and/or foo2 static
and it will trigger.  The pass needs to adjust all callers and
therefore only works on static functions, because otherwise there may
be other calls in other compilation units.
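
As a rough sketch of the static case (again with hypothetical code,
not the reported testcase; struct pair, foo2_split and bar2_after are
names invented here), the first pair of functions is what one would
write and the second pair is approximately what the first can
effectively become once the aggregate is split:

```c
/* Hypothetical example; names and bodies are assumptions. */
struct pair { int used; int unused; };

/* static: every caller is visible in this compilation unit, so the
   signature can be changed in place without keeping the old one. */
static int foo2(struct pair p)
{
    return p.used * 2;
}

int bar2(int x)
{
    struct pair p = { x, 0 };
    return foo2(p);
}

/* Source-level equivalent of the transformed code: only the used
   part of the structure is still passed. */
static int foo2_split(int used)
{
    return used * 2;
}

int bar2_after(int x)
{
    return foo2_split(x);
}
```

No second copy is needed here because nobody outside the unit can
call foo2 with the old signature.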

> 
> > I'm afraid you'd need to provide a strong real-world use-case to make
> > me investigate how to make IPA-SRA clone so you might not need static
> > and/or LTO because that would mean devising a cost/benefit
> > (size/speedup) heuristics and that is not easy.
> 
> For now I'm just trying to understand why this isn't already happening. Then
> I'll perhaps try to understand why clang does do this.
> 
> But - don't necessarily clone. IIUC,  cloning would possibly mean removing
> that parameter even though it's a field of a struct. But even if you _don't_
> clone, functions calling foo() should still not have to initialize that
> member. It seems like we're talking about different optimizations.

Indeed, what you really have in mind is an IPA-DSE (DSE stands for
dead store elimination), which would only affect callers.  We don't
have that; it might be an alternative to IPA-SRA when we cannot or do
not want to clone.
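
A sketch of what such a caller-only transformation would mean, purely
as an illustration: GCC has no such pass, the code is hypothetical,
and I pass the structure by pointer here only to keep the example
well defined in C.

```c
/* Hypothetical example of the effect an IPA-DSE would have. */
struct pair { int used; int unused; };

/* Not static and not cloned: the signature stays exactly as it is. */
int foo(const struct pair *p)
{
    return p->used + 1;   /* p->unused is never read */
}

/* Caller as written: both members are stored. */
int bar_before(int x)
{
    struct pair p;
    p.used = x;
    p.unused = 123;       /* dead, given what foo's body does */
    return foo(&p);
}

/* Caller as an IPA-DSE could leave it: foo is untouched, only the
   store to the member it never reads has been dropped. */
int bar_after(int x)
{
    struct pair p;
    p.used = x;
    return foo(&p);
}
```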
