https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531

--- Comment #18 from Jan Hubicka <hubicka at ucw dot cz> ---
> different issue from the one that is raised in the PR.  (Unless we think that
> -O2 and -O3 should always have the same inlining heuristics henceforward, but
> that seems unlikely.)

Yes, I think the point of -O3 is to let the compiler be more aggressive
than what seems desirable for your average distro build defaults (which
need to balance speed and size).
> 
> At the moment, -O3 is essentially -O2 + some -f options + some --param
> options.  Users who want to pick & choose some of the -f options can do so,
> and can add them to stable build systems.  Normally, obsolete -f options are
> turned into no-ops rather than removed.  But users can't pick & choose the
> --params, and add them to stable build systems, because we reserve the right
> to remove --params without warning.

Moreover, those --params slowly change their meaning over time.  I
need to retune the inliner when early inlining gets smarter.
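
For reference, one way to see which --params currently differ between
two optimization levels is to compare what GCC reports for
"-Q --help=params" at each level.  Below is a minimal Python sketch of
that comparison.  The helper names (parse_params, changed_params, query)
are hypothetical, and the parsing assumes each parameter line starts
with "--param=" and ends with its current value, which may not hold for
every GCC release.

```python
import subprocess


def parse_params(help_text):
    """Map each '--param=...' line to its last whitespace-separated field.

    Assumption: the value is the last field on the line; lines that do
    not start with '--param=' (headers, blanks) are ignored.
    """
    params = {}
    for line in help_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("--param="):
            params[fields[0]] = fields[-1]
    return params


def changed_params(text_a, text_b):
    """Return {name: (value_a, value_b)} for parameters whose values differ."""
    a, b = parse_params(text_a), parse_params(text_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


def query(level, gcc="gcc"):
    """Run e.g. 'gcc -O2 -Q --help=params' and return its stdout."""
    result = subprocess.run(
        [gcc, level, "-Q", "--help=params"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling changed_params(query("-O2"), query("-O3")) would then list the
parameters that differ for the installed compiler.  The result is only a
snapshot of one release, which is the problem discussed above.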
> 
> So IMO, we should have an -f option that represents “the inlining parameters
> enabled by -O3”, whatever they happen to be for a given release.  It's OK if
> the set is empty.
> 
> For such a change, it doesn't really matter whether the current --params are
> the right ones.  It just matters that the --params are the ones that we
> currently use.  If the --params are changed later, the -f option and -O3 will
> automatically stay in sync.

I am trying to understand how useful this is.  I am basically worried
about two things:
 1) we have other optimization passes that behave differently at -O2 and
    -O3 (vectorizer, unrolling etc.) and I think we may want to have
    more. We also have -Os and -O1.

    So perhaps we want a more systematic solution. We already have
    -fvect-cost-model, which is kind of the vectorizer version of the
    proposed inliner option.
 2) the inliner is already quite painful to tune, especially since
     one really needs to benchmark packages significantly bigger than
     SPEC, which tend to be a bit hard to set up and benchmark
     meaningfully. I usually do at least Firefox and clang, where the
     first is always quite some work to get working well with the latest
     GCC. With SUSE's LNT we also run "C++ benchmarks", which were
     initially collected as a kind of inliner test with higher
     abstraction penalty (tramp3d etc.).

     For many years I benchmarked primarily -O3 and -O3 + profile
     feedback on x86-64 only, with an occasional look at -O2 and -Os
     behaviour, which were generally more stable.
     I also tested other targets (power and aarch64) but just
     sporadically, which is not good.

     After GCC 5 I doubled testing to include both lto/non-lto variants.
     Since GCC 10, -O2 started to evolve and needed re-testing too
     (lto/non-lto). One metric I know I ought to tune is -O2 -flto and
     FDO, which used to be essentially -O3 before the optimization level
     --params were introduced, but now -O2 + FDO inlining is more
     conservative, which hurts, for example, profiledbootstrapped GCC.

     So naturally I am a bit worried about introducing even more
     combinations that need testing and maintenance.  If we add a
     user-friendly way to tweak this, we also make a promise to keep
     it sane.
