https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #19 from Rama Malladi ---
Thank you Hubicka@ for the inputs. I see your intent and that we have to
revisit the inline parameter tuning. As I and Richard S mentioned, the intent
of this feature request or PR is to expose such an
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #16 from Rama Malladi ---
I had posted a patch at the URL below for this feature:
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655506.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #15 from Rama Malladi ---
Thanks for the comments and for giving us some history/perspective. I agree
with this statement,
> Pushing up -O2 limits can make sense, but needs to be done carefully -
> in longer term IMO we do not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #13 from Rama Malladi ---
(In reply to Jan Hubicka from comment #12)
> If this is without LTO, can you also try the LTO numbers?
> Inliner behaves significantly different with and without LTO, since LTO
> introduces many (and often
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #11 from Rama Malladi ---
(In reply to Wilco from comment #10)
> A 1.1% overall performance gain looks good - is there a significant codesize
> hit from this? If so, are there slightly less aggressive settings that still
> get most
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #9 from Rama Malladi ---
I wanted us to review this feature implementation given GCC 15 Stage 1
development has started. Thank you.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
Rama Malladi changed:
What|Removed |Added
CC||rvmallad at amazon dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #7 from Rama Malladi ---
(In reply to Rama Malladi from comment #5)
> (In reply to Andrew Pinski from comment #3)
> > Also do you have numbers with lto enabled? Or are these without lto?
> >
> > Does LTO improve the situation for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #5 from Rama Malladi ---
(In reply to Andrew Pinski from comment #3)
> Also do you have numbers with lto enabled? Or are these without lto?
>
> Does LTO improve the situation for Envoy too?
These numbers are without lto. I haven't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #4 from Rama Malladi ---
(In reply to Andrew Pinski from comment #1)
> Maybe we should figure out why the increase of the limits help and add extra
> code to get better heuristics rather than just tweaking the limits.
>
> I know
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
Bug ID: 114531
Summary: Feature proposal for an
`-finline-functions-aggressive` compiler option
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97696
--- Comment #5 from Rama Malladi ---
Thank you, Richard, for this patch/fix.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97696
Rama Malladi changed:
What|Removed |Added
CC||rvmallad at amazon dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #23 from Rama Malladi ---
(In reply to Rama Malladi from comment #22)
> I will close this issue as we were unable to reproduce the perf drop going
> from gcc-7 to gcc-8 on a Graviton2 based instance. The performance of
> 519.lbm_r
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #22 from Rama Malladi ---
I will close this issue as we were unable to reproduce the perf drop going from
gcc-7 to gcc-8 on a Graviton2 based instance. The performance of 519.lbm_r
built with gcc-7.4 was the same as that with gcc-8.5.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #21 from Rama Malladi ---
I did another triage for the perf loss on a Graviton2 processor (neoverse-n1) based
instance and found this commit: `a9a4edf0e71bbac9f1b5dcecdcf9250111d16889` to
be the reason. As I had indicated in my earlier
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #20 from Rama Malladi ---
@Martin J and @Sebastian P, let me walk you through the perf data and my
triage.
First, my triage has been on Graviton3 (neoverse-v1) processor based
instances. Next, I was looking for perf delta going
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #19 from Rama Malladi ---
Thanks @Sebastian and @Martin J. I will run another bisect between GCC 7 and GCC 8.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #15 from Rama Malladi ---
Hi, can we review this issue and suggest next steps/actions, please? Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #14 from Rama Malladi ---
(In reply to Martin Liška from comment #13)
> Note the mentioned revision is a fix and yes, sometimes these revisions can
> end up with a regression as profile estimation is a complex guess.
Yes, possibly.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #12 from Rama Malladi ---
I found differences in the dumps at various stages of the compilation for the
mainline GCC and with update_max_bb_count() commented out. Here are the details:
Mainline: Commit ID:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #11 from Rama Malladi ---
(In reply to Martin Liška from comment #10)
> @Honza ?
Just checking if this can be fixed/implemented. Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #19 from Rama Malladi ---
(In reply to Wilco from comment #17)
> (In reply to Rama Malladi from comment #16)
> > (In reply to Wilco from comment #15)
> > > (In reply to Rama Malladi from comment #14)
> > > > This fix also improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #18 from Rama Malladi ---
(In reply to Wilco from comment #17)
> (In reply to Rama Malladi from comment #16)
> > (In reply to Wilco from comment #15)
> > > (In reply to Rama Malladi from comment #14)
> > > > This fix also improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #9 from Rama Malladi ---
(In reply to Martin Liška from comment #3)
> Can you please share perf-profile before and after the revision?
>
> Note I can't see it for Altra aarch64 CPU:
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #16 from Rama Malladi ---
(In reply to Wilco from comment #15)
> (In reply to Rama Malladi from comment #14)
> > This fix also improved performance of 538.imagick_r by 15%. Did you have a
> > similar observation? Thank you.
>
> No,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #14 from Rama Malladi ---
This fix also improved performance of 538.imagick_r by 15%. Did you have a
similar observation? Thank you.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #13 from Rama Malladi ---
(In reply to CVS Commits from comment #12)
> The master branch has been updated by Wilco Dijkstra :
>
> https://gcc.gnu.org/g:0c1b0a23f1fe7db6a2e391b7cb78cff90032
>
> commit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #11 from Rama Malladi ---
(In reply to Wilco from comment #10)
> I'm seeing about 1.5% gain on Neoverse V1 and 0.5% loss on Neoverse N1. I'll
> post a patch that allows per-CPU settings for FMA reassociation, so you'll
> get good
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #9 from Rama Malladi ---
(In reply to Rama Malladi from comment #8)
> (In reply to Wilco from comment #7)
> > The revert results in about 0.5% loss on Neoverse N1, so it looks like the
> > reassociation pass is still splitting FMAs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #8 from Rama Malladi ---
(In reply to Wilco from comment #7)
> The revert results in about 0.5% loss on Neoverse N1, so it looks like the
> reassociation pass is still splitting FMAs into separate MUL and ADD (which
> is bad for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #6 from Rama Malladi ---
The compilation options were: -Ofast -mcpu=native -flto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #5 from Rama Malladi ---
(In reply to Wilco from comment #2)
> That's interesting - if the reassociation pass has become a bit smarter in
> the last 5 years, we might no longer need this workaround. What is the
> effect on the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107433
--- Comment #2 from Rama Malladi ---
(In reply to Martin Liška from comment #1)
> As mentioned slightly here:
> https://www.spec.org/cpu2017/Docs/benchmarks/510.parest_r.html
> please use -std=c++98 or something < c++17.
Thank you. I had it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #8 from Rama Malladi ---
(In reply to Mark Wielaard from comment #7)
> The content of attachment 53773 [details] has been deleted for the following
> reason:
>
> https://sourceware.org/pipermail/overseers/2022q4/019048.html
Thank
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #6 from Rama Malladi ---
(In reply to Martin Liška from comment #5)
> Please try writing here: overse...@sourceware.org
I have asked for deletion. Thanks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107433
Bug ID: 107433
Summary: 510.parest_r, call of overloaded 'back_interpolate' is
ambiguous
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #4 from Rama Malladi ---
Hi Martin,
Thanks for the guidance. Can we delete the attachment from this bug report?
Regards,
Rama
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #3 from Rama Malladi ---
I will measure the effect of this revert on the overall SPEC FP score. I haven't
tried experimenting with fp_reassoc_width values. Will try it and update.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #1 from Rama Malladi ---
$ /home/ubuntu/gccfixissue2/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/home/ubuntu/gccfixissue2/bin/gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #1 from Rama Malladi ---
$ /home/ubuntu/gccfixissue1/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/home/ubuntu/gccfixissue1/bin/gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
Bug ID: 107413
Summary: Perf loss ~14% on 519.lbm_r SPEC cpu2017 benchmark
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
Bug ID: 107409
Summary: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
43 matches