https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116760
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104912
   Last reconfirmed|2024-09-23 00:00:00         |2024-11-25
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Re-confirmed (comparing 14.2 against trunk on Zen4 with -Ofast -flto
-march=native).
Samples: 1M of event 'cycles:Pu', Event count (approx.): 2401109021645
Overhead  Samples  Command          Shared Object                   Symbol
  12.03%   230087  gamess_peak.amd  gamess_peak.amd64-m64-gcc42-nn  [.] twotff_
  11.79%   224014  gamess_base.amd  gamess_base.amd64-m64-gcc42-nn  [.] forms_
  11.66%   222528  gamess_peak.amd  gamess_peak.amd64-m64-gcc42-nn  [.] forms_
   8.44%   160676  gamess_peak.amd  gamess_peak.amd64-m64-gcc42-nn  [.] dirfck_
   8.09%   153197  gamess_base.amd  gamess_base.amd64-m64-gcc42-nn  [.] dirfck_
   6.27%   119537  gamess_base.amd  gamess_base.amd64-m64-gcc42-nn  [.] twotff_
   5.89%   111667  gamess_base.amd  gamess_base.amd64-m64-gcc42-nn  [.] xyzint_
   5.21%    99376  gamess_peak.amd  gamess_peak.amd64-m64-gcc42-nn  [.] xyzint_
   3.02%    57506  gamess_peak.amd  gamess_peak.amd64-m64-gcc42-nn  [.] genral_
   2.36%    44702  gamess_base.amd  gamess_base.amd64-m64-gcc42-nn  [.] genral_
   1.62%    30954  gamess_peak.amd  gamess_peak.amd64-m64-gcc42-nn  [.] zqout_
   1.56%    29806  gamess_peak.amd  gamess_peak.amd64-m64-gcc42-nn  [.] twoei_.constprop.2
   1.53%    29092  gamess_base.amd  gamess_base.amd64-m64-gcc42-nn  [.] twoei_.constprop.2
   1.40%    26663  gamess_base.amd  gamess_base.amd64-m64-gcc42-nn  [.] zqout_
So the main thing is the usual suspect, the "triangular" loop:
      MKL=0
      DO 10 MK=1,NOC
      DO 10 ML=1,MK
         MKL = MKL+1
         XPQKL(MPQ,MKL) = XPQKL(MPQ,MKL) +
     *      VAL1*(CO(MS,MK)*CO(MR,ML)+CO(MS,ML)*CO(MR,MK))
         XPQKL(MRS,MKL) = XPQKL(MRS,MKL) +
     *      VAL3*(CO(MQ,MK)*CO(MP,ML)+CO(MQ,ML)*CO(MP,MK))
   10 CONTINUE
where previously I massaged costing to have the loop _not_ vectorized,
but that doesn't seem to work anymore.
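
For readers unfamiliar with the loop shape, here is a minimal Python sketch of the iteration space of the nest quoted above. It only replays the index arithmetic: the inner trip count equals MK, so it ranges from 1 up to NOC, and MKL is a running counter over the packed lower triangle. The function names, the sample NOC and the vector factor `vf` are all hypothetical, and the "leftover" count assumes the inner ML loop is the one being vectorized with a scalar remainder. It is not a claim about what GCC's cost model actually computes, nor about why the vectorized loop is slower.

```python
def packed_index(mk: int, ml: int) -> int:
    """Closed form of the running counter MKL for 1 <= ml <= mk (1-based)."""
    return mk * (mk - 1) // 2 + ml


def walk_triangle(noc: int) -> list[tuple[int, int, int]]:
    """Replay the DO 10 nest and record (MK, ML, MKL) for every iteration."""
    visited = []
    mkl = 0
    for mk in range(1, noc + 1):        # DO 10 MK=1,NOC
        for ml in range(1, mk + 1):     # DO 10 ML=1,MK  (trip count == MK)
            mkl += 1                    # MKL = MKL+1
            visited.append((mk, ml, mkl))
    return visited


def leftover_iterations(noc: int, vf: int) -> tuple[int, int]:
    """Return (iterations that do not fill a whole vector, total iterations).

    Assumes the inner loop is processed in chunks of vf with the remainder
    (MK mod vf) handled separately; vf is a hypothetical vector factor.
    """
    total = noc * (noc + 1) // 2
    leftover = sum(mk % vf for mk in range(1, noc + 1))
    return leftover, total


if __name__ == "__main__":
    noc, vf = 8, 4  # illustrative values only, not taken from gamess
    # The running counter and the closed form agree on every iteration.
    assert all(mkl == packed_index(mk, ml)
               for mk, ml, mkl in walk_triangle(noc))
    print(leftover_iterations(noc, vf))
```

Since Fortran arrays are column-major, stepping MKL in `XPQKL(MPQ,MKL)` and ML in `CO(MR,ML)` walks along the second dimension, so consecutive inner iterations touch strided rather than contiguous elements.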