https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119010
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
                 CC|                            |liuhongt at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org
           Priority|P3                          |P1
   Last reconfirmed|                            |2025-02-25
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> ;; | 47 | 1 | xmm11=0.0 nothing
> ;; | 1320 | 1 | xmm22=[`*.LC2'] nothing
> ;; | 1321 | 1 | xmm21=[`*.LC3'] nothing
> ;; | 1680 | 1 | [sp+0xf0]=r8 nothing
> ;; | 1681 | 1 | [sp+0xf8]=r12 nothing
> ;; | 1682 | 1 | [sp+0x100]=r9 nothing
> ;; | 1683 | 1 | [sp+0xc0]=ax nothing
that's *movdf_internal and *movdi_internal
-mtune=sapphirerapids does _not_ reproduce the slowdown (nor the ICE
with the patch) - for sapphirerapids we have
;; | 53 | 2 | xmm11=0.0 hsw_decodern,hsw_p015
;; | 1476 | 1 | xmm26=[`*.LC2'] hsw_decodern,hsw_p23
;; | 1835 | 1 | [sp+0x110]=r12 hsw_decodern,(hsw_p4+(hsw_p2|hsw_p3|hsw_p7))
;; | 2368 | 1 | [sp+0xd8]=ax hsw_decodern,(hsw_p4+(hsw_p2|hsw_p3|hsw_p7))
so it looks like a deficiency in the znver4 pipeline description or the insns.
I'll note znver4 also has the 'nothing's (but a much lower
max_lookahead_tries).
The znver automaton also has proper reservations here.
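
To check other tunings the same way, here is a rough sketch that lists the insns in a scheduler dump whose reservation is 'nothing'. It is not part of GCC: the function name is made up, and the row layout (";; | uid | number | pattern reservation") is assumed from the excerpts quoted in this comment. Rows whose reservation got wrapped onto the next line are skipped.

```python
import re

# Assumed row layout, taken from the dump excerpts in this comment:
#   ;; | 47 | 1 | xmm11=0.0 nothing
# The last whitespace-separated token is treated as the reservation.
ROW = re.compile(r"^;;\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(.+?)\s+(\S+)\s*$")

def insns_without_reservation(dump_text):
    """Return (uid, pattern) for every row whose reservation is 'nothing'.

    Hypothetical helper for illustration only.
    """
    hits = []
    for line in dump_text.splitlines():
        m = ROW.match(line.strip())
        if m and m.group(4) == "nothing":
            hits.append((int(m.group(1)), m.group(3)))
    return hits
```

Feeding it the text of the scheduler dump for each -mtune setting and comparing the resulting lists should show which move patterns lack a reservation in one pipeline description but not in another.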