https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600
--- Comment #4 from Martin Jambor <jamborm at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #2)
> A patch is posted at
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640276.html
>
> Would you give a try to see if it fixes the regression, I don't currently
> have a znver4 machine for testing.

Unfortunately it does not.

(In reply to Richard Biener from comment #3)
> I think we need to figure out what exactly gets slower (and hope it's not
> scattered all over the place)

I have collected some profiles:

r14-5602-ge6269bb69c0734

# Samples: 516K of event 'cycles:u'
# Event count (approx.): 468008188417
#
# Overhead       Samples  Command          Shared Object                          Symbol
# ........  ............  ...............  .....................................  .................................................
#
    13.55%         69886  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] mc_chroma
    11.05%         57017  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_satd_16x16
     9.24%         47693  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_satd_8x8
     8.67%         44733  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] get_ref
     4.84%         24984  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] sub16x16_dct
     4.16%         21484  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_me_search_ref
     3.30%         17033  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_hadamard_ac_16x16
     2.28%         11770  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_satd_4x4
     2.10%         10824  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] quant_trellis_cabac
     2.07%         10694  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] hpel_filter
     2.05%         10616  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] sub8x8_dct
     1.86%          9593  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] refine_subpel
     1.70%          8788  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] quant_4x4
     1.57%          8077  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_sad_16x16
     1.16%          6324  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] frame_init_lowres_core
     1.14%          5867  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_sa8d_8x8
     1.11%          5738  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_cabac_encode_decision_c
     1.08%          5736  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_var_16x16

r14-5603-g2b59e2b4dff421

# Samples: 550K of event 'cycles:u'
# Event count (approx.): 498834737657
#
# Overhead       Samples  Command          Shared Object                          Symbol
# ........  ............  ...............  .....................................  .................................................
#
    18.21%        100151  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_satd_16x16
    12.37%         68006  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] mc_chroma
     8.51%         46815  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_satd_8x8
     7.56%         41560  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] get_ref
     4.53%         24901  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] sub16x16_dct
     3.92%         21561  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_me_search_ref
     3.08%         16963  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_hadamard_ac_16x16
     2.41%         13239  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_satd_4x4
     1.99%         10931  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] quant_trellis_cabac
     1.96%         10801  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] hpel_filter
     1.95%         10764  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] sub8x8_dct
     1.56%          8587  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] quant_4x4
     1.49%          8166  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] refine_subpel
     1.48%          8124  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_sad_16x16
     1.09%          6328  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] frame_init_lowres_core
     1.07%          5901  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_pixel_sa8d_8x8
     1.04%          5703  x264_r_peak.min  x264_r_peak.mine-pgo-Ofast-native-m64  [.] x264_cabac_encode_decision_c
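
One way to read the two profiles side by side is to diff the per-symbol sample counts. The following is a minimal Python sketch of that arithmetic, with the counts copied from the two reports above. The helper name `sample_deltas` is only for illustration, and sample counts from separate runs are noisy, so small differences should not be over-interpreted.

```python
# Per-symbol sample counts copied from the two perf reports above.
before = {  # r14-5602-ge6269bb69c0734
    "mc_chroma": 69886,
    "x264_pixel_satd_16x16": 57017,
    "x264_pixel_satd_8x8": 47693,
    "get_ref": 44733,
    "sub16x16_dct": 24984,
    "x264_me_search_ref": 21484,
    "x264_pixel_hadamard_ac_16x16": 17033,
    "x264_pixel_satd_4x4": 11770,
    "quant_trellis_cabac": 10824,
    "hpel_filter": 10694,
    "sub8x8_dct": 10616,
    "refine_subpel": 9593,
    "quant_4x4": 8788,
    "x264_pixel_sad_16x16": 8077,
    "frame_init_lowres_core": 6324,
    "x264_pixel_sa8d_8x8": 5867,
    "x264_cabac_encode_decision_c": 5738,
    "x264_pixel_var_16x16": 5736,
}

after = {  # r14-5603-g2b59e2b4dff421
    "x264_pixel_satd_16x16": 100151,
    "mc_chroma": 68006,
    "x264_pixel_satd_8x8": 46815,
    "get_ref": 41560,
    "sub16x16_dct": 24901,
    "x264_me_search_ref": 21561,
    "x264_pixel_hadamard_ac_16x16": 16963,
    "x264_pixel_satd_4x4": 13239,
    "quant_trellis_cabac": 10931,
    "hpel_filter": 10801,
    "sub8x8_dct": 10764,
    "quant_4x4": 8587,
    "refine_subpel": 8166,
    "x264_pixel_sad_16x16": 8124,
    "frame_init_lowres_core": 6328,
    "x264_pixel_sa8d_8x8": 5901,
    "x264_cabac_encode_decision_c": 5703,
}


def sample_deltas(before, after):
    """Return (symbol, after - before) pairs for symbols listed in both
    profiles, sorted with the largest increase first."""
    common = before.keys() & after.keys()
    deltas = [(sym, after[sym] - before[sym]) for sym in common]
    return sorted(deltas, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for sym, delta in sample_deltas(before, after):
        print(f"{delta:+8d}  {sym}")
```

On these numbers, x264_pixel_satd_16x16 gains 43134 samples (57017 to 100151), while every other symbol listed in both profiles moves by a few thousand samples at most.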