https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96337
--- Comment #7 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
X265

GCC 9:
y4m [info]: 1920x1080 fps 30/1 i420p8 frames 0 - 599 of 600
raw [info]: output file: /dev/null
x265 [info]: HEVC encoder version 3.1.2+1-76650bab70f9
x265 [info]: build info [Linux][GCC 9.3.1][64 bit][noasm] 8bit
x265 [info]: using cpu capabilities: none!
x265 [info]: Main profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 2 / wpp(17 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias: 25 / 250 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree : 2 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip signhide tmvp b-intra
x265 [info]: tools: strong-intra-smoothing lslices=6 deblock sao
x265 [info]: frame I: 3, Avg QP:27.57 kb/s: 14018.64
x265 [info]: frame P: 146, Avg QP:28.84 kb/s: 4313.98
x265 [info]: frame B: 451, Avg QP:35.29 kb/s: 204.06
x265 [info]: Weighted P-Frames: Y:0.0% UV:0.0%
x265 [info]: consecutive B-frames: 0.7% 0.0% 0.0% 94.6% 4.7%

encoded 600 frames in 279.98s (2.14 fps), 1273.22 kb/s, Avg QP:33.68
1056.04user 1.31system 4:40.01elapsed 377%CPU (0avgtext+0avgdata 432688maxresident)k
0inputs+0outputs (0major+102385minor)pagefaults 0swaps

GCC 10:
y4m [info]: 1920x1080 fps 30/1 i420p8 frames 0 - 599 of 600
raw [info]: output file: /dev/null
x265 [info]: HEVC encoder version 3.1.2+1-76650bab70f9
x265 [info]: build info [Linux][GCC 10.1.1][64 bit][noasm] 8bit
x265 [info]: using cpu capabilities: none!
x265 [info]: Main profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 2 / wpp(17 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias: 25 / 250 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree : 2 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip signhide tmvp b-intra
x265 [info]: tools: strong-intra-smoothing lslices=6 deblock sao
x265 [info]: frame I: 3, Avg QP:27.57 kb/s: 14018.64
x265 [info]: frame P: 146, Avg QP:28.84 kb/s: 4313.98
x265 [info]: frame B: 451, Avg QP:35.29 kb/s: 204.06
x265 [info]: Weighted P-Frames: Y:0.0% UV:0.0%
x265 [info]: consecutive B-frames: 0.7% 0.0% 0.0% 94.6% 4.7%

encoded 600 frames in 292.63s (2.05 fps), 1273.22 kb/s, Avg QP:33.68
1079.80user 1.76system 4:52.65elapsed 369%CPU (0avgtext+0avgdata 427464maxresident)k
0inputs+0outputs (0major+73644minor)pagefaults 0swaps

So the difference is about 5% rather than 50%. This is a codebase that I would
build with -O3.

Looking at the perf reports, there is a difference in inlining.

GCC 9:
   8.74%  x265  libx265.so.176  [.] (anonymous namespace)::satd_8x4
   5.67%  x265  libx265.so.176  [.] (anonymous namespace)::filterVertical_sp_c<8>
   4.44%  x265  libx265.so.176  [.] (anonymous namespace)::pixelavg_pp<8, 8>
   4.11%  x265  libx265.so.176  [.] (anonymous namespace)::psyCost_pp<3>
   3.81%  x265  libx265.so.176  [.] (anonymous namespace)::interp_horiz_ps_c<8, 64, 64>
   3.33%  x265  libx265.so.176  [.] (anonymous namespace)::sad<8, 8>
   3.29%  x265  libx265.so.176  [.] partialButterfly32

GCC 10:
   9.17%  x265  libx265.so.176  [.] (anonymous namespace)::_sa8d_8x8
   8.70%  x265  libx265.so.176  [.] (anonymous namespace)::satd_8x4
   5.80%  x265  libx265.so.176  [.] (anonymous namespace)::pixelavg_pp<8, 8>
   5.55%  x265  libx265.so.176  [.] (anonymous namespace)::filterVertical_sp_c<8>
   3.90%  x265  libx265.so.176  [.] (anonymous namespace)::sad<8, 8>
   3.71%  x265  libx265.so.176  [.] (anonymous namespace)::interp_horiz_ps_c<8, 64, 64>
   3.48%  x265  libx265.so.176  [.] (anonymous namespace)::sad_x4<8, 8>

I build with:

cmake ../source/ -DCMAKE_CXX_FLAGS=-O2 -DCMAKE_CXX_FLAGS_RELEASE=-DNDEBUG -DCMAKE_CXX_COMPILER=g++-9

I think Phoronix may be missing the release flag override, so he may be testing
an -O3 build.

GCC 9 inlines _sa8d_8x8 while GCC 10 does not. The inliner estimates the
function at 159 insns, so this is indeed caused by the change to
--param max-inline-insns-single, which dropped the limit from 200 to 70 for
-O2. The default of 200 did not make much sense for -O2, since inline is abused
by C++ codebases (this was the main point of the retuning).
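
For reference, the "about 5%" figure can be reproduced from the timings in the two logs above. The following is only a small sketch of that arithmetic; the helper name slowdown_percent is made up for illustration and is not part of x265, GCC, or perf.

```python
def slowdown_percent(baseline: float, candidate: float) -> float:
    """Return how much larger `candidate` is than `baseline`, in percent.

    Hypothetical helper, written only to illustrate the arithmetic.
    """
    return (candidate / baseline - 1.0) * 100.0


# Encode times reported by x265 ("encoded 600 frames in ...s").
gcc9_encode_s = 279.98
gcc10_encode_s = 292.63

# User CPU times reported by time(1) ("...user").
gcc9_user_s = 1056.04
gcc10_user_s = 1079.80

if __name__ == "__main__":
    wall = slowdown_percent(gcc9_encode_s, gcc10_encode_s)
    user = slowdown_percent(gcc9_user_s, gcc10_user_s)
    print(f"encode time: GCC 10 is {wall:.1f}% slower than GCC 9")
    print(f"user time:   GCC 10 is {user:.1f}% slower than GCC 9")
```

The encode-time ratio works out to roughly 4.5%, which is the number rounded to 5% above; the user CPU time differs by a little over 2%.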