Hi!

On Tue, Jun 17, 2025 at 10:03:28PM +0800, Cui, Lili wrote:
> Collected spec2017 performance on ZNVER5, EMR and ICELAKE. No performance
> regression was observed.
> For O2 multi-copy :
> 511.povray_r improved by 2.8% on ZNVER5.
> 511.povray_r improved by 4.2% on EMR

No huge improvement, but none was expected anyway, x86 is a target with
only few registers.

> Tested against SPEC CPU 2017, this change always has a net-positive
> effect on the dynamic instruction count. See the following table for
> the breakdown on how this reduces the number of dynamic instructions
> per workload on a like-for-like (with/without this commit):
>
> instruction count   base          with commit   (commit-base)/commit
> 502.gcc_r           98666845943   96891561634   -1.80%
> 526.blender_r       6.21226E+11   6.12992E+11   -1.33%
> 520.omnetpp_r       1.1241E+11    1.11093E+11   -1.17%
> 500.perlbench_r     1271558717    1263268350    -0.65%
> 523.xalancbmk_r     2.20103E+11   2.18836E+11   -0.58%
> 531.deepsjeng_r     2.73591E+11   2.72114E+11   -0.54%
> 500.perlbench_r     64195557393   63881512409   -0.49%
> 541.leela_r         2.99097E+11   2.98245E+11   -0.29%
> 548.exchange2_r     1.27976E+11   1.27784E+11   -0.15%
> 527.cam4_r          88981458425   88887334679   -0.11%
> 554.roms_r          2.60072E+11   2.59809E+11   -0.10%

The spec tests most representative for real-life code are perl and gcc,
so those are nice results :-) This is code size, dynamic or static does
not matter much here. Nice to see it improve anyway :-)

> +  /* Don't mess with the following registers. */
> +  if (frame_pointer_needed)
> +    bitmap_clear_bit (components, HARD_FRAME_POINTER_REGNUM);

What is that about? Isn't that one of the bigger possible wins?

Anyway, nice to see SWS finally used for x86 as well!


Segher