Hi!

On Tue, Jun 17, 2025 at 10:03:28PM +0800, Cui, Lili wrote:
> Collected spec2017 performance on ZNVER5, EMR and ICELAKE. No performance 
> regression was observed.
> For O2 multi-copy :
> 511.povray_r improved by 2.8% on ZNVER5.
> 511.povray_r improved by 4.2% on EMR

No huge improvement, but none was expected anyway: x86 is a target with
only a few registers.

> Tested against SPEC CPU 2017, this change always has a net-positive
> effect on the dynamic instruction count.  See the following table for
> the breakdown on how this reduces the number of dynamic instructions
> per workload on a like-for-like (with/without this commit):
> 
> instruction count       base            with commit     (commit-base)/commit
> 502.gcc_r               98666845943     96891561634     -1.80%
> 526.blender_r           6.21226E+11     6.12992E+11     -1.33%
> 520.omnetpp_r           1.1241E+11      1.11093E+11     -1.17%
> 500.perlbench_r         1271558717      1263268350      -0.65%
> 523.xalancbmk_r         2.20103E+11     2.18836E+11     -0.58%
> 531.deepsjeng_r         2.73591E+11     2.72114E+11     -0.54%
> 500.perlbench_r         64195557393     63881512409     -0.49%
> 541.leela_r             2.99097E+11     2.98245E+11     -0.29%
> 548.exchange2_r         1.27976E+11     1.27784E+11     -0.15%
> 527.cam4_r              88981458425     88887334679     -0.11%
> 554.roms_r              2.60072E+11     2.59809E+11     -0.10%

The SPEC tests most representative of real-life code are perl and gcc,
so those are nice results :-)

This is code size; dynamic or static does not matter much here.  Nice to
see it improve anyway :-)

> +  /* Don't mess with the following registers.  */
> +  if (frame_pointer_needed)
> +    bitmap_clear_bit (components, HARD_FRAME_POINTER_REGNUM);

What is that about?  Isn't that one of the bigger possible wins?

Anyway, nice to see SWS finally used for x86 as well!


Segher
