On 5/13/24 11:49, Vineet Gupta wrote:
> 500.perlbench_r-0 | 1,214,534,029,025 | 1,212,887,959,387 |
> 500.perlbench_r-1 | 740,383,419,739 | 739,280,308,163 |
> 500.perlbench_r-2 | 692,074,638,817 | 691,118,734,547 |
> 502.gcc_r-0 | 190,820,141,435 | 190,857,065,988 | <- -0.02%
> 502.gcc_r-1 | 225,747,660,839 | 225,809,444,357 | <- -0.03%
> 502.gcc_r-2 | 220,370,089,641 | 220,406,367,876 | <- -0.02%
> 502.gcc_r-3 | 179,111,460,458 | 179,135,609,723 | <- -0.01%
> 502.gcc_r-4 | 219,301,546,340 | 219,320,416,956 | <- -0.01%
> 503.bwaves_r-0 | 278,733,324,691 | 278,733,323,575 |
> 503.bwaves_r-1 | 442,397,521,282 | 442,397,519,616 |
> 503.bwaves_r-2 | 344,112,218,206 | 344,112,216,760 |
> 503.bwaves_r-3 | 417,561,469,153 | 417,561,467,597 |
> 505.mcf_r | 669,319,257,525 | 669,318,763,084 |
> 507.cactuBSSN_r | 2,852,767,394,456 | 2,564,736,063,742 | <+ 10.10%
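For anyone wanting to re-derive the percentages from the two count columns, here is a minimal Python sketch. It assumes the first column is the baseline and the second is the patched build, and that a positive value means fewer dynamic instructions; the function name pct_saved is made up for illustration and is not part of any tooling mentioned here.

```python
def pct_saved(before: int, after: int) -> float:
    """Percent of dynamic instructions saved; negative means a regression."""
    return (before - after) / before * 100.0


# Counts copied from the quoted table: (first column, second column).
# Assumption: first column = baseline, second column = patched build.
counts = {
    "502.gcc_r-0": (190_820_141_435, 190_857_065_988),
    "507.cactuBSSN_r": (2_852_767_394_456, 2_564_736_063_742),
}

for name, (before, after) in counts.items():
    # Two decimal places, with an explicit sign, as in the table annotations.
    print(f"{name}: {pct_saved(before, after):+.2f}%")
```

With these inputs the cactuBSSN row comes out at the 10.10% shown in the table, and the gcc rows land in the hundredths-of-a-percent range.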
The small gcc regression seems like a tooling issue of some sort.
Looking at the topblocks, the insn sequences are exactly the same; only
the counts differ, and it's not obvious why.
Here's the comparison for gcc_r-1.
> Block 0 @ 0x170ca, 12 insns, 87854493 times, 0.47%:
00000000000170ca <find_base_term>:
170ca: 7179 add sp,sp,-48
170cc: ec26 sd s1,24(sp)
170ce: e84a sd s2,16(sp)
170d0: e44e sd s3,8(sp)
170d2: f406 sd ra,40(sp)
170d4: f022 sd s0,32(sp)
170d6: 84aa mv s1,a0
170d8: 03200913 li s2,50
170dc: 03d00993 li s3,61
170e0: 8526 mv a0,s1
170e2: 001cd097 auipc ra,0x1cd
170e6: bac080e7 jalr -1108(ra) # 1e3c8e <ix86_delegitimize_address.lto_priv.0>
> Block 1 @ 0x706d0a, 3 insns, 274713936 times, 0.37%:
> Block 2 @ 0x1e3c8e, 9 insns, 88507109 times, 0.35%:
...
< Block 0 @ 0x170ca, 12 insns, 87869602 times, 0.47%:
< Block 1 @ 0x706d42, 3 insns, 274608893 times, 0.36%:
< Block 2 @ 0x1e3c94, 9 insns, 88526354 times, 0.35%:
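To put numbers on how far the counts drift, the two sets of block lines can be compared mechanically. This is a rough Python sketch, not part of the bbv tooling: the regex and the helper names (parse_blocks, count_deltas) are hypothetical, and it assumes blocks can be paired by their rank in the topblocks output, since the addresses of blocks 1 and 2 differ between the two binaries.

```python
import re

# Matches lines such as "Block 0 @ 0x170ca, 12 insns, 87854493 times, 0.47%:"
BLOCK_RE = re.compile(r"Block (\d+) @ 0x[0-9a-f]+, (\d+) insns, (\d+) times")


def parse_blocks(text):
    """Map block rank -> (insns per execution, execution count)."""
    return {
        int(rank): (int(insns), int(times))
        for rank, insns, times in BLOCK_RE.findall(text)
    }


def count_deltas(side_a, side_b):
    """Per block rank: (change in executions, change in dynamic insns)."""
    a, b = parse_blocks(side_a), parse_blocks(side_b)
    deltas = {}
    for rank in sorted(a.keys() & b.keys()):
        insns, times_a = a[rank]
        _, times_b = b[rank]
        diff = times_b - times_a
        deltas[rank] = (diff, diff * insns)
    return deltas


# The ">" and "<" lines from the comparison above, markers stripped.
side_gt = """
Block 0 @ 0x170ca, 12 insns, 87854493 times, 0.47%:
Block 1 @ 0x706d0a, 3 insns, 274713936 times, 0.37%:
Block 2 @ 0x1e3c8e, 9 insns, 88507109 times, 0.35%:
"""
side_lt = """
Block 0 @ 0x170ca, 12 insns, 87869602 times, 0.47%:
Block 1 @ 0x706d42, 3 insns, 274608893 times, 0.36%:
Block 2 @ 0x1e3c94, 9 insns, 88526354 times, 0.35%:
"""

for rank, (d_times, d_insns) in count_deltas(side_gt, side_lt).items():
    print(f"Block {rank}: {d_times:+d} executions, {d_insns:+d} insns")
```

Pairing by rank only works while the ordering of the top blocks is the same on both sides, as it is for the three blocks shown; a more robust comparison would key on symbol plus offset.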
FWIW, Greg has been looking at some of this internally and found some
issues in the bbv tooling, but I wish all of this was shared/upstream
(QEMU bbv plugin) so people could compare notes and not discover/fix the
same issues over and over again.
Thx,
-Vineet