On Mon, 13 May 2024 16:08:21 PDT (-0700), Vineet Gupta wrote:
> On 5/13/24 15:47, Jeff Law wrote:
>> On 5/13/24 11:49, Vineet Gupta wrote:
>>> 500.perlbench_r-0 | 1,214,534,029,025 | 1,212,887,959,387 |
>>> 500.perlbench_r-1 |   740,383,419,739 |   739,280,308,163 |
>>> 500.perlbench_r-2 |   692,074,638,817 |   691,118,734,547 |
>>> 502.gcc_r-0       |   190,820,141,435 |   190,857,065,988 |
>>> 502.gcc_r-1       |   225,747,660,839 |   225,809,444,357 | <- -0.02%
>>> 502.gcc_r-2       |   220,370,089,641 |   220,406,367,876 | <- -0.03%
>>> 502.gcc_r-3       |   179,111,460,458 |   179,135,609,723 | <- -0.02%
>>> 502.gcc_r-4       |   219,301,546,340 |   219,320,416,956 | <- -0.01%
>>> 503.bwaves_r-0    |   278,733,324,691 |   278,733,323,575 | <- -0.01%
>>> 503.bwaves_r-1    |   442,397,521,282 |   442,397,519,616 |
>>> 503.bwaves_r-2    |   344,112,218,206 |   344,112,216,760 |
>>> 503.bwaves_r-3    |   417,561,469,153 |   417,561,467,597 |
>>> 505.mcf_r         |   669,319,257,525 |   669,318,763,084 |
>>> 507.cactuBSSN_r   | 2,852,767,394,456 | 2,564,736,063,742 | <+ 10.10%
>>>
>>> The small gcc regression seems like a tooling issue of some sort.
>>> Looking at the topblocks, the insn sequences are exactly the same,
>>> only the counts differ and its not obvious why. Here's for gcc_r-1.
>>>
>>> > Block 0 @ 0x170ca, 12 insns, 87854493 times, 0.47%:
>>>
>>> 00000000000170ca <find_base_term>:
>>>    170ca:  7179      add   sp,sp,-48
>>>    170cc:  ec26      sd    s1,24(sp)
>>>    170ce:  e84a      sd    s2,16(sp)
>>>    170d0:  e44e      sd    s3,8(sp)
>>>    170d2:  f406      sd    ra,40(sp)
>>>    170d4:  f022      sd    s0,32(sp)
>>>    170d6:  84aa      mv    s1,a0
>>>    170d8:  03200913  li    s2,50
>>>    170dc:  03d00993  li    s3,61
>>>    170e0:  8526      mv    a0,s1
>>>    170e2:  001cd097  auipc ra,0x1cd
>>>    170e6:  bac080e7  jalr  -1108(ra) # 1e3c8e <ix86_delegitimize_address.lto_priv.0>
>>>
>>> > Block 1 @ 0x706d0a, 3 insns, 274713936 times, 0.37%:
>>> > Block 2 @ 0x1e3c8e, 9 insns, 88507109 times, 0.35%:
>>> ...
>>> < Block 0 @ 0x170ca, 12 insns, 87869602 times, 0.47%:
>>> < Block 1 @ 0x706d42, 3 insns, 274608893 times, 0.36%:
>>> < Block 2 @ 0x1e3c94, 9 insns, 88526354 times, 0.35%:
>>>
>>> FWIW, Greg internally has been looking at some of this and found some
>>> issues in the bbv tooling, but I wish all of this was shared/upstream
>>> (QEMU bbv plugin) for people to compare notes and not discover/fix the
>>> same issues over and again.
>>
>> Yea, we all meant to coordinate on those plugins. The one we've got had
>> some problems with hash collisions and when there's a hash collision it
>> just produces total junk data. I chased a few of these down and fixed
>> them about a year ago.
>>
>> The other thing is qemu will split up blocks based on its internal
>> notion of a translation page. So if you're looking at block level data
>> you'll stumble over that as well. This aspect is the most troublesome
>> problem I'm aware of right now.
>
> And these two are exactly what Greg fixed, among others :-)

IIRC the plan was for Jeff to send his version to the QEMU lists so we can talk about it over there. Do you want us to just send Greg's version instead? It's all based on the same original patch from the QEMU lists, just with a possibly different set of fixes.

> -Vineet
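
For anyone comparing notes on the hash-collision problem mentioned in the quoted text, here is a minimal sketch of why a collision turns block counts into junk. It is not the code of any of the plugins discussed. `weak_key` is a deliberately lossy, hypothetical hash, and the trace is made-up data. The point is only that keying on the full (start address, insn count) pair keeps distinct blocks apart, while keying on a lossy hash silently merges them.

```python
from collections import Counter


def weak_key(pc: int) -> int:
    """Hypothetical lossy key: keeps only the low 16 bits of the block address."""
    return pc & 0xFFFF


def count_by_hash(trace):
    """Count executions keyed by the lossy hash; colliding blocks share one slot."""
    counts = Counter()
    for pc, _n_insns in trace:
        counts[weak_key(pc)] += 1
    return counts


def count_by_identity(trace):
    """Count executions keyed by the full (start address, insn count) pair."""
    counts = Counter()
    for pc, n_insns in trace:
        counts[(pc, n_insns)] += 1
    return counts


# Made-up trace: two distinct blocks whose start addresses share the low 16 bits.
trace = [(0x170CA, 12)] * 3 + [(0x2170CA, 9)] * 2

hashed = count_by_hash(trace)
exact = count_by_identity(trace)

print("hashed:", dict(hashed))  # both blocks land in one bucket
print("exact: ", dict(exact))   # the two blocks stay separate
```

With the hashed keys the two blocks are reported as one block executed five times, which is the kind of output that cannot be mapped back to real code. A real tool would presumably also have to deal with the translation-page splitting described above, which this sketch does not attempt.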