On 5/13/24 15:47, Jeff Law wrote:
>> On 5/13/24 11:49, Vineet Gupta wrote:
>>>   500.perlbench_r-0 |  1,214,534,029,025 | 1,212,887,959,387 |
>>>   500.perlbench_r-1 |    740,383,419,739 |   739,280,308,163 |
>>>   500.perlbench_r-2 |    692,074,638,817 |   691,118,734,547 |
>>>   502.gcc_r-0       |    190,820,141,435 |   190,857,065,988 | <- -0.02%
>>>   502.gcc_r-1       |    225,747,660,839 |   225,809,444,357 | <- -0.03%
>>>   502.gcc_r-2       |    220,370,089,641 |   220,406,367,876 | <- -0.02%
>>>   502.gcc_r-3       |    179,111,460,458 |   179,135,609,723 | <- -0.01%
>>>   502.gcc_r-4       |    219,301,546,340 |   219,320,416,956 | <- -0.01%
>>>   503.bwaves_r-0    |    278,733,324,691 |   278,733,323,575 |
>>>   503.bwaves_r-1    |    442,397,521,282 |   442,397,519,616 |
>>>   503.bwaves_r-2    |    344,112,218,206 |   344,112,216,760 |
>>>   503.bwaves_r-3    |    417,561,469,153 |   417,561,467,597 |
>>>   505.mcf_r         |    669,319,257,525 |   669,318,763,084 |
>>>   507.cactuBSSN_r   |  2,852,767,394,456 | 2,564,736,063,742 | <+ 10.10%
>> The small gcc regression seems like a tooling issue of some sort.
>> Looking at the top blocks, the insn sequences are exactly the same, only
>> the counts differ and it's not obvious why.
>> Here's the comparison for gcc_r-1.
>>
>>
>>      > Block 0 @ 0x170ca, 12 insns, 87854493 times, 0.47%:
>>
>>      00000000000170ca <find_base_term>:
>>         170ca:    7179                    add    sp,sp,-48
>>         170cc:    ec26                    sd    s1,24(sp)
>>         170ce:    e84a                    sd    s2,16(sp)
>>         170d0:    e44e                    sd    s3,8(sp)
>>         170d2:    f406                    sd    ra,40(sp)
>>         170d4:    f022                    sd    s0,32(sp)
>>         170d6:    84aa                    mv    s1,a0
>>         170d8:    03200913              li    s2,50
>>         170dc:    03d00993              li    s3,61
>>         170e0:    8526                    mv    a0,s1
>>         170e2:    001cd097              auipc    ra,0x1cd
>>         170e6:    bac080e7              jalr    -1108(ra) # 1e3c8e
>>      <ix86_delegitimize_address.lto_priv.0>
>>
>>      > Block 1 @ 0x706d0a, 3 insns, 274713936 times, 0.37%:
>>      > Block 2 @ 0x1e3c8e, 9 insns, 88507109 times, 0.35%:
>>      ...
>>
>>      < Block 0 @ 0x170ca, 12 insns, 87869602 times, 0.47%:
>>      < Block 1 @ 0x706d42, 3 insns, 274608893 times, 0.36%:
>>      < Block 2 @ 0x1e3c94, 9 insns, 88526354 times, 0.35%:
>>
>>
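
For anyone wanting to redo this comparison, here is a minimal sketch that
parses the block header lines quoted above and reports how the execution
counts moved between the two runs. The names parse_blocks, compare_runs,
RUN_A and RUN_B are made up for illustration, and the line format is
assumed to be exactly what is shown in the quote. Blocks are matched by
rank rather than address, since the two binaries place blocks 1 and 2 at
different addresses.

```python
import re

# Matches header lines such as:
#   Block 0 @ 0x170ca, 12 insns, 87854493 times, 0.47%:
BLOCK_RE = re.compile(
    r"Block\s+(\d+)\s+@\s+0x([0-9a-fA-F]+),\s+(\d+)\s+insns,\s+(\d+)\s+times"
)

# Header lines copied from the quoted comparison ('>' run and '<' run).
RUN_A = """
> Block 0 @ 0x170ca, 12 insns, 87854493 times, 0.47%:
> Block 1 @ 0x706d0a, 3 insns, 274713936 times, 0.37%:
> Block 2 @ 0x1e3c8e, 9 insns, 88507109 times, 0.35%:
"""
RUN_B = """
< Block 0 @ 0x170ca, 12 insns, 87869602 times, 0.47%:
< Block 1 @ 0x706d42, 3 insns, 274608893 times, 0.36%:
< Block 2 @ 0x1e3c94, 9 insns, 88526354 times, 0.35%:
"""


def parse_blocks(text):
    """Return {rank: (address, insns, times)} for each block header found."""
    blocks = {}
    for m in BLOCK_RE.finditer(text):
        rank, addr, insns, times = m.groups()
        blocks[int(rank)] = (int(addr, 16), int(insns), int(times))
    return blocks


def compare_runs(a, b):
    """Return [(rank, count_delta, insn_delta)] for ranks present in both runs.

    count_delta is the change in execution count (b minus a); insn_delta is
    the change in dynamic instructions contributed by that block.
    """
    rows = []
    for rank in sorted(a.keys() & b.keys()):
        _, insns_a, times_a = a[rank]
        _, insns_b, times_b = b[rank]
        rows.append(
            (rank, times_b - times_a, insns_b * times_b - insns_a * times_a)
        )
    return rows


if __name__ == "__main__":
    for rank, d_count, d_insns in compare_runs(
        parse_blocks(RUN_A), parse_blocks(RUN_B)
    ):
        print(f"Block {rank}: count {d_count:+d}, dynamic insns {d_insns:+d}")
```

Matching by rank only works while the top-block ordering is the same in
both runs; a more careful comparison would match on the containing symbol.
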
>> FWIW, Greg internally has been looking at some of this and found some
>> issues in the bbv tooling, but I wish all of this was shared/upstream
>> (QEMU bbv plugin) for people to compare notes and not discover/fix the
>> same issues over and over again.
> Yea, we all meant to coordinate on those plugins.  The one we've got had 
> some problems with hash collisions and when there's a hash collision it 
> just produces total junk data.  I chased a few of these down and fixed 
> them about a year ago.
>
> The other thing is qemu will split up blocks based on its internal 
> notion of a translation page.   So if you're looking at block level data 
> you'll stumble over that as well.  This aspect is the most troublesome 
> problem I'm aware of right now.

And these two are exactly what Greg fixed, among others :-)
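
To illustrate the translation-page issue, below is a rough post-processing
sketch, not the actual plugin change. It assumes each block record carries
a start address and a byte size (the output quoted earlier only shows
instruction counts), assumes a 4 KiB translation page, and uses a made-up
helper name, merge_page_splits. Two records are re-joined when the first
ends exactly on a page boundary, the second starts there, and both ran the
same number of times.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB translation page


def merge_page_splits(blocks, page_size=PAGE_SIZE):
    """Re-join blocks that look like one block split at a page boundary.

    blocks: iterable of (start_addr, size_bytes, insns, count) tuples.
    Returns a new list sorted by start address.
    """
    merged = []
    for start, size, insns, count in sorted(blocks):
        if merged:
            p_start, p_size, p_insns, p_count = merged[-1]
            contiguous = p_start + p_size == start
            on_boundary = start % page_size == 0
            if contiguous and on_boundary and p_count == count:
                # Treat the pair as a single block spanning the boundary.
                merged[-1] = (p_start, p_size + size, p_insns + insns, count)
                continue
        merged.append((start, size, insns, count))
    return merged


if __name__ == "__main__":
    sample = [
        (0x1FF8, 8, 3, 100),  # ends at 0x2000, a page boundary
        (0x2000, 6, 2, 100),  # starts at the boundary, same count
        (0x3000, 4, 1, 5),    # unrelated block
    ]
    for start, size, insns, count in merge_page_splits(sample):
        print(f"0x{start:x}: {size} bytes, {insns} insns, {count} times")
```

The equal-count test is only a heuristic: a block that genuinely ends in a
call at the end of a page and always returns would be merged by mistake, so
a real fix needs to know whether the first half ended in a branch.
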

-Vineet
