106265])

Jeff Law Mon, 13 May 2024 15:47:58 -0700



On 5/13/24 3:13 PM, Vineet Gupta wrote:

On 5/13/24 11:49, Vineet Gupta wrote:

  500.perlbench_r-0 |  1,214,534,029,025 | 1,212,887,959,387 |
  500.perlbench_r-1 |    740,383,419,739 |   739,280,308,163 |
  500.perlbench_r-2 |    692,074,638,817 |   691,118,734,547 |
  502.gcc_r-0       |    190,820,141,435 |   190,857,065,988 | <- -0.02%
  502.gcc_r-1       |    225,747,660,839 |   225,809,444,357 | <- -0.03%
  502.gcc_r-2       |    220,370,089,641 |   220,406,367,876 | <- -0.02%
  502.gcc_r-3       |    179,111,460,458 |   179,135,609,723 | <- -0.01%
  502.gcc_r-4       |    219,301,546,340 |   219,320,416,956 | <- -0.01%
  503.bwaves_r-0    |    278,733,324,691 |   278,733,323,575 |
  503.bwaves_r-1    |    442,397,521,282 |   442,397,519,616 |
  503.bwaves_r-2    |    344,112,218,206 |   344,112,216,760 |
  503.bwaves_r-3    |    417,561,469,153 |   417,561,467,597 |
  505.mcf_r         |    669,319,257,525 |   669,318,763,084 |
  507.cactuBSSN_r   |  2,852,767,394,456 | 2,564,736,063,742 | <+ 10.10%
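
The percentage annotations can be rechecked from the raw counts. The following is a minimal sketch, assuming the first column is the baseline and the second column is the patched build (the table does not label them); `pct_improvement` is a name made up for this illustration.

```python
# Instruction counts copied from a few rows of the table above.
# Assumption: (first column, second column) == (baseline, patched).
counts = {
    "502.gcc_r-0": (190_820_141_435, 190_857_065_988),
    "502.gcc_r-1": (225_747_660_839, 225_809_444_357),
    "503.bwaves_r-0": (278_733_324_691, 278_733_323_575),
    "507.cactuBSSN_r": (2_852_767_394_456, 2_564_736_063_742),
}


def pct_improvement(before: int, after: int) -> float:
    """Percent reduction in instruction count; negative means a regression."""
    return (before - after) / before * 100.0


for name, (before, after) in counts.items():
    print(f"{name:<18} {pct_improvement(before, after):+.2f}%")
```

With this sign convention a positive number means fewer dynamic instructions in the second column.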


The small gcc regression seems like a tooling issue of some sort.
Looking at the topblocks, the insn sequences are exactly the same, only
the counts differ and it's not obvious why.
Here's for gcc_r-1.


     > Block 0 @ 0x170ca, 12 insns, 87854493 times, 0.47%:

     00000000000170ca <find_base_term>:
        170ca:    7179                    add    sp,sp,-48
        170cc:    ec26                    sd    s1,24(sp)
        170ce:    e84a                    sd    s2,16(sp)
        170d0:    e44e                    sd    s3,8(sp)
        170d2:    f406                    sd    ra,40(sp)
        170d4:    f022                    sd    s0,32(sp)
        170d6:    84aa                    mv    s1,a0
        170d8:    03200913              li    s2,50
        170dc:    03d00993              li    s3,61
        170e0:    8526                    mv    a0,s1
        170e2:    001cd097              auipc    ra,0x1cd
        170e6:    bac080e7              jalr    -1108(ra) # 1e3c8e <ix86_delegitimize_address.lto_priv.0>

     > Block 1 @ 0x706d0a, 3 insns, 274713936 times, 0.37%:
     > Block 2 @ 0x1e3c8e, 9 insns, 88507109 times, 0.35%:
     ...

     < Block 0 @ 0x170ca, 12 insns, 87869602 times, 0.47%:
     < Block 1 @ 0x706d42, 3 insns, 274608893 times, 0.36%:
     < Block 2 @ 0x1e3c94, 9 insns, 88526354 times, 0.35%:
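
To see how much each top block contributes to the difference, the two sides of the diff can be compared mechanically. This is only a sketch: it assumes the `Block N @ 0x..., K insns, T times` line format shown above, and `parse_blocks` and `dynamic_insn_delta` are hypothetical helper names, not part of any bbv tool.

```python
import re

# Matches the "Block N @ 0xADDR, K insns, T times" part of each line.
BLOCK_RE = re.compile(
    r"Block\s+(\d+)\s+@\s+0x([0-9a-f]+),\s+(\d+)\s+insns,\s+(\d+)\s+times"
)


def parse_blocks(text):
    """Return {rank: (address, insns, times)} for every Block line in text."""
    blocks = {}
    for m in BLOCK_RE.finditer(text):
        rank, addr, insns, times = m.groups()
        blocks[int(rank)] = (int(addr, 16), int(insns), int(times))
    return blocks


def dynamic_insn_delta(a, b):
    """Per rank: change in executed instructions (insns * times), b minus a."""
    return {r: b[r][1] * b[r][2] - a[r][1] * a[r][2] for r in a if r in b}


gt_side = parse_blocks("""
> Block 0 @ 0x170ca, 12 insns, 87854493 times, 0.47%:
> Block 1 @ 0x706d0a, 3 insns, 274713936 times, 0.37%:
> Block 2 @ 0x1e3c8e, 9 insns, 88507109 times, 0.35%:
""")
lt_side = parse_blocks("""
< Block 0 @ 0x170ca, 12 insns, 87869602 times, 0.47%:
< Block 1 @ 0x706d42, 3 insns, 274608893 times, 0.36%:
< Block 2 @ 0x1e3c94, 9 insns, 88526354 times, 0.35%:
""")

for rank, delta in dynamic_insn_delta(gt_side, lt_side).items():
    print(f"Block {rank}: {delta:+,} dynamic insns")
```

Blocks are matched by rank here rather than by address, since the addresses of blocks 1 and 2 differ between the two binaries.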


FWIW, Greg internally has been looking at some of this and found some
issues in the bbv tooling, but I wish all of this was shared/upstream
(QEMU bbv plugin) for people to compare notes and not discover/fix the
same issues over and again.

Yea, we all meant to coordinate on those plugins. The one we've got had
some problems with hash collisions and when there's a hash collision it
just produces total junk data. I chased a few of these down and fixed
them about a year ago.

The other thing is qemu will split up blocks based on its internal
notion of a translation page. So if you're looking at block level data
you'll stumble over that as well. This aspect is the most troublesome
problem I'm aware of right now.
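
One possible way to paper over the page splitting when post-processing block data is to stitch a fragment back onto its predecessor. The sketch below is a heuristic under loud assumptions, not how any existing plugin works: it assumes 4 KiB translation pages, that each record carries a start address, byte size, instruction count and execution count, and that a split fragment has the same execution count as the block it was cut from. `Block` and `merge_page_splits` are hypothetical names.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumption: 4 KiB translation pages


@dataclass
class Block:
    start: int  # address of the first instruction
    size: int   # length in bytes
    insns: int  # number of instructions
    count: int  # execution count


def merge_page_splits(blocks):
    """Merge a block into its predecessor when it starts exactly on a page
    boundary, directly follows the predecessor, and has the same count."""
    merged = []
    for b in sorted(blocks, key=lambda blk: blk.start):
        prev = merged[-1] if merged else None
        if (prev is not None
                and prev.start + prev.size == b.start
                and b.start % PAGE_SIZE == 0
                and prev.count == b.count):
            # Treat b as the tail of prev that was cut at the page boundary.
            merged[-1] = Block(prev.start, prev.size + b.size,
                               prev.insns + b.insns, prev.count)
        else:
            merged.append(b)
    return merged


fragments = [
    Block(0x1FF8, 8, 3, 100),  # ends exactly at 0x2000
    Block(0x2000, 6, 2, 100),  # starts on a page boundary, same count
    Block(0x3000, 4, 1, 5),    # unrelated block
]
print(merge_page_splits(fragments))
```

The equal-count test can also merge two genuinely separate blocks that happen to meet at a page boundary, so the result is best treated as an approximation.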






Jeff

Re: Follow up #1 (was Re: [PATCH v2 1/2] RISC-V: avoid LUI based const materialization ... [part of PR/106265])