On 3/21/24 11:19 AM, Vineet Gupta wrote:


So if we go back to Robin's observation that scheduling dramatically
increases the instruction count, perhaps we try a run with
-fno-schedule-insns -fno-schedule-insns2 and see how the instruction
counts compare.

Oh yeah! Robin hinted at this in Tuesday's patchworks meeting too.

default     : 2,565,319,368,591
128         : 2,509,741,035,068
256         : 2,527,817,813,612
no-sched{,2}: 1,295,520,567,376

Now we're getting somewhere.  That's in line with expectations.
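
To put the magnitude in perspective, here is a quick arithmetic sketch in Python. The counts are copied from the table above; the dictionary keys are just labels for this illustration.

```python
# Instruction counts copied from the table above.
counts = {
    "default": 2_565_319_368_591,
    "128": 2_509_741_035_068,
    "256": 2_527_817_813_612,
    "no-sched{,2}": 1_295_520_567_376,
}

# Express every run relative to the run with both scheduling passes disabled.
baseline = counts["no-sched{,2}"]
ratios = {name: n / baseline for name, n in counts.items()}

for name, ratio in ratios.items():
    print(f"{name:<12}: {ratio:.2f}x the no-sched count")
```

Every run with scheduling enabled executes a bit under twice as many instructions as the no-sched run.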

I would strongly suspect it's -fno-schedule-insns rather than -fno-schedule-insns2. The former turns off scheduling before register allocation (sched1), the latter turns it off after register allocation (sched2). So if our theory about spilling is correct, then it must be the former, since the latter can't affect register allocation. While I can speculate about other potential scheduler impacts, spilling due to sched1's actions is by far the most likely.

Given the magnitude here, I would bet we can see this pretty clearly if you've got function level or block level count data for those runs. I'd start with that, ideally narrowing things down to a function or hot loop within a function which shows a huge delta.
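
As a rough sketch of that narrowing step, something like the following could rank functions by how much their instruction count grew between two runs. It assumes the per-function counts have already been extracted into dictionaries; the function names and numbers below are made up purely for illustration.

```python
def rank_by_delta(before, after, top=5):
    """Return (name, before, after, delta) tuples, largest delta first."""
    names = set(before) | set(after)
    rows = [
        (n, before.get(n, 0), after.get(n, 0), after.get(n, 0) - before.get(n, 0))
        for n in names
    ]
    # Sort by descending delta; break ties by name for stable output.
    rows.sort(key=lambda row: (-row[3], row[0]))
    return rows[:top]


# Hypothetical per-function instruction counts (not real measurements).
no_sched = {"kernel_a": 1_000, "kernel_b": 400, "helper": 50}
default = {"kernel_a": 2_600, "kernel_b": 420, "helper": 50}

ranked = rank_by_delta(no_sched, default)
for name, b, a, delta in ranked:
    print(f"{name:<10} {b:>8} -> {a:>8}  (delta {delta:+})")
```

The function at the top of that list is the one whose IRA/LRA and scheduling dumps are worth reading first.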

From that we can then look at the IRA and LRA dumps and correlate what we see there with the before/after scheduling dumps to see how we've lengthened lifetimes in critical locations.

I'd probably start with the IRA dump. It's going to have annotations in its dump output like "Potential Spill" which may guide us. In simplest terms a pseudo is trivially allocatable when it has fewer neighbors in the conflict graph than available hard registers. If it has at least as many neighbors in the conflict graph as available hard registers, then it's potentially going to be spilled -- we can't know during this phase of allocation.

As we pop registers off the coloring stack, some neighbors of the pseudo in question may end up allocated into the same hard register. That can sometimes result in a hard register being available. It might be easier to see with a graph:

    a--b--c
       |
       d

Where a..d are pseudo registers. If two pseudos are connected by an edge, then they have overlapping lifetimes and can't be allocated to the same hard register. So as we can see b conflicts with a, c & d. If we only have two hard registers, then b is not trivially colorable and will be marked as a potential spill.

During coloring we may end up allocating a, c & d to the same hard register (they don't conflict, so it's safe). If that happens, then there would be a register available for b.

Anyway, that should explain why b would be marked as a potential spill and how it might end up getting a hard register anyway.
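
The example can be played through in a few lines of Python. This is a deliberately simplified sketch, not IRA's actual algorithm: it flags potential spills purely by neighbor count, and the coloring order is picked by hand (an assumption) so that b's neighbors are colored before b.

```python
def potential_spills(graph, k):
    """Pseudos with at least k neighbors are not trivially colorable."""
    return sorted(n for n, neighbors in graph.items() if len(neighbors) >= k)


def color_in_order(graph, k, order):
    """Greedily give each pseudo the lowest hard register not used by an
    already-colored neighbor; None means it really has to be spilled."""
    assignment = {}
    for node in order:
        used = {assignment[m] for m in graph[node] if m in assignment}
        free = [reg for reg in range(k) if reg not in used]
        assignment[node] = free[0] if free else None
    return assignment


# The conflict graph from the picture: b conflicts with a, c and d.
graph = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b"}, "d": {"b"}}
K = 2  # two hard registers

spills = potential_spills(graph, K)
# Hand-picked order (assumption): b's neighbors are colored first.
assignment = color_in_order(graph, K, ["a", "c", "d", "b"])

print("potential spills:", spills)
print("assignment:", assignment)
```

Here a, c and d all land in hard register 0, which leaves register 1 free for b even though b was flagged as a potential spill.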

The hope is we can see the potential spills increasing. At which point we can walk backwards to sched1 and dive into its scheduling decisions.
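
A crude way to check for that increase is to count the lines mentioning a potential spill in the two IRA dumps. The sketch below matches the phrase case-insensitively because the exact annotation format is an assumption here; the sample dump lines are invented placeholders, not real IRA output.

```python
import re

# Case-insensitive, since the exact wording in the dump is an assumption.
_POTENTIAL_SPILL = re.compile(r"potential spill", re.IGNORECASE)


def count_potential_spills(dump_text):
    """Count the lines of a dump that mention a potential spill."""
    return sum(1 for line in dump_text.splitlines() if _POTENTIAL_SPILL.search(line))


# Invented stand-ins for the contents of two dump files.
dump_no_sched = "pushing p1\npushing p2 (potential spill)\n"
dump_default = "pushing p1 (Potential Spill)\npushing p2 (potential spill)\npushing p3\n"

n_no_sched = count_potential_spills(dump_no_sched)
n_default = count_potential_spills(dump_default)
print(f"no-sched: {n_no_sched}, default: {n_default}")
```

For real runs, the text would come from the dump files themselves (for example those written by -fdump-rtl-ira), read with something like pathlib.Path(path).read_text().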

Jeff
