On 3/21/24 11:19 AM, Vineet Gupta wrote:
So if we go back to Robin's observation that scheduling dramatically
increases the instruction count, perhaps we try a run with
-fno-schedule-insns -fno-schedule-insns2 and see how the instruction
counts compare.
Oh yeah! Robin hinted at this in Tuesday's patchworks meeting too.
default     : 2,565,319,368,591
128         : 2,509,741,035,068
256         : 2,527,817,813,612
no-sched{,2}: 1,295,520,567,376
Now we're getting somewhere. That's in line with expectations.
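
To put the magnitude in numbers, here is a small sketch that computes how far each run sits below the default count. The counts are the ones quoted above; the dictionary keys simply reuse the labels from that table, and the variable names are only illustrative.

```python
# Dynamic instruction counts copied from the table above.
counts = {
    "default": 2_565_319_368_591,
    "128": 2_509_741_035_068,
    "256": 2_527_817_813_612,
    "no-sched{,2}": 1_295_520_567_376,
}

baseline = counts["default"]

# Percentage reduction of each run relative to the default run.
reduction = {
    name: 100.0 * (baseline - n) / baseline
    for name, n in counts.items()
}

for name, pct in reduction.items():
    print(f"{name:>13}: {pct:5.1f}% fewer instructions than default")
```

The 128 and 256 runs move the count by a couple of percent at most, while turning off both scheduling passes removes roughly half of the executed instructions.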
I would strongly suspect it's -fno-schedule-insns rather than
-fno-schedule-insns2. The former turns off scheduling before register
allocation; the latter turns it off after register allocation. So if
our theory about spilling is correct, then it must be the first since
the second won't affect register allocation. While I can speculate
about other potential scheduler impacts, spilling due to sched1's
actions is by far the most likely.
Given the magnitude here, I would bet we can see this pretty clearly if
you've got function level or block level count data for those runs. I'd
start with that, ideally narrowing things down to a function or hot loop
within a function which shows a huge delta.
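
As a rough sketch of that narrowing step, assuming the per-function counts have already been exported into two name-to-count mappings: the function names and numbers below are invented purely for illustration and are not taken from any real run.

```python
def largest_deltas(baseline, candidate, top=3):
    """Rank functions by how much their instruction count changed.

    baseline and candidate map function name -> dynamic instruction count.
    A function missing from one run is treated as having count 0 there.
    """
    names = set(baseline) | set(candidate)
    deltas = {n: baseline.get(n, 0) - candidate.get(n, 0) for n in names}
    # Largest absolute change first; ties broken by name for stable output.
    ranked = sorted(deltas.items(), key=lambda kv: (-abs(kv[1]), kv[0]))
    return ranked[:top]


# Hypothetical data: default scheduling vs. both scheduling passes disabled.
default_counts = {"hot_kernel": 900, "setup": 120, "io": 50}
nosched_counts = {"hot_kernel": 400, "setup": 118, "io": 50}

for name, delta in largest_deltas(default_counts, nosched_counts):
    print(f"{name}: {delta:+d}")
```

Whichever function tops the list is the natural place to start reading dumps.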
From that we can then look at the IRA and LRA dumps and correlate what
we see there with the before/after scheduling dumps to see how we've
lengthened lifetimes in critical locations.
I'd probably start with the IRA dump. It's going to have annotations in
its dump output like "Potential Spill" which may guide us. In simplest
terms a pseudo is trivially allocatable when it has fewer neighbors in
the conflict graph than available hard registers. If it has at least as
many neighbors in the conflict graph as available hard registers, then it's
potentially going to be spilled -- we can't know during this phase of
allocation.
As we pop registers off the coloring stack, some neighbors of the pseudo
in question may end up allocated into the same hard register. That can
sometimes result in a hard register being available. It might be easier
to see with a graph:

  a--b--c
     |
     d

Where a..d are pseudo registers. If two pseudos are connected by an
edge, then they have overlapping lifetimes and can't be allocated to the
same hard register. So as we can see b conflicts with a, c & d. If we
only have two hard registers, then b is not trivially colorable and will
be marked as a potential spill.
During coloring we may end up allocating a, c & d to the same hard
register (they don't conflict, so it's safe). If that happens, then
there would be a register available for b.
Anyway, that should explain why b would be marked as a potential spill
and how it might end up getting a hard register anyway.
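
The example can be sketched in a few lines of Python. This is a deliberately simplified model of the description above, not IRA's actual algorithm: it flags a pseudo as a potential spill from its neighbor count alone, then assigns registers in a fixed order chosen for illustration. The function names and the order are assumptions.

```python
def potential_spills(graph, num_regs):
    """Pseudos that are not trivially colorable: neighbors >= hard registers."""
    return {p for p, nbrs in graph.items() if len(nbrs) >= num_regs}


def color(graph, num_regs, order):
    """Assign each pseudo the lowest hard register no colored neighbor uses.

    Returns pseudo -> register number, or None if the pseudo must be spilled.
    """
    assignment = {}
    for p in order:
        used = {assignment[n] for n in graph[p]
                if assignment.get(n) is not None}
        free = [r for r in range(num_regs) if r not in used]
        assignment[p] = free[0] if free else None
    return assignment


# Conflict graph from the diagram: b conflicts with a, c and d.
graph = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b"}, "d": {"b"}}

flagged = potential_spills(graph, num_regs=2)
result = color(graph, num_regs=2, order=["a", "c", "d", "b"])

print("potential spills:", sorted(flagged))
print("assignment:", result)
```

With two hard registers, b is the only pseudo flagged, yet it still receives a register because a, c and d all end up sharing the other one.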
The hope is we can see the potential spills increasing, at which point
we can walk backwards to sched1 and dive into its scheduling decisions.
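
One hedged way to check for that increase is to count marker lines in the two IRA dumps. The sketch below assumes the dump text has already been read into strings and matches the marker case-insensitively; the sample text is invented for illustration and is not real IRA output, and the exact wording of the annotation in a given GCC version is an assumption.

```python
def count_marker(dump_text, marker="potential spill"):
    """Count lines in dump_text that contain marker, ignoring case."""
    needle = marker.lower()
    return sum(needle in line.lower() for line in dump_text.splitlines())


# Invented stand-ins for the dump contents of the two builds.
with_sched1 = "alloc x\nalloc y (Potential Spill)\nalloc z (potential spill)\n"
without_sched1 = "alloc x\nalloc y\nalloc z (Potential Spill)\n"

print("with sched1:   ", count_marker(with_sched1))
print("without sched1:", count_marker(without_sched1))
```

Comparing the counts per function rather than per file keeps the comparison aligned with the function-level instruction deltas.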
Jeff