On 3/21/24 12:56, Jeff Law wrote:
>
> On 3/21/24 11:19 AM, Vineet Gupta wrote:
>
>>> So if we go back to Robin's observation that scheduling dramatically
>>> increases the instruction count, perhaps we try a run with
>>> -fno-schedule-insns -fno-schedule-insns2 and see how the instruction
>>> counts compare.
>> Oh yeah! Robin hinted at this in Tuesday's patchworks meeting too.
>>
>> default      : 2,565,319,368,591
>> 128          : 2,509,741,035,068
>> 256          : 2,527,817,813,612
>> no-sched{,2} : 1,295,520,567,376
> Now we're getting somewhere. That's in line with expectations.
>
> I would strongly suspect it's -fno-schedule-insns rather than
> -fno-schedule-insns2. The former turns off scheduling before register
> allocation; the latter turns it off after register allocation. So if
> our theory about spilling is correct, then it must be the first since
> the second won't affect register allocation. While I can speculate
> about other potential scheduler impacts, spilling due to sched1's
> actions is by far the most likely.
As always, you are absolutely right: -fno-schedule-insns alone gets
almost the same count as the last row above.
> Given the magnitude here, I would bet we can see this pretty clearly if
> you've got function level or block level count data for those runs. I'd
> start with that, ideally narrowing things down to a function or hot loop
> within a function which shows a huge delta.
Alright, on it.
Thx,
-Vineet
> From that we can then look at the IRA and LRA dumps and correlate what
> we see there with the before/after scheduling dumps to see how we've
> lengthened lifetimes in critical locations.
>
> I'd probably start with the IRA dump. It's going to have annotations in
> its dump output like "Potential Spill" which may guide us. In simplest
> terms a pseudo is trivially allocatable when it has fewer neighbors in
> the conflict graph than available hard registers. If it has more
> neighbors in the conflict graph than available hard registers, then it's
> potentially going to be spilled -- we can't know during this phase of
> allocation.
>
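To make the "Potential Spill" hunt concrete, here is a rough Python sketch that counts such annotations per function in two IRA dumps (as written with -fdump-rtl-ira) and ranks functions by the increase. The ";; Function NAME" header pattern and the case-insensitive "potential spill" match are assumptions about the dump text, and the helper names are made up for illustration; adjust both after looking at a real dump.

```python
import re
import sys
from collections import Counter

# Assumed dump conventions (check against a real dump before relying on them):
#   - each function's section starts with a line like ";; Function foo (foo, ...)"
#   - spill candidates are annotated with text containing "potential spill"
FUNC_RE = re.compile(r"^;; Function (\S+)")
SPILL_RE = re.compile(r"potential spill", re.IGNORECASE)


def count_potential_spills(dump_text):
    """Return a Counter mapping function name -> number of annotations."""
    counts = Counter()
    current = "<unknown>"
    for line in dump_text.splitlines():
        header = FUNC_RE.match(line)
        if header:
            current = header.group(1)
            continue
        hits = len(SPILL_RE.findall(line))
        if hits:
            counts[current] += hits
    return counts


def spill_delta(before, after):
    """Rank functions by (after - before), largest increase first."""
    funcs = set(before) | set(after)
    return sorted(((after[f] - before[f], f) for f in funcs), reverse=True)


# Usage (hypothetical file names): script.py nosched.ira sched.ira
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as f:
        without_sched1 = count_potential_spills(f.read())
    with open(sys.argv[2]) as f:
        with_sched1 = count_potential_spills(f.read())
    for delta, func in spill_delta(without_sched1, with_sched1)[:20]:
        print(f"{delta:+6d}  {func}")
```

The functions at the top of that list would be the first candidates to cross-check against function-level instruction counts.
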
> As we pop registers off the coloring stack, some neighbors of the pseudo
> in question may end up allocated into the same hard register. That can
> sometimes result in a hard register being available. It might be easier
> to see with a graph:
>
> a--b--c
>    |
>    d
>
> Where a..d are pseudo registers. If two pseudos are connected by an
> edge, then they have overlapping lifetimes and can't be allocated to the
> same hard register. So as we can see b conflicts with a, c & d. If we
> only have two hard registers, then b is not trivially colorable and will
> be marked as a potential spill.
>
> During coloring we may end up allocating a, c & d to the same hard
> register (they don't conflict, so it's safe). If that happens, then
> there would be a register available for b.
>
> Anyway, that should explain why b would be marked as a potential spill
> and how it might end up getting a hard register anyway.
>
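The a..d example above can be replayed with a small Python sketch. This is only the simplified model from the explanation (degree test, then optimistic assignment), not IRA's actual algorithm; the function names and the coloring order are assumptions chosen for illustration.

```python
# Conflict graph from the example: b conflicts with a, c and d.
conflicts = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b"},
    "d": {"b"},
}
NUM_HARD_REGS = 2


def potential_spills(graph, k):
    """Pseudos that are not trivially colorable: k or more neighbors."""
    return {node for node, adj in graph.items() if len(adj) >= k}


def color(graph, k):
    """Assign hard registers 0..k-1; None means the pseudo really spills.

    Trivially colorable pseudos are assigned first, potential spills last,
    so a potential spill still gets a register if its neighbors happened
    to share one.
    """
    maybe = potential_spills(graph, k)
    order = sorted(graph, key=lambda n: (n in maybe, n))
    assignment = {}
    for node in order:
        used = {assignment[m] for m in graph[node]
                if assignment.get(m) is not None}
        free = [reg for reg in range(k) if reg not in used]
        assignment[node] = free[0] if free else None
    return assignment


if __name__ == "__main__":
    print(sorted(potential_spills(conflicts, NUM_HARD_REGS)))  # ['b']
    print(color(conflicts, NUM_HARD_REGS))
    # {'a': 0, 'c': 0, 'd': 0, 'b': 1}
```

Here a, c and d all land in register 0, which leaves register 1 free for b even though b was flagged as a potential spill. With a triangle of three mutually conflicting pseudos and the same two registers, the last one would get None, i.e. an actual spill.
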
> The hope is we can see the potential spills increasing, at which point
> we can walk backwards to sched1 and dive into its scheduling decisions.
>
> Jeff