On 3/21/24 12:56, Jeff Law wrote:
>
> On 3/21/24 11:19 AM, Vineet Gupta wrote:
>
>>> So if we go back to Robin's observation that scheduling dramatically
>>> increases the instruction count, perhaps we try a run with
>>> -fno-schedule-insns -fno-schedule-insns2 and see how the instruction
>>> counts compare.
>> Oh yeah! Robin hinted at this in Tuesday's patchworks meeting too
>>
>> default     : 2,565,319,368,591
>> 128         : 2,509,741,035,068
>> 256         : 2,527,817,813,612
>> no-sched{,2}: 1,295,520,567,376
> Now we're getting somewhere.  That's in line with expectations.
>
> I would strongly suspect it's -fno-schedule-insns rather than 
> -fno-schedule-insns2.  The former turns off scheduling before register 
> allocation, the second turns it off after register allocation.  So if 
> our theory about spilling is correct, then it must be the first since 
> the second won't affect register allocation.   While I can speculate 
> about other potential scheduler impacts, spilling due to sched1's 
> actions is by far the most likely.

As always you are absolutely right: -fno-schedule-insns alone gets
almost the same count as the last row above.
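
For scale, the table above works out to just under a 2x difference between
the default build and the no-sched build. A quick sketch of the arithmetic,
using only the four counts quoted above (the dictionary keys are just the
row labels from the table):

```python
# Instruction counts copied from the table above.
counts = {
    "default": 2_565_319_368_591,
    "128": 2_509_741_035_068,
    "256": 2_527_817_813_612,
    "no-sched{,2}": 1_295_520_567_376,
}

# Use the run with both scheduling passes disabled as the baseline.
baseline = counts["no-sched{,2}"]
ratios = {name: n / baseline for name, n in counts.items()}

for name, ratio in ratios.items():
    print(f"{name:<14}{ratio:.2f}x")
```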

> Given the magnitude here, I would bet we can see this pretty clearly if 
> you've got function level or block level count data for those runs.  I'd 
> start with that, ideally narrowing things down to a function or hot loop 
> within a function which shows a huge delta.

Alright, on it.

Thx,
-Vineet

> From that we can then look at the IRA and LRA dumps and correlate what 
> we see there with the before/after scheduling dumps to see how we've 
> lengthened lifetimes in critical locations.
>
> I'd probably start with the IRA dump.  It's going to have annotations in 
> its dump output like "Potential Spill" which may guide us.  In simplest 
> terms a pseudo is trivially allocatable when it has fewer neighbors in 
> the conflict graph than available hard registers.  If it has at least as 
> many neighbors in the conflict graph as available hard registers, then 
> it's potentially going to be spilled -- we can't know during this phase 
> of allocation.
>
> As we pop registers off the coloring stack, some neighbors of the pseudo 
> in question may end up allocated into the same hard register.  That can 
> sometimes result in a hard register being available.  It might be easier 
> to see with a graph:
>
>      a--b--c
>         |
>         d
>
> Where a..d are pseudo registers.  If two pseudos are connected by an 
> edge, then they have overlapping lifetimes and can't be allocated to the 
> same hard register.  So as we can see b conflicts with a, c & d.  If we 
> only have two hard registers, then b is not trivially colorable and will 
> be marked as a potential spill.
>
> During coloring we may end up allocating a, c & d to the same hard 
> register (they don't conflict, so it's safe).  If that happens, then 
> there would be a register available for b.
>
> Anyway, that should explain why b would be marked as a potential spill 
> and how it might end up getting a hard register anyway.
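
To make the a..d example concrete, here is a toy sketch in Python. It
follows only the simplified description above and is not how IRA is
actually implemented; the function names, the register numbering and the
coloring order are all my own assumptions for illustration.

```python
def potential_spills(graph, k):
    """Pseudos with at least k neighbors are not trivially colorable."""
    return sorted(n for n, nbrs in graph.items() if len(nbrs) >= k)


def greedy_color(graph, k, order):
    """Assign each pseudo the lowest hard register not used by a neighbor."""
    assignment = {}
    for node in order:
        used = {assignment[n] for n in graph[node] if n in assignment}
        free = [reg for reg in range(k) if reg not in used]
        # None means no hard register was left, i.e. a real spill.
        assignment[node] = free[0] if free else None
    return assignment


# Conflict graph from the example: b conflicts with a, c and d.
graph = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b"},
    "d": {"b"},
}

flagged = potential_spills(graph, k=2)
# Hypothetical order: a, c and d are colored before b.
regs = greedy_color(graph, k=2, order=["a", "c", "d", "b"])

print("potential spills:", flagged)
print("assignment:", regs)
```

With two hard registers, b is the only pseudo flagged, yet a, c and d all
land in register 0, which leaves register 1 free for b.
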
>
> The hope is we can see the potential spills increasing.  At which point 
> we can walk backwards to sched1 and dive into its scheduling decisions.
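
A rough way to check whether the potential spills increase is to count the
annotation in each IRA dump (e.g. from -fdump-rtl-ira) and compare the runs.
The sketch below assumes the dump marks such pseudos with the phrase
"potential spill" in some capitalization; the exact wording of the dump may
differ, and the helper names are hypothetical.

```python
import re

# Assumption: spill candidates are marked with the phrase "potential spill".
# Matched case-insensitively because the exact dump wording may vary.
MARKER = re.compile(r"potential spill", re.IGNORECASE)


def count_potential_spills(dump_text):
    """Count occurrences of the marker phrase in the text of one dump."""
    return len(MARKER.findall(dump_text))


def report(paths):
    """Print one 'path: count' line per dump file and return the counts."""
    totals = {}
    for path in paths:
        with open(path, errors="replace") as dump:
            totals[path] = count_potential_spills(dump.read())
        print(f"{path}: {totals[path]}")
    return totals
```

Calling report() on the dump from a default build and on the dump from a
-fno-schedule-insns build of the same file would give the two counts to
compare.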
>
> Jeff
