On Thu, Mar 21, 2024 at 8:56 PM Jeff Law <jeffreya...@gmail.com> wrote:
>
>
>
> On 3/21/24 11:19 AM, Vineet Gupta wrote:
>
> >>
> >> So if we go back to Robin's observation that scheduling dramatically
> >> increases the instruction count, perhaps we try a run with
> >> -fno-schedule-insns -fno-schedule-insns2 and see how the instruction
> >> counts compare.
> >
> > Oh yeah!  Robin hinted at this in Tuesday's patchworks meeting too
> >
> > default      : 2,565,319,368,591
> > 128          : 2,509,741,035,068
> > 256          : 2,527,817,813,612
> > no-sched{,2} : 1,295,520,567,376
> Now we're getting somewhere.  That's in line with expectations.
>
> I would strongly suspect it's -fno-schedule-insns rather than
> -fno-schedule-insns2.  The former turns off scheduling before register
> allocation, the second turns it off after register allocation.  So if
> our theory about spilling is correct, then it must be the first since
> the second won't affect register allocation.   While I can speculate
> about other potential scheduler impacts, spilling due to sched1's
> actions is by far the most likely.

Another option is to enable -fsched-pressure, which makes sched1 take
register pressure into account when scheduling and should help with this
issue.
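
As a rough, made-up illustration of the kind of loop where this bites --
not code from the actual benchmark -- consider something like the kernel
below.  Before register allocation sched1 is free to hoist the independent
loads well ahead of their uses, which lengthens their live ranges and can
push pressure past the number of hard registers, so IRA/LRA has to spill.
Comparing -O2, -O2 -fno-schedule-insns -fno-schedule-insns2 and
-O2 -fsched-pressure builds of a reduced case like this (with
-fdump-rtl-ira) should make the spill difference easy to see:

/* Hypothetical reduced kernel, not taken from the benchmark: eight
   independent load streams (x has 8*n elements) feeding one reduction.  */
void
dot8 (double *restrict y, const double *restrict x,
      const double *restrict a, long n)
{
  for (long i = 0; i < n; i++)
    {
      double t = 0.0;
      /* sched1 may hoist all eight loads above the multiply-adds,
         keeping eight values live at once.  */
      for (int k = 0; k < 8; k++)
        t += a[k] * x[8 * i + k];
      y[i] = t;
    }
}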

> Given the magnitude here, I would bet we can see this pretty clearly if
> you've got function-level or block-level count data for those runs.  I'd
> start with that, ideally narrowing things down to a function or hot loop
> within a function which shows a huge delta.
>
> From that we can then look at the IRA and LRA dumps and correlate what
> we see there with the before/after scheduling dumps to see how we've
> lengthened lifetimes in critical locations.
>
> I'd probably start with the IRA dump.  It's going to have annotations in
> its dump output like "Potential Spill" which may guide us.  In simplest
> terms a pseudo is trivially allocatable when it has fewer neighbors in
> the conflict graph than available hard registers.  If it has more
> neighbors in the conflict graph than available hard registers, then it's
> potentially going to be spilled -- we can't know for sure during this
> phase of allocation.
>
> As we pop registers off the coloring stack, some neighbors of the pseudo
> in question may end up allocated into the same hard register.  That can
> sometimes result in a hard register being available.  It might be easier
> to see with a graph:
>
>      a--b--c
>         |
>         d
>
> Where a..d are pseudo registers.  If two pseudos are connected by an
> edge, then they have overlapping lifetimes and can't be allocated to the
> same hard register.  So as we can see, b conflicts with a, c & d.  If we
> only have two hard registers, then b is not trivially colorable and will
> be marked as a potential spill.
>
> During coloring we may end up allocating a, c & d to the same hard
> register (they don't conflict, so it's safe).  If that happens, then
> there would be a register available for b.
>
> Anyway, that should explain why b would be marked as a potential spill
> and how it might end up getting a hard register anyway.
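
To make that concrete, here's a toy sketch of that exact graph -- definitely
not IRA's real algorithm, just the simplest-terms rule from above (flag a
pseudo as a potential spill when its conflict-graph degree is >= the number
of hard registers) followed by an optimistic assignment pass.  With K = 2
hard registers it reports b as a potential spill and then still finds it a
register once a, c & d end up sharing one:

/* Toy model of the a--b--c / b--d example with K = 2 hard registers.  */
#include <stdio.h>

enum { NNODES = 4, K = 2 };
static const char *name[NNODES] = { "a", "b", "c", "d" };

/* Conflict matrix: b conflicts with a, c and d; nothing else conflicts.  */
static const int conflict[NNODES][NNODES] = {
  /*        a  b  c  d */
  /* a */ { 0, 1, 0, 0 },
  /* b */ { 1, 0, 1, 1 },
  /* c */ { 0, 1, 0, 0 },
  /* d */ { 0, 1, 0, 0 },
};

int
main (void)
{
  int color[NNODES], spill_p[NNODES];

  /* Pass 1: a pseudo is trivially allocatable when its degree is < K;
     otherwise flag it as a potential spill.  Only b (degree 3) is flagged.  */
  for (int i = 0; i < NNODES; i++)
    {
      int degree = 0;
      for (int j = 0; j < NNODES; j++)
        degree += conflict[i][j];
      spill_p[i] = degree >= K;
      color[i] = -1;
      if (spill_p[i])
        printf ("%s: potential spill (degree %d >= %d hard regs)\n",
                name[i], degree, K);
    }

  /* Pass 2: color the trivially allocatable pseudos first, then the
     potential spills, giving each the lowest hard register not used by an
     already-colored conflicting neighbor.  a, c and d don't conflict with
     each other, so they all land in r0, leaving r1 free for b.  */
  for (int pass = 0; pass < 2; pass++)
    for (int i = 0; i < NNODES; i++)
      {
        if (spill_p[i] != pass || color[i] != -1)
          continue;
        int used[K] = { 0 };
        for (int j = 0; j < NNODES; j++)
          if (conflict[i][j] && color[j] >= 0)
            used[color[j]] = 1;
        for (int c = 0; c < K; c++)
          if (!used[c])
            {
              color[i] = c;
              break;
            }
        printf ("%s -> %s\n", name[i],
                color[i] < 0 ? "spilled" : (color[i] ? "r1" : "r0"));
      }
  return 0;
}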
>
> The hope is we can see the potential spills increasing.  At which point
> we can walk backwards to sched1 and dive into its scheduling decisions.
>
> Jeff
