On Thu, Mar 21, 2024 at 8:56 PM Jeff Law <jeffreya...@gmail.com> wrote:
>
> On 3/21/24 11:19 AM, Vineet Gupta wrote:
>
> >>
> >> So if we go back to Robin's observation that scheduling dramatically
> >> increases the instruction count, perhaps we try a run with
> >> -fno-schedule-insns -fno-schedule-insns2 and see how the instruction
> >> counts compare.
> >
> > Oh yeah ! Robin hinted to this in Tues patchworks meeting too
> >
> > default     : 2,565,319,368,591
> > 128         : 2,509,741,035,068
> > 256         : 2,527,817,813,612
> > no-sched{,2}: 1,295,520,567,376
> Now we're getting somewhere. That's in line with expectations.
>
> I would strongly suspect it's -fno-schedule-insns rather than
> -fno-schedule-insns2. The former turns off scheduling before register
> allocation, the second turns it off after register allocation. So if
> our theory about spilling is correct, then it must be the first since
> the second won't affect register allocation. While I can speculate
> about other potential scheduler impacts, spilling due to sched1's
> actions is by far the most likely.
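
To separate the effect of the two flags, one could build the same hot
file once per variant and compare the results. The snippet below is only
a sketch of such a build matrix: the compiler name `gcc`, the base flags
`-O2 -S` and the file name `hot.c` are placeholders I made up, not what
was used for the numbers above. Only the two -fno-schedule-insns* flags
come from this thread.

```python
import shlex

# Placeholder compiler and base flags: swap in the real cross compiler
# and the options the benchmark is actually built with.
BASE = ["gcc", "-O2", "-S"]

# One entry per experiment. Only the scheduling flags differ.
VARIANTS = {
    "default": [],
    "no-sched1": ["-fno-schedule-insns"],    # no scheduling before RA
    "no-sched2": ["-fno-schedule-insns2"],   # no scheduling after RA
    "no-sched1+2": ["-fno-schedule-insns", "-fno-schedule-insns2"],
}


def build_commands(source, base=BASE):
    """Return {variant name: argv list} for compiling `source` to assembly."""
    stem = source.rsplit(".", 1)[0]
    commands = {}
    for name, extra in VARIANTS.items():
        output = f"{stem}.{name}.s"
        commands[name] = list(base) + extra + [source, "-o", output]
    return commands


if __name__ == "__main__":
    # "hot.c" is a hypothetical file name used for illustration only.
    for name, argv in build_commands("hot.c").items():
        print(f"{name:12s} {shlex.join(argv)}")
```

If the theory about sched1 holds, the "no-sched1" variant alone should
recover most of the drop seen with both flags off.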

Another option is to enable -fsched-pressure which should help with
this issue.

> Given the magnitude here, I would bet we can see this pretty clearly if
> you've got function level or block level count data for those runs. I'd
> start with that, ideally narrowing things down to a function or hot loop
> within a function which shows a huge delta.
>
> From that we can then look at the IRA and LRA dumps and correlate what
> we see there with the before/after scheduling dumps to see how we've
> lengthened lifetimes in critical locations.
>
> I'd probably start with the IRA dump. It's going to have annotations in
> its dump output like "Potential Spill" which may guide us. In simplest
> terms a pseudo is trivially allocatable when it has fewer neighbors in
> the conflict graph than available hard registers. If it has more
> neighbors in the conflict graph than available hard registers, then it's
> potentially going to be spilled -- we can't know during this phase of
> allocation.
>
> As we pop registers off the coloring stack, some neighbors of the pseudo
> in question may end up allocated into the same hard register. That can
> sometimes result in a hard register being available. It might be easier
> to see with a graph
>
>     a--b--c
>        |
>        d
>
> Where a..d are pseudo registers. If two pseudos are connected by an
> edge, then they have overlapping lifetimes and can't be allocated to the
> same hard register. So as we can see b conflicts with a, c & d. If we
> only have two hard registers, then b is not trivially colorable and will
> be marked as a potential spill.
>
> During coloring we may end up allocating a, c & d to the same hard
> register (they don't conflict, so its safe). If that happens, then
> there would be a register available for b.
>
> Anyway, that should explain why b would be marked as a potential spill
> and how it might end up getting a hard register anyway.
>
> The hope is we can see the potential spills increasing.
> At which point we can walk backwards to sched1 and dive into its
> scheduling decisions.
>
> Jeff
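
For anyone following along, the a/b/c/d example above can be played
through in a few lines. This is a toy sketch of the simplified rule as
described in the quoted text, not IRA's actual algorithm: the function
names, the greedy lowest-free-register choice and the order in which
pseudos are colored are all assumptions made for illustration.

```python
def classify(conflicts, num_hard_regs):
    """Split pseudos into trivially colorable ones and potential spills.

    A pseudo counts as trivial when it has fewer conflict-graph
    neighbors than there are hard registers (the simplified rule from
    the quoted explanation).
    """
    trivial, potential = [], []
    for pseudo in sorted(conflicts):
        if len(conflicts[pseudo]) < num_hard_regs:
            trivial.append(pseudo)
        else:
            potential.append(pseudo)
    return trivial, potential


def assign(conflicts, num_hard_regs, order):
    """Greedily give each pseudo the lowest hard register no neighbor holds.

    The visiting order is supplied by the caller; it is an illustrative
    choice here, not the order a real allocator would use.
    """
    hard_reg, spilled = {}, []
    for pseudo in order:
        taken = {hard_reg[n] for n in conflicts[pseudo] if n in hard_reg}
        free = [r for r in range(num_hard_regs) if r not in taken]
        if free:
            hard_reg[pseudo] = free[0]
        else:
            spilled.append(pseudo)
    return hard_reg, spilled


# Conflict graph from the example: b conflicts with a, c and d.
CONFLICTS = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b"},
    "d": {"b"},
}

trivial, potential = classify(CONFLICTS, num_hard_regs=2)
hard_reg, spilled = assign(CONFLICTS, 2, order=trivial + potential)

print("potential spills:", potential)
print("assignment:", hard_reg)
print("actually spilled:", spilled)
```

With two hard registers, b is flagged as a potential spill, yet a, c
and d all land in the same register, so b still gets the other one and
nothing is actually spilled.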