On Tue, May 30, 2017 at 9:28 AM, Peter Zijlstra <pet...@infradead.org> wrote:
> On Tue, May 30, 2017 at 06:51:28AM -0700, Andi Kleen wrote:
>> On Tue, May 30, 2017 at 11:25:23AM +0200, Peter Zijlstra wrote:
>> > On Sun, May 28, 2017 at 01:31:09PM -0700, Stephane Eranian wrote:
>> > > Ultimately, I would like to see the watchdog move out of the PMU. That
>> > > is the only sensible solution.
>> > > You just need a resource able to interrupt on NMI or you handle
>> > > interrupt masking in software as has been proposed on LKML.
>> >
>> > So even if we do the soft masking, we still need to deal with regions
>> > where the interrupts are disabled. Once an interrupt hits the soft mask
>> > we still hardware mask.
>> >
>> > So to get full and reliable coverage we still need an NMI source.
>>
>> You would only need a single one per system however, not one per CPU.
>> RCU already tracks all the CPUs, all we need is a single NMI watchdog
>> that makes sure RCU itself does not get stuck.
>>
>> So we just have to find a single watchdog somewhere that can trigger
>> NMI.
>
> But then you have to IPI broadcast the NMI, which is less than ideal.
>
> RCU doesn't have that problem because the quiescent state is a global
> thing. CPU progress, which is what the NMI watchdog tests, is very much
> per logical CPU though.
>
>> > I agree that it would be lovely to free up the one counter though.
>>
>> One option is to use the TCO watchdog in the chipset instead.
>> Unfortunately it's not a universal solution because some BIOS lock
>> the TCO watchdog for their own use. But if you have a BIOS that
>> doesn't do that it should work.
>
> I suppose you could also route the HPET to the NMI vector and other
> similar things. Still, you're then stuck with IPI broadcasts, which
> suck.
>
Can the HPET interrupt (whatever vector) be broadcast to all CPUs by hw?

>> > One other approach is running the watchdog off of _any_ PMI, then all we
>> > need to ensure is that PMIs happen semi regularly. There are two cases
>> > where this becomes 'interesting':
>>
>> Seems fairly complex.
>
> Yes.. :/
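
To make the trade-off above concrete, here is a rough toy model of the two
setups: one NMI source per CPU versus a single system-wide NMI source that
has to be forwarded by IPI. It is only a sketch and not kernel code. The
class and function names (Cpu, tick_all, per_cpu_nmi_round,
single_source_nmi_round) are made up for illustration, and it assumes the
single source lands on one CPU which then has to signal every other CPU.

```python
# Toy model of NMI watchdog coverage. Not kernel code; all names are
# hypothetical and exist only for this illustration.

class Cpu:
    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.timer_ticks = 0       # bumped by a maskable timer interrupt
        self.last_seen_ticks = 0   # value recorded at the previous NMI check
        self.irqs_disabled = False

    def timer_interrupt(self):
        # A maskable interrupt is not delivered while interrupts are off.
        if not self.irqs_disabled:
            self.timer_ticks += 1

    def nmi_check(self):
        """Return True if this CPU made no progress since the last check."""
        stuck = self.timer_ticks == self.last_seen_ticks
        self.last_seen_ticks = self.timer_ticks
        return stuck


def tick_all(cpus):
    """Fire one timer interrupt on every CPU."""
    for cpu in cpus:
        cpu.timer_interrupt()


def per_cpu_nmi_round(cpus):
    """Each CPU has its own NMI source: no cross-CPU signalling needed."""
    stuck = [cpu.cpu_id for cpu in cpus if cpu.nmi_check()]
    return stuck, 0


def single_source_nmi_round(cpus):
    """One NMI source for the whole system (assumed to hit one CPU).

    Progress is per logical CPU, so the receiving CPU has to forward the
    NMI to every other CPU before each of them can run its own check.
    """
    ipis_sent = len(cpus) - 1
    stuck = [cpu.cpu_id for cpu in cpus if cpu.nmi_check()]
    return stuck, ipis_sent


if __name__ == "__main__":
    cpus = [Cpu(i) for i in range(4)]
    cpus[2].irqs_disabled = True   # CPU 2 is spinning with interrupts off

    tick_all(cpus)
    print(per_cpu_nmi_round(cpus))
    tick_all(cpus)
    print(single_source_nmi_round(cpus))
```

Both variants flag the same stuck CPU. The only difference in the model is
the number of IPIs per check, which grows with the CPU count in the
single-source case.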