Excerpts from Ricardo Neri's message of May 14, 2022 9:16 am:
> On Tue, May 10, 2022 at 08:38:22PM +1000, Nicholas Piggin wrote:
>> Excerpts from Ricardo Neri's message of May 6, 2022 9:59 am:
>> > Certain implementations of the hardlockup detector require support for
>> > Inter-Processor Interrupt shorthands. On x86, support for these can only
>> > be determined after all the possible CPUs have booted once (in
>> > smp_init()). Other architectures may not need such check.
>> >
>> > lockup_detector_init() only performs the initializations of data
>> > structures of the lockup detector. Hence, there are no dependencies on
>> > smp_init().
>> >
>
> Thank you for your feedback Nicholas!
>
>> I think this is the only real thing which affects other watchdog types?
>
> Also patches 18 and 19 that decouple the NMI watchdog functionality from
> perf.
>
>>
>> Not sure if it's a big problem, the secondary CPUs coming up won't
>> have their watchdog active until quite late, and the primary could
>> implement its own timeout in __cpu_up for secondary coming up, and
>> IPI it to get traces if necessary which is probably more robust.
>
> Indeed that could work. Another alternative I have been pondering is to boot
> the system with the perf-based NMI watchdog enabled. Once all CPUs are up
> and running, switch to the HPET-based NMI watchdog and free the PMU counters.

Just to cover smp_init()? Unless you could move the watchdog
significantly earlier, I'd say it's probably not worth bothering with.

Yes, the boot CPU is doing *some* work that could lock up, but most of
the complexity is in the secondaries coming up, and they won't have
their own watchdog coverage for a good chunk of that anyway.

If anything, I would just add some timeout warning or IPI or something
in those wait loops in x86's __cpu_up code if you are worried about
catching issues here.

Actually, the watchdog probably wouldn't catch any of those anyway,
because they either run with interrupts enabled or call
touch_nmi_watchdog()! So yeah, that'd be pretty pointless.

Thanks,
Nick