On Sat, Dec 12, 2015 at 02:08:13PM -0700, Jeff Merkey wrote:
> The current touch_nmi_watchdog() function in /kernel/watchdog.c does
> not always catch all cases when a processor is spinning in the nmi
> handler inside either KGDB, KDB, or MDB. The hrtimer_interrupts_saved
> count can still end up matching the previous value in some cases,
> resulting in the hard lockup detector tagging processors inside a

Hi Jeff,

I am confused here: touch_nmi_watchdog() was supposed to block the
check for hrtimer_interrupts from happening. So if the check is still
being executed _after_ you executed touch_nmi_watchdog(), it would imply
that about 10 seconds or so elapsed between the touch and the hrtimer
check.

So I am not sure how the below patch would fix this, other than by just
adding another 10 second delay (for a total of 20 seconds) to your
timeout?

> debugger and executing a panic. The patch below corrects this
> problem. I did not add this to the touch_nmi_function directly
> because of possible effects on timing issues.
>
> I have tested this patch and it fixes the problem for kernel debuggers
> stopping errant hard lockup events when processors are spinning inside
> the debugger.

The kernel doesn't normally take patches like this without a
corresponding user, which I didn't see attached in this patch or a
patch series.

Cheers,
Don

>
>
> Signed-off-by: Jeff V. Merkey <[email protected]>
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 18f34cf..b682aab 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -283,6 +283,13 @@ static bool is_hardlockup(void)
> 	__this_cpu_write(hrtimer_interrupts_saved, hrint);
> 	return false;
> }
> +
> +void touch_hardlockup_watchdog(void)
> +{
> +	__this_cpu_write(hrtimer_interrupts_saved, 0);
> +}
> +EXPORT_SYMBOL_GPL(touch_hardlockup_watchdog);
> +
> #endif
>
> static int is_softlockup(unsigned long touch_ts)
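
For anyone following the timing argument, here is a rough user-space model of it. It is only a sketch and not the kernel code: struct cpu_model and checks_until_report() are made up for illustration, the per-CPU variables are collapsed into one struct, and each call to nmi_check() stands in for one watchdog period (the "10 seconds or so" mentioned in the reply). It assumes the NMI callback consumes the touch flag and skips exactly one check, and that is_hardlockup() compares the saved and current hrtimer counts as the diff context suggests.

```c
#include <stdbool.h>

/* Hypothetical single-CPU stand-in for the per-CPU watchdog state. */
struct cpu_model {
	unsigned long hrtimer_interrupts;       /* advances only while the CPU is alive */
	unsigned long hrtimer_interrupts_saved; /* value recorded at the previous check */
	bool watchdog_nmi_touch;                /* set by touch_nmi_watchdog() */
};

/* Models the comparison shown in the diff context above. */
static bool is_hardlockup(struct cpu_model *c)
{
	unsigned long hrint = c->hrtimer_interrupts;

	if (c->hrtimer_interrupts_saved == hrint)
		return true;

	c->hrtimer_interrupts_saved = hrint;
	return false;
}

/* One periodic NMI check; returns true if a hard lockup would be reported. */
static bool nmi_check(struct cpu_model *c)
{
	if (c->watchdog_nmi_touch) {
		/* A pending touch is consumed and this check is skipped. */
		c->watchdog_nmi_touch = false;
		return false;
	}
	return is_hardlockup(c);
}

static void touch_nmi_watchdog(struct cpu_model *c)
{
	c->watchdog_nmi_touch = true;
}

/* Models the helper proposed in the patch. */
static void touch_hardlockup_watchdog(struct cpu_model *c)
{
	c->hrtimer_interrupts_saved = 0;
}

/*
 * The CPU is stuck, so hrtimer_interrupts never advances. Count how many
 * periodic checks pass quietly before the lockup is reported.
 */
static int checks_until_report(struct cpu_model *c)
{
	int passed = 0;

	while (!nmi_check(c))
		passed++;
	return passed;
}
```

In this model a stuck CPU with a nonzero hrtimer count is reported on the first check with no touch, after one skipped check with touch_nmi_watchdog() alone, and after two skipped checks when touch_hardlockup_watchdog() is called as well. That matches the reading above: the new helper buys one more period rather than preventing the report.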

