The NMI watchdog may become unreliable at runtime. On x86 this happens
if, for instance, the HPET-based hardlockup detector is in use and the
TSC becomes unstable.

Introduce a new interface to mark the hardlockup detector as unavailable
in such cases. When doing this, update the state of
/proc/sys/kernel/nmi_watchdog to keep it consistent.

Cc: Andi Kleen <a...@linux.intel.com>
Cc: Stephane Eranian <eran...@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shan...@intel.com>
Cc: io...@lists.linux-foundation.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
---
Changes since v6:
 * Introduced this patch

Changes since v5:
 * N/A

Changes since v4
 * N/A

Changes since v3
 * N/A

Changes since v2:
 * N/A

Changes since v1:
 * N/A
---
 include/linux/nmi.h |  2 ++
 kernel/watchdog.c   | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index a38c4509f9eb..40a97139ec65 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -83,9 +83,11 @@ static inline void reset_hung_task_detector(void) { }
 
 #if defined(CONFIG_HARDLOCKUP_DETECTOR)
 extern void hardlockup_detector_disable(void);
+extern void hardlockup_detector_mark_unavailable(void);
 extern unsigned int hardlockup_panic;
 #else
 static inline void hardlockup_detector_disable(void) {}
+static inline void hardlockup_detector_mark_unavailable(void) {}
 #endif
 
 #if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 8e61f21e7e33..0e4fed6d95b9 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -47,6 +47,8 @@ static int __read_mostly nmi_watchdog_available;
 struct cpumask watchdog_cpumask __read_mostly;
 unsigned long *watchdog_cpumask_bits = cpumask_bits(&watchdog_cpumask);
 
+static void __lockup_detector_reconfigure(void);
+
 #ifdef CONFIG_HARDLOCKUP_DETECTOR
 
 # ifdef CONFIG_SMP
@@ -85,6 +87,24 @@ static int __init hardlockup_panic_setup(char *str)
 }
 __setup("nmi_watchdog=", hardlockup_panic_setup);
 
+/**
+ * hardlockup_detector_mark_unavailable - Mark the NMI watchdog as unavailable
+ *
+ * Indicate that the hardlockup detector has become unavailable. This may
+ * happen if the hardware resources that the detector uses have become
+ * unreliable.
+ */
+void hardlockup_detector_mark_unavailable(void)
+{
+	mutex_lock(&watchdog_mutex);
+
+	/* These variables can be updated without stopping the detector. */
+	nmi_watchdog_user_enabled = 0;
+	nmi_watchdog_available = false;
+
+	__lockup_detector_reconfigure();
+	mutex_unlock(&watchdog_mutex);
+}
 #endif /* CONFIG_HARDLOCKUP_DETECTOR */
 
 /*
-- 
2.25.1
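
For illustration only, the locking and state-update pattern of the new helper can be modelled in userspace as below. This is a sketch under stated assumptions, not kernel code: the pthread mutex stands in for watchdog_mutex, lockup_detector_reconfigure_stub() is a hypothetical stand-in for __lockup_detector_reconfigure() that only counts calls, and hypothetical_clocksource_unstable() is an invented caller showing where an architecture might invoke the interface.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel state that the patch touches. */
static pthread_mutex_t watchdog_mutex = PTHREAD_MUTEX_INITIALIZER;
static int nmi_watchdog_user_enabled = 1;
static bool nmi_watchdog_available = true;

/* Hypothetical counter, used only to observe the stub being called. */
static int reconfigure_calls;

/*
 * Hypothetical stub for __lockup_detector_reconfigure(). The kernel
 * function reapplies the watchdog configuration; this one only checks
 * that the caller holds the mutex and records the call.
 */
static void lockup_detector_reconfigure_stub(void)
{
	/* trylock on an already locked default mutex returns EBUSY. */
	assert(pthread_mutex_trylock(&watchdog_mutex) == EBUSY);
	reconfigure_calls++;
}

/* Same sequence as the patch: lock, clear both flags, reconfigure, unlock. */
static void hardlockup_detector_mark_unavailable(void)
{
	pthread_mutex_lock(&watchdog_mutex);

	nmi_watchdog_user_enabled = 0;
	nmi_watchdog_available = false;

	lockup_detector_reconfigure_stub();
	pthread_mutex_unlock(&watchdog_mutex);
}

/* Hypothetical caller: the timer backing the detector became unstable. */
static void hypothetical_clocksource_unstable(void)
{
	hardlockup_detector_mark_unavailable();
}
```

Clearing both variables under the mutex before reconfiguring is what keeps the value reported to userspace consistent with the detector's real state: after the call, the model reports the watchdog as both disabled and unavailable.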