On 16/04/24 at 11:17, Michael Kjörling wrote:
> Do I need to set some more settings to ensure that the system will
> automatically reboot on a panic? If so, what?

Hi,
The Linux kernel source has two configuration options that make the
kernel panic on a lockup (and then reboot, if a panic timeout is set):

    config BOOTPARAM_SOFTLOCKUP_PANIC
        bool "Panic (Reboot) On Soft Lockups"
        depends on SOFTLOCKUP_DETECTOR
        help
          Say Y here to enable the kernel to panic on "soft lockups",
          which are bugs that cause the kernel to loop in kernel
          mode for more than 20 seconds (configurable using the
          watchdog_thresh sysctl), without giving other tasks a
          chance to run.

          The panic can be used in combination with panic_timeout,
          to cause the system to reboot automatically after a
          lockup has been detected. This feature is useful for
          high-availability systems that have uptime guarantees and
          where a lockup must be resolved ASAP.

          Say N if unsure.

and:

    config BOOTPARAM_HARDLOCKUP_PANIC
        bool "Panic (Reboot) On Hard Lockups"
        depends on HARDLOCKUP_DETECTOR
        help
          Say Y here to enable the kernel to panic on "hard lockups",
          which are bugs that cause the kernel to loop in kernel
          mode with interrupts disabled for more than 10 seconds
          (configurable using the watchdog_thresh sysctl).

          Say N if unsure.

According to Documentation/admin-guide/kernel-parameters.txt, you can
also set this at boot as a kernel parameter, or at runtime via sysctl:

    softlockup_panic=
            [KNL] Should the soft-lockup detector generate panics.
            Format: 0 | 1

            A value of 1 instructs the soft-lockup detector
            to panic the machine when a soft-lockup occurs. It is
            also controlled by the kernel.softlockup_panic sysctl
            and CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC, which is the
            respective build-time switch to that functionality.

The same applies to "kernel.hardlockup_panic", which does not seem to
have its own entry in that documentation file; I found it mentioned here:

    nmi_watchdog=   [KNL,BUGS=X86] Debugging features for SMP kernels
            Format: [panic,][nopanic,][num]
            Valid num: 0 or 1
            0 - turn hardlockup detector in nmi_watchdog off
            1 - turn hardlockup detector in nmi_watchdog on
            When panic is specified, panic when an NMI watchdog
            timeout occurs (or 'nopanic' to not panic on an NMI
            watchdog, if CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is set)
            To disable both hard and soft lockup detectors,
            please see 'nowatchdog'.
            This is useful when you use a panic=... timeout and
            need the box quickly up again.

            These settings can be accessed at runtime via
            the nmi_watchdog and hardlockup_panic sysctls.

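
If you want to check what a running system is currently set to, here is
a minimal Python sketch. It assumes the sysctls are exposed as files
under /proc/sys/kernel (kernel.panic holds the panic timeout); the
function names are my own, and a file that is missing (for example when
the corresponding detector is not built in) is simply reported as not
available.

```python
from pathlib import Path

# Files under /proc/sys/kernel that correspond to the sysctls
# kernel.panic, kernel.softlockup_panic and kernel.hardlockup_panic.
SYSCTLS = ("panic", "softlockup_panic", "hardlockup_panic")


def read_panic_settings(base="/proc/sys/kernel"):
    """Return {name: int or None}; None means the file could not be read."""
    settings = {}
    for name in SYSCTLS:
        try:
            settings[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            settings[name] = None
    return settings


def describe(settings):
    """Turn the raw values into one human-readable line per sysctl."""
    lines = []
    timeout = settings.get("panic")
    if timeout is None:
        lines.append("kernel.panic: not readable")
    elif timeout == 0:
        lines.append("kernel.panic = 0: the kernel will not reboot after a panic")
    elif timeout > 0:
        lines.append(f"kernel.panic = {timeout}: reboot {timeout} s after a panic")
    else:
        lines.append(f"kernel.panic = {timeout}: reboot immediately after a panic")

    for name in ("softlockup_panic", "hardlockup_panic"):
        value = settings.get(name)
        if value is None:
            lines.append(f"kernel.{name}: not available on this kernel")
        elif value:
            lines.append(f"kernel.{name} = {value}: panic on this kind of lockup")
        else:
            lines.append(f"kernel.{name} = 0: only warn on this kind of lockup")
    return lines


if __name__ == "__main__":
    print("\n".join(describe(read_panic_settings())))
```

For an automatic reboot on a lockup you would want a non-zero
kernel.panic together with kernel.softlockup_panic = 1 and/or
kernel.hardlockup_panic = 1; these can be set with sysctl or in a file
under /etc/sysctl.d/ (the timeout value itself is up to you).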
To learn more, I suggest installing the "linux-source-6.1" package and
looking at the "Watchdog" options, which are under "Device Drivers".
The BOOTPARAM_SOFTLOCKUP_PANIC and BOOTPARAM_HARDLOCKUP_PANIC options
are under "Kernel hacking" → "Debug Oops, Lockups and Hangs".
Cheers
--
Franco Martelli