Module Name:	src
Committed By:	riastradh
Date:		Fri Jul 7 12:34:50 UTC 2023

Modified Files:
	src/sys/kern: files.kern init_main.c kern_clock.c kern_cpu.c
	src/sys/sys: cpu_data.h
Added Files:
	src/share/man/man9: heartbeat.9
	src/sys/kern: kern_heartbeat.c
	src/sys/sys: heartbeat.h

Log Message:
heartbeat(9): New mechanism to check progress of kernel.

This uses hard interrupts to check progress of low-priority soft
interrupts, and one CPU to check progress of another CPU.

If no progress has been made after a configurable number of seconds
(kern.heartbeat.max_period, default 15), then the system panics --
preferably on the CPU that is stuck so we get a stack trace in dmesg
of where it was stuck, but if the stuckness was detected by another
CPU and the stuck CPU doesn't acknowledge the request to panic within
one second, the detecting CPU panics instead.

This doesn't supplant hardware watchdog timers.  It is possible for
hard interrupts to be stuck on all CPUs for some reason too; in that
case heartbeat(9) has no opportunity to complete.

Downside: heartbeat(9) relies on hardclock to run at a reasonably
consistent rate, which might cause trouble for the glorious tickless
future.  However, it could be adapted to take a parameter for an
approximate number of units that have elapsed since the last call on
the current CPU, rather than treating that as a constant 1.

XXX kernel revbump -- changes struct cpu_info layout


To generate a diff of this commit:
cvs rdiff -u -r0 -r1.1 src/share/man/man9/heartbeat.9
cvs rdiff -u -r1.57 -r1.58 src/sys/kern/files.kern
cvs rdiff -u -r1.541 -r1.542 src/sys/kern/init_main.c
cvs rdiff -u -r1.149 -r1.150 src/sys/kern/kern_clock.c
cvs rdiff -u -r1.94 -r1.95 src/sys/kern/kern_cpu.c
cvs rdiff -u -r0 -r1.1 src/sys/kern/kern_heartbeat.c
cvs rdiff -u -r1.52 -r1.53 src/sys/sys/cpu_data.h
cvs rdiff -u -r0 -r1.1 src/sys/sys/heartbeat.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
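
For readers who want a feel for the cross-CPU check described in the log message, here is a rough userland sketch. It is not the code in kern_heartbeat.c: the names cpu_model, model_check and model_tick are hypothetical, the neighbour-checking order is an assumption, and the panic handoff to the stuck CPU is left out. Each simulated clock tick, every CPU looks at its neighbour's progress counter and reports the neighbour as stuck once the counter has stayed unchanged for max_ticks consecutive ticks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one CPU's heartbeat state (not struct cpu_info). */
struct cpu_model {
	unsigned beats;       /* bumped whenever this CPU makes progress */
	unsigned last_seen;   /* value of beats observed at the previous check */
	unsigned stale_ticks; /* consecutive ticks with no observed progress */
};

/*
 * Check one CPU on behalf of an observer.  Returns true once the CPU
 * has shown no progress for max_ticks consecutive checks.
 */
static bool
model_check(struct cpu_model *c, unsigned max_ticks)
{
	assert(max_ticks > 0);
	if (c->beats != c->last_seen) {
		/* Progress was made: remember it and restart the count. */
		c->last_seen = c->beats;
		c->stale_ticks = 0;
		return false;
	}
	return ++c->stale_ticks >= max_ticks;
}

/*
 * One simulated clock tick: CPU i checks CPU (i + 1) % ncpu.
 * Returns the index of the first CPU found stuck, or -1 if none.
 */
static int
model_tick(struct cpu_model *cpus, size_t ncpu, unsigned max_ticks)
{
	for (size_t i = 0; i < ncpu; i++) {
		size_t next = (i + 1) % ncpu;
		if (model_check(&cpus[next], max_ticks))
			return (int)next;
	}
	return -1;
}
```

In this model, max_ticks would correspond to something like kern.heartbeat.max_period multiplied by the clock tick rate; that mapping is an assumption of the sketch. Like the real mechanism, the model counts ticks rather than measuring time, so it depends on the tick arriving at a reasonably steady rate.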