Peter said in the previous discussion that the -tip tree doesn't have
panic_on_unrecovered_nmi, but it still exists. So I didn't change
anything about panic_on_unrecovered_nmi.
Thanks,
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group
> From: Hidehiro Kawai [mailto:hidehiro.kawai...@hitachi.c
nmi_shootdown_cpus(), a subroutine of crash_kexec(), sends an NMI IPI
to non-panic CPUs to stop them while saving their register
information and doing some cleanups for crash dumping. So if a
non-panic CPU is infinitely looping in NMI context, we fail to
save its register information and lose the in
If panic on NMI happens just after panic() on the same CPU, panic()
is recursively called. As a result, it stalls after failing to
acquire panic_lock.
To avoid this problem, don't call panic() in NMI context if
we've already entered panic().
V4:
- Improve comments in io_check_error() and panic
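
The guard described above (skip panic() from NMI context once panic has
already been entered) can be illustrated with a small user-space sketch.
This is not the code from the patch: panic_owner, NO_CPU and
try_enter_panic() are hypothetical names, and C11 atomics stand in for
whatever primitive the kernel uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NO_CPU (-1)

/* Hypothetical: ID of the CPU that has entered panic, or NO_CPU if none. */
static atomic_int panic_owner = NO_CPU;

/*
 * Returns true only for the first caller. Any later caller, including an
 * NMI handler interrupting the CPU that is already panicking, gets false
 * and should return instead of calling panic() again.
 */
static bool try_enter_panic(int cpu)
{
    int expected = NO_CPU;

    /* Atomically claim ownership only if nobody has claimed it yet. */
    return atomic_compare_exchange_strong(&panic_owner, &expected, cpu);
}
```

An NMI handler built on this would call try_enter_panic() first and only
proceed to the panic path when it returns true, so it never tries to
take a lock that the interrupted panic() on the same CPU already holds.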
This patch introduces a new boot option, "noextnmi", which disables
external NMIs. This option is useful for the dump capture kernel
so that an HA application or administrator doesn't mistakenly
shoot down the kernel with an NMI.
Currently, only x86 supports this option.
Signed-off-by: Hidehiro Kawai
Cc
Currently, panic() and crash_kexec() can be called at the same time.
For example (x86 case):
CPU 0:
  oops_end()
    crash_kexec()
      mutex_trylock() // acquired
      nmi_shootdown_cpus() // stop other CPUs
CPU 1:
  panic()
    crash_kexec()
      mutex_trylock() // failed to acquire
sm
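
The trylock behaviour in the trace above can be reproduced in user space.
The sketch below is only an illustration under stated assumptions:
kexec_lock, crash_kexec_sketch() and the enum are hypothetical names, and
a C11 atomic_flag stands in for the kernel mutex. It shows that the
second caller does not wait for the first one; it fails to acquire the
lock and immediately continues on its own path.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the lock taken with mutex_trylock(). */
static atomic_flag kexec_lock = ATOMIC_FLAG_INIT;

enum crash_path {
    CAPTURED_DUMP, /* this caller won the lock and does the crash dump */
    FELL_THROUGH   /* this caller lost and returns to its caller */
};

static enum crash_path crash_kexec_sketch(void)
{
    /* test_and_set returns the previous value: true means already held. */
    if (atomic_flag_test_and_set(&kexec_lock))
        return FELL_THROUGH;

    /* The winner would stop the other CPUs and start the dump here. */
    return CAPTURED_DUMP;
}
```

Because the loser returns right away, whatever it does next runs
concurrently with the winner's dump work, which is the overlap the
example describes.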
When HA clustering software or an administrator detects unresponsiveness
of a host, they issue an NMI to the host to stop its current work
completely and take a crash dump. If the kernel has already panicked
or is capturing a crash dump at that time, a further NMI can cause
the crash dump to fail.
Also,
The kexec output messages lost the "kexec" prefix when Dave Young split
the kexec code. Now we use the file name as the output message prefix.
Currently, the output messages look like this:
[ 140.290795] SYSC_kexec_load: hello, world
[ 140.291534] kexec: sanity_check_segment_list: hello, world
Ideally, th
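
Kernel code usually gets a uniform message prefix by defining pr_fmt()
before the printing helpers are used. The user-space sketch below mimics
that convention; LOG_PREFIX and format_msg() are hypothetical names
introduced for illustration, and the patch itself may derive the prefix
differently.

```c
#include <stdio.h>

/* Hypothetical per-file prefix; the kernel would define this per source file. */
#define LOG_PREFIX "kexec"

/* String-literal concatenation glues the prefix on at compile time. */
#define pr_fmt(fmt) LOG_PREFIX ": " fmt

/* Format a message into buf with the prefix applied to the format string. */
#define format_msg(buf, size, fmt, ...) \
    snprintf((buf), (size), pr_fmt(fmt), __VA_ARGS__)
```

With this in place, every message formatted through format_msg() in the
file starts with "kexec: ", so individual call sites cannot forget the
prefix.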