Re: [PATCH V3 11/11] panic: Fixes the panic_print NMI backtrace setting
On 19/08/2022 19:17, Guilherme G. Piccoli wrote:
> Commit 8d470a45d1a6 ("panic: add option to dump all CPUs backtraces in
> panic_print")
> introduced a setting for the "panic_print" kernel parameter to allow
> users to request an NMI backtrace on panic. Problem is that the panic_print
> handling happens after the secondary CPUs are already disabled, hence
> this option ended up being kind of a no-op - kernel skips the NMI trace
> in idling CPUs, which is the case of offline CPUs.
>
> Fix it by checking the NMI backtrace bit in the panic_print prior to
> the CPU disabling function.
>
> Fixes: 8d470a45d1a6 ("panic: add option to dump all CPUs backtraces in
> panic_print")
> Cc: Feng Tang
> Cc: Petr Mladek
> Signed-off-by: Guilherme G. Piccoli
>
> ---
>
> V3:
> - No changes.
>
> V2:
> - new patch, there was no V1 of this one.
>
> Hi folks, thanks upfront for reviews. This is a new patch, fixing an issue
> I found in my tests, so I shoved it into this fixes series.
>
> Notice that while at it, I got rid of the "crash_kexec_post_notifiers"
> local copy in panic(). This was introduced by commit b26e27ddfd2a
> ("kexec: use core_param for crash_kexec_post_notifiers boot option"),
> but it is not clear from comments or commit message why this local copy
> is required.
>
> My understanding is that it's a mechanism to prevent some concurrency,
> in case some other CPU modifies this variable while panic() is running.
> I find it very unlikely, hence I removed it - but if people consider
> this copy needed, I can respin this patch and keep it, even providing a
> comment about that, in order to be explicit about its need.
>
> Let me know your thoughts! Cheers,
>
> Guilherme
>

Hi folks, bi-monthly ping - apologies for the noise, heh.

Is there anything suggested so we can get this fix merged in 6.2?
Any suggestions / reviews are much appreciated.

Tnx in advance,

Guilherme

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH V3 11/11] panic: Fixes the panic_print NMI backtrace setting
On 19/08/2022 19:17, Guilherme G. Piccoli wrote:
> Commit 8d470a45d1a6 ("panic: add option to dump all CPUs backtraces in
> panic_print")
> introduced a setting for the "panic_print" kernel parameter to allow
> users to request an NMI backtrace on panic. Problem is that the panic_print
> handling happens after the secondary CPUs are already disabled, hence
> this option ended up being kind of a no-op - kernel skips the NMI trace
> in idling CPUs, which is the case of offline CPUs.
>
> Fix it by checking the NMI backtrace bit in the panic_print prior to
> the CPU disabling function.
>
> Fixes: 8d470a45d1a6 ("panic: add option to dump all CPUs backtraces in
> panic_print")
> Cc: Feng Tang
> Cc: Petr Mladek
> Signed-off-by: Guilherme G. Piccoli
>
> ---
>
> V3:
> - No changes.
>
> V2:
> - new patch, there was no V1 of this one.
>
> Hi folks, thanks upfront for reviews. This is a new patch, fixing an issue
> I found in my tests, so I shoved it into this fixes series.
>
> Notice that while at it, I got rid of the "crash_kexec_post_notifiers"
> local copy in panic(). This was introduced by commit b26e27ddfd2a
> ("kexec: use core_param for crash_kexec_post_notifiers boot option"),
> but it is not clear from comments or commit message why this local copy
> is required.
>
> My understanding is that it's a mechanism to prevent some concurrency,
> in case some other CPU modifies this variable while panic() is running.
> I find it very unlikely, hence I removed it - but if people consider
> this copy needed, I can respin this patch and keep it, even providing a
> comment about that, in order to be explicit about its need.
>
> Let me know your thoughts! Cheers,
>
> Guilherme
>
>
> kernel/panic.c | 47 +++
> 1 file changed, 27 insertions(+), 20 deletions(-)
> [...]

Hi Andrew, sorry for the ping. Does the patch make sense to you?
Any comments are much appreciated!

Tnx in advance,

Guilherme
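The concern in the quoted note about the local copy of crash_kexec_post_notifiers can be pictured with a small userspace model. This is only a sketch of the guess stated above ("some other CPU modifies this variable while panic() is running"), not kernel code: post_notifiers_flag, model_crash_kexec() and the concurrent_flip parameter are hypothetical names invented for the illustration, and the real __crash_kexec() may never return when a crash kernel is loaded.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the crash_kexec_post_notifiers parameter. */
static bool post_notifiers_flag;
static int kexec_calls;

/* Hypothetical stand-in for __crash_kexec(NULL); it only counts calls. */
static void model_crash_kexec(void)
{
	kexec_calls++;
}

/*
 * Old shape of panic(): the flag is read once into a local copy, so both
 * checks agree even if the flag is changed in between.
 */
int calls_with_local_copy(bool initial, bool concurrent_flip)
{
	bool snapshot;

	post_notifiers_flag = initial;
	kexec_calls = 0;
	snapshot = post_notifiers_flag;

	if (!snapshot)
		model_crash_kexec();
	if (concurrent_flip)	/* models a write coming from another CPU */
		post_notifiers_flag = !post_notifiers_flag;
	if (snapshot)
		model_crash_kexec();
	return kexec_calls;
}

/*
 * New shape of panic(): the flag is re-read at each check, so a change in
 * between can make both checks fire, or neither.
 */
int calls_without_local_copy(bool initial, bool concurrent_flip)
{
	post_notifiers_flag = initial;
	kexec_calls = 0;

	if (!post_notifiers_flag)
		model_crash_kexec();
	if (concurrent_flip)	/* models a write coming from another CPU */
		post_notifiers_flag = !post_notifiers_flag;
	if (post_notifiers_flag)
		model_crash_kexec();
	return kexec_calls;
}
```

With no flip, both variants reach the kexec call site exactly once. With a flip in between, the local-copy variant still reaches it once, while the other reaches it twice (flag initially false) or not at all (flag initially true). Whether such a write can happen in practice during panic() is exactly the open question raised in the note.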
[PATCH V3 11/11] panic: Fixes the panic_print NMI backtrace setting
Commit 8d470a45d1a6 ("panic: add option to dump all CPUs backtraces in
panic_print") introduced a setting for the "panic_print" kernel parameter
to allow users to request an NMI backtrace on panic. Problem is that the
panic_print handling happens after the secondary CPUs are already disabled,
hence this option ended up being kind of a no-op - kernel skips the NMI
trace in idling CPUs, which is the case of offline CPUs.

Fix it by checking the NMI backtrace bit in the panic_print prior to
the CPU disabling function.

Fixes: 8d470a45d1a6 ("panic: add option to dump all CPUs backtraces in panic_print")
Cc: Feng Tang
Cc: Petr Mladek
Signed-off-by: Guilherme G. Piccoli

---

V3:
- No changes.

V2:
- new patch, there was no V1 of this one.

Hi folks, thanks upfront for reviews. This is a new patch, fixing an issue
I found in my tests, so I shoved it into this fixes series.

Notice that while at it, I got rid of the "crash_kexec_post_notifiers"
local copy in panic(). This was introduced by commit b26e27ddfd2a
("kexec: use core_param for crash_kexec_post_notifiers boot option"),
but it is not clear from comments or commit message why this local copy
is required.

My understanding is that it's a mechanism to prevent some concurrency,
in case some other CPU modifies this variable while panic() is running.
I find it very unlikely, hence I removed it - but if people consider
this copy needed, I can respin this patch and keep it, even providing a
comment about that, in order to be explicit about its need.

Let me know your thoughts! Cheers,

Guilherme


 kernel/panic.c | 47 +++
 1 file changed, 27 insertions(+), 20 deletions(-)

diff --git a/kernel/panic.c b/kernel/panic.c
index c6eb8f8db0c0..b025a8f21c17 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -180,9 +180,6 @@ static void panic_print_sys_info(bool console_flush)
 		return;
 	}
 
-	if (panic_print & PANIC_PRINT_ALL_CPU_BT)
-		trigger_all_cpu_backtrace();
-
 	if (panic_print & PANIC_PRINT_TASK_INFO)
 		show_state();
 
@@ -199,6 +196,30 @@ static void panic_print_sys_info(bool console_flush)
 		ftrace_dump(DUMP_ALL);
 }
 
+/*
+ * Helper that triggers the NMI backtrace (if set in panic_print)
+ * and then performs the secondary CPUs shutdown - we cannot have
+ * the NMI backtrace after the CPUs are off!
+ */
+static void panic_other_cpus_shutdown(void)
+{
+	if (panic_print & PANIC_PRINT_ALL_CPU_BT)
+		trigger_all_cpu_backtrace();
+
+	/*
+	 * Note that smp_send_stop() is the usual SMP shutdown function,
+	 * which unfortunately may not be hardened to work in a panic
+	 * situation. If we want to do crash dump after notifier calls
+	 * and kmsg_dump, we will need architecture dependent extra
+	 * bits in addition to stopping other CPUs, hence we rely on
+	 * crash_smp_send_stop() for that.
+	 */
+	if (!crash_kexec_post_notifiers)
+		smp_send_stop();
+	else
+		crash_smp_send_stop();
+}
+
 /**
  * panic - halt the system
  * @fmt: The text string to print
@@ -214,7 +235,6 @@ void panic(const char *fmt, ...)
 	long i, i_next = 0, len;
 	int state = 0;
 	int old_cpu, this_cpu;
-	bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers;
 
 	if (panic_on_warn) {
 		/*
@@ -289,23 +309,10 @@ void panic(const char *fmt, ...)
 	 *
 	 * Bypass the panic_cpu check and call __crash_kexec directly.
 	 */
-	if (!_crash_kexec_post_notifiers) {
+	if (!crash_kexec_post_notifiers)
 		__crash_kexec(NULL);
 
-		/*
-		 * Note smp_send_stop is the usual smp shutdown function, which
-		 * unfortunately means it may not be hardened to work in a
-		 * panic situation.
-		 */
-		smp_send_stop();
-	} else {
-		/*
-		 * If we want to do crash dump after notifier calls and
-		 * kmsg_dump, we will need architecture dependent extra
-		 * works in addition to stopping other CPUs.
-		 */
-		crash_smp_send_stop();
-	}
+	panic_other_cpus_shutdown();
 
 	/*
 	 * Run any panic handlers, including those that might need to
@@ -326,7 +333,7 @@ void panic(const char *fmt, ...)
 	 *
 	 * Bypass the panic_cpu check and call __crash_kexec directly.
 	 */
-	if (_crash_kexec_post_notifiers)
+	if (crash_kexec_post_notifiers)
 		__crash_kexec(NULL);
 
 #ifdef CONFIG_VT
-- 
2.37.2
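To make the ordering issue concrete, here is a small userspace model (not kernel code) of the two call orders. The model_* helpers, the NR_CPUS value and the 0x40 bit value are assumptions made up for this sketch; the only thing taken from the patch is the order of the two steps, and the model simplifies "the backtrace skips offline CPUs" into a plain online check.

```c
#include <stdbool.h>

#define NR_CPUS 4			/* assumed CPU count for the model */
#define PANIC_PRINT_ALL_CPU_BT 0x40u	/* assumed bit value for the model */

static bool cpu_online[NR_CPUS];

/* Bring every modelled CPU online, as it would be right before a panic. */
static void model_reset(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_online[cpu] = true;
}

/* Stand-in for trigger_all_cpu_backtrace(): counts CPUs that can answer. */
static int model_backtrace(void)
{
	int traced = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			traced++;
	return traced;
}

/* Stand-in for smp_send_stop(): only the panicking CPU (CPU 0) stays up. */
static void model_stop_other_cpus(void)
{
	for (int cpu = 1; cpu < NR_CPUS; cpu++)
		cpu_online[cpu] = false;
}

/* Order before the patch: CPUs are stopped, then panic_print is handled. */
int traced_before_patch(unsigned int panic_print)
{
	int traced = 0;

	model_reset();
	model_stop_other_cpus();
	if (panic_print & PANIC_PRINT_ALL_CPU_BT)
		traced = model_backtrace();
	return traced;
}

/* Order after the patch: backtrace first, then stop the other CPUs. */
int traced_after_patch(unsigned int panic_print)
{
	int traced = 0;

	model_reset();
	if (panic_print & PANIC_PRINT_ALL_CPU_BT)
		traced = model_backtrace();
	model_stop_other_cpus();
	return traced;
}
```

In this model, traced_before_patch(PANIC_PRINT_ALL_CPU_BT) only sees the panicking CPU, while traced_after_patch(PANIC_PRINT_ALL_CPU_BT) sees all four, which mirrors why panic_other_cpus_shutdown() checks the bit before stopping the secondary CPUs.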