Hi,

Thanks for the feedback.

(2015/07/23 17:25), Michal Hocko wrote:
> Hi,
>
> On Wed 22-07-15 11:14:21, Hidehiro Kawai wrote:
>> When an HA cluster software or administrator detects non-response
>> of a host, they issue an NMI to the host to completely stop current
>> works and take a crash dump. If the kernel has already panicked
>> or is capturing a crash dump at that time, further NMI can cause
>> a crash dump failure.
>>
>> To solve this issue, this patch set does two things:
>>
>> - Don't panic on NMI if the kernel has already panicked
>> - Introduce "noextnmi" boot option which masks external NMI at the
>>   boot time (supported only for x86)
>
> I am currently debugging the same issue for our customer. Curiously
> enough the issue happens on a Hitachi HW.

I found these issues through white-box testing and source code reading.
So they haven't happened to our customers yet, but they could.

> I haven't posted my patch for an upstream review yet because I still
> do not have a feedback but I believe your solution is unnecessarily
> too complex. Unless I am missing something the following should be enough,
> no?

Your patch solves some cases, but I don't think it covers all the
cases I want to solve. How about the following cases?

1) panic -> acquire panic_lock -> unknown NMI on this CPU -> panic
   -> failed to acquire panic_lock -> infinite loop
   ==> no one processes the kdump procedure.

2) crash_kexec w/o entering panic -> acquire kexec_mutex ->
   unknown NMI on this CPU -> panic -> crash_kexec ->
   failed to acquire kexec_mutex -> return to panic -> smp_send_stop

Even with your patch, case 2) causes an infinite loop in
try_crash_kexec, and no one processes the kdump procedure.
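
To make case 2) concrete, here is a minimal user-space sketch of the retry
loop from your patch. It is only a model under stated assumptions: the
`*_model` functions, the `enum outcome` values, and the `max_spins` bound are
hypothetical names I made up for illustration (the real loop has no bound),
and a plain flag stands in for kexec_mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of this is kernel code. */
enum outcome { KEXEC_STARTED, NO_CRASH_IMAGE, SPUN_FOREVER };

static bool kexec_mutex_held;          /* stands in for kexec_mutex */
static bool crash_image_loaded = true; /* stands in for kexec_crash_image */

/* Models mutex_trylock(): succeeds only if the lock is free. */
static bool mutex_trylock_model(void)
{
	if (kexec_mutex_held)
		return false;
	kexec_mutex_held = true;
	return true;
}

/*
 * Models the patched crash_kexec(): retry the trylock until it succeeds.
 * max_spins is an assumption added so that the model terminates.
 */
static enum outcome crash_kexec_model(unsigned long max_spins)
{
	for (unsigned long i = 0; i < max_spins; i++) {
		if (mutex_trylock_model()) {
			if (crash_image_loaded)
				return KEXEC_STARTED; /* machine_kexec() would not return */
			kexec_mutex_held = false; /* mutex_unlock() */
			return NO_CRASH_IMAGE;
		}
		/* cpu_relax() would go here in the kernel */
	}
	return SPUN_FOREVER;
}

/*
 * Case 2): crash_kexec() is entered without panic and takes the lock,
 * then an unknown NMI on the same CPU calls panic() -> crash_kexec().
 * The lock holder is the interrupted context, so it can never make
 * progress while the NMI handler spins.
 */
static enum outcome case2_nmi_on_lock_holder(unsigned long max_spins)
{
	kexec_mutex_held = false;
	bool locked = mutex_trylock_model();
	assert(locked);
	(void)locked;
	return crash_kexec_model(max_spins);
}
```

In this model, case2_nmi_on_lock_holder() always exhausts its spin budget no
matter how large it is, which corresponds to the infinite loop in the kernel.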

Regards,

> ---
> From ba6ef85d26113e720a630ea22b08efef5b70210f Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mho...@suse.cz>
> Date: Fri, 17 Jul 2015 15:17:08 +0200
> Subject: [PATCH] kexec: Never return from crash_kexec when kexex is in
>  progress
>
> We had a report when kdump kernel hasn't booted after unknown NMI has
> been delivered and unknown_nmi_panic is enabled. The NMI is triggered
> by HW and it is delivered to all CPUs at the same time. The machine has
> hundreds of CPUs and the most plausible theory is that one CPU really
> manages to kick the kexec but it cannot shut down all the CPUs because
> they are processing NMI and so cannot process an IPI. Another CPU then
> manages to call smp_send_stop from a concurrent panic and this stops the
> kexec CPU which has managed to switch to the new kernel and doesn't run
> in the NMI mode anymore.
>
> Fix this by making crash_kexec to never return if there is a kexec in
> progress. This can be done easily by relying on the fact that
> kexec_mutex will never be released for an ongoing kexec so we just have
> to loop over the try lock. The only tricky part is that
> kexec_crash_image might be not loaded when we have to return. The check
> has to be done under the lock. Extract the trylock and check into
> try_crash_kexec and make it return true only if crash kexec is disabled.
>
> Signed-off-by: Michal Hocko <mho...@suse.cz>
> ---
>  kernel/kexec.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index a785c1015e25..d61b1478167d 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1470,7 +1470,7 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
>
>  #endif /* CONFIG_KEXEC_FILE */
>
> -void crash_kexec(struct pt_regs *regs)
> +static bool try_crash_kexec(struct pt_regs *regs)
>  {
>  	/* Take the kexec_mutex here to prevent sys_kexec_load
>  	 * running on one cpu from replacing the crash kernel
> @@ -1490,7 +1490,20 @@ void crash_kexec(struct pt_regs *regs)
>  			machine_kexec(kexec_crash_image);
>  		}
>  		mutex_unlock(&kexec_mutex);
> +		return true;
>  	}
> +	return false;
> +}
> +
> +void crash_kexec(struct pt_regs *regs)
> +{
> +	/*
> +	 * Never return from this function if a kexec is in progress
> +	 * already because next steps might interfere with it.
> +	 * try_crash_kexec will never succeed in such a case.
> +	 */
> +	while (!try_crash_kexec(regs))
> +		cpu_relax();
>  }
>
>  size_t crash_get_memory_size(void)
>

-- 
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group