Re: Re: Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
Hi all, What is the current status of this bug fix patch? I think it's OK if resending Hatayama-san's patch with Ingo's. Thanks, (2015/03/25 2:04), Vivek Goyal wrote: > On Tue, Mar 24, 2015 at 05:18:14PM +0100, Ingo Molnar wrote: >> >> * Vivek Goyal wrote: >> Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' was clearly not a no-op in the default case, against expectations. >>> >>> Hi Ingo, >>> >>> I did a quick test and in default case crash_kexec() runs before >>> panic notifiers. So it does look like crash_kexec_post_notifiers is >>> a no-op in default case. >>> >>> What am I missing. >> >> Well, look at f06e5153f4ae: >> >> diff --git a/kernel/panic.c b/kernel/panic.c >> index d02fa9fef46a..62e16cef9cc2 100644 >> --- a/kernel/panic.c >> +++ b/kernel/panic.c >> @@ -32,6 +32,7 @@ static unsigned long tainted_mask; >> static int pause_on_oops; >> static int pause_on_oops_flag; >> static DEFINE_SPINLOCK(pause_on_oops_lock); >> +static bool crash_kexec_post_notifiers; >> >> int panic_timeout = CONFIG_PANIC_TIMEOUT; >> EXPORT_SYMBOL_GPL(panic_timeout); >> @@ -112,9 +113,11 @@ void panic(const char *fmt, ...) >> /* >> * If we have crashed and we have a crash kernel loaded let it handle >> * everything else. >> - * Do we want to call this before we try to display a message? >> + * If we want to run this after calling panic_notifiers, pass >> + * the "crash_kexec_post_notifiers" option to the kernel. >> */ >> -crash_kexec(NULL); >> +if (!crash_kexec_post_notifiers) >> +crash_kexec(NULL); >> >> /* >> * Note smp_send_stop is the usual smp shutdown function, which >> @@ -131,6 +134,15 @@ void panic(const char *fmt, ...) >> >> kmsg_dump(KMSG_DUMP_PANIC); >> >> +/* >> + * If you doubt kdump always works fine in any situation, >> + * "crash_kexec_post_notifiers" offers you a chance to run >> + * panic_notifiers and dumping kmsg before kdump. >> + * Note: since some panic_notifiers can make crashed kernel >> + * more unstable, it can increase risks of the kdump failure too. >> + */ >> +crash_kexec(NULL); >> + >> bust_spinlocks(0); >> >> if (!panic_blink) >> >> >> Without knowing what crash_kexec() does, the patch looks buggy: it >> should preserve the old behavior by default, yet it will now execute a >> second crash_kexec() after the kmsg_dump() line. >> >> So the invariant change would have been to do: >> >> -crash_kexec(NULL); >> +if (!crash_kexec_post_notifiers) >> +crash_kexec(NULL); >> >> ... >> >> +if (crash_kexec_post_notifiers) >> +crash_kexec(NULL); >> >> Which in the !crash_kexec_post_notifiers flag case reduces to: >> >> crash_kexec(); >> >> ... >> >> /* NOP */ >> >> I.e. to exactly what the kernel was doing without the patch >> originally. >> >> Which is what my patch does. Nothing more, nothing less. > > Ok, I got it what you mean. > > crash_kexec() does not return if a kdump kernel is loaded. If kdump > kernel is not loaded, then crash_kexec() returns without doing anything. > > I think that explains why not making second call to crash_kexec() under > if, did not create problems. In first case it will never be called and > in second case, it will do nothing and simply return back. > > But anyway, we need your patch as that's right thing to do. > > Thanks > Vivek > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > > -- Hidehiro Kawai Hitachi, Ltd. Research & Development Group -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
Hello all, (2015/03/24 23:32), Vivek Goyal wrote: > On Tue, Mar 24, 2015 at 05:27:10AM -0500, Eric W. Biederman wrote: >> Ingo Molnar writes: >> >>> * Masami Hiramatsu wrote: >>> > > f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option > for kdump after panic_notifers") > > Was that crash_kexec() was called unconditionally after notifiers were > called, which should be fixed via the simple patch below (untested). > Looks much simpler than your fix. No, Daisuke's patch is not for that case. [...] >>> >>> Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' was >>> clearly not a no-op in the default case, against expectations. >>> >>> So the first step should be to restore the original behavior (my >>> patch), then should any new tweaks be added. >> >> Honestly I think the proper fix is to simply revert f06e5153f4ae. >> >> It was clearly not properly tested by the people who wanted it because >> they came back quite a while later with additional bleh. >> >> I think this pretty much counts as hitting the code doesn't work let's >> remove it threshold. > > IMHO, we should give users flexibility of running panic notifiers before > crash_kexec(). Different people have been asking for it since last 7-8 > years and it is a pretty small code in kernel so no major maintenance > headache. > > Agreed that this might be very unreliable, but if users want to shoot > themseleves in the foot, it is their choice. This will not be upstream > default and I am hoping that distributions don't make it their default > either. We are going to use panic notifier to write SEL record, and actually it seems to be unreliable. At least I found two problems in IPMI driver code while testing Hatayama-san's patch, and they will cause an infinite loop. I think users wouldn't notice this bug because most of users use kdump and there is no difference on display between the infinite loop case and successful case. Anyway, we need to harden panic notifier callee. I will post bug fix patches for IPMI driver ASAP. Best regards, Hidehiro Kawai Hitachi, Yokohama Research Laboratory -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
On Tue, Mar 24, 2015 at 05:18:14PM +0100, Ingo Molnar wrote: > > * Vivek Goyal wrote: > > > > Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' > > > was clearly not a no-op in the default case, against expectations. > > > > Hi Ingo, > > > > I did a quick test and in default case crash_kexec() runs before > > panic notifiers. So it does look like crash_kexec_post_notifiers is > > a no-op in default case. > > > > What am I missing. > > Well, look at f06e5153f4ae: > > diff --git a/kernel/panic.c b/kernel/panic.c > index d02fa9fef46a..62e16cef9cc2 100644 > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -32,6 +32,7 @@ static unsigned long tainted_mask; > static int pause_on_oops; > static int pause_on_oops_flag; > static DEFINE_SPINLOCK(pause_on_oops_lock); > +static bool crash_kexec_post_notifiers; > > int panic_timeout = CONFIG_PANIC_TIMEOUT; > EXPORT_SYMBOL_GPL(panic_timeout); > @@ -112,9 +113,11 @@ void panic(const char *fmt, ...) > /* >* If we have crashed and we have a crash kernel loaded let it handle >* everything else. > - * Do we want to call this before we try to display a message? > + * If we want to run this after calling panic_notifiers, pass > + * the "crash_kexec_post_notifiers" option to the kernel. >*/ > - crash_kexec(NULL); > + if (!crash_kexec_post_notifiers) > + crash_kexec(NULL); > > /* >* Note smp_send_stop is the usual smp shutdown function, which > @@ -131,6 +134,15 @@ void panic(const char *fmt, ...) > > kmsg_dump(KMSG_DUMP_PANIC); > > + /* > + * If you doubt kdump always works fine in any situation, > + * "crash_kexec_post_notifiers" offers you a chance to run > + * panic_notifiers and dumping kmsg before kdump. > + * Note: since some panic_notifiers can make crashed kernel > + * more unstable, it can increase risks of the kdump failure too. > + */ > + crash_kexec(NULL); > + > bust_spinlocks(0); > > if (!panic_blink) > > > Without knowing what crash_kexec() does, the patch looks buggy: it > should preserve the old behavior by default, yet it will now execute a > second crash_kexec() after the kmsg_dump() line. > > So the invariant change would have been to do: > > - crash_kexec(NULL); > + if (!crash_kexec_post_notifiers) > + crash_kexec(NULL); > > ... > > + if (crash_kexec_post_notifiers) > + crash_kexec(NULL); > > Which in the !crash_kexec_post_notifiers flag case reduces to: > > crash_kexec(); > > ... > > /* NOP */ > > I.e. to exactly what the kernel was doing without the patch > originally. > > Which is what my patch does. Nothing more, nothing less. Ok, I got it what you mean. crash_kexec() does not return if a kdump kernel is loaded. If kdump kernel is not loaded, then crash_kexec() returns without doing anything. I think that explains why not making second call to crash_kexec() under if, did not create problems. In first case it will never be called and in second case, it will do nothing and simply return back. But anyway, we need your patch as that's right thing to do. Thanks Vivek -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
* Vivek Goyal wrote: > > Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' > > was clearly not a no-op in the default case, against expectations. > > Hi Ingo, > > I did a quick test and in default case crash_kexec() runs before > panic notifiers. So it does look like crash_kexec_post_notifiers is > a no-op in default case. > > What am I missing. Well, look at f06e5153f4ae: diff --git a/kernel/panic.c b/kernel/panic.c index d02fa9fef46a..62e16cef9cc2 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -32,6 +32,7 @@ static unsigned long tainted_mask; static int pause_on_oops; static int pause_on_oops_flag; static DEFINE_SPINLOCK(pause_on_oops_lock); +static bool crash_kexec_post_notifiers; int panic_timeout = CONFIG_PANIC_TIMEOUT; EXPORT_SYMBOL_GPL(panic_timeout); @@ -112,9 +113,11 @@ void panic(const char *fmt, ...) /* * If we have crashed and we have a crash kernel loaded let it handle * everything else. -* Do we want to call this before we try to display a message? +* If we want to run this after calling panic_notifiers, pass +* the "crash_kexec_post_notifiers" option to the kernel. */ - crash_kexec(NULL); + if (!crash_kexec_post_notifiers) + crash_kexec(NULL); /* * Note smp_send_stop is the usual smp shutdown function, which @@ -131,6 +134,15 @@ void panic(const char *fmt, ...) kmsg_dump(KMSG_DUMP_PANIC); + /* +* If you doubt kdump always works fine in any situation, +* "crash_kexec_post_notifiers" offers you a chance to run +* panic_notifiers and dumping kmsg before kdump. +* Note: since some panic_notifiers can make crashed kernel +* more unstable, it can increase risks of the kdump failure too. +*/ + crash_kexec(NULL); + bust_spinlocks(0); if (!panic_blink) Without knowing what crash_kexec() does, the patch looks buggy: it should preserve the old behavior by default, yet it will now execute a second crash_kexec() after the kmsg_dump() line. So the invariant change would have been to do: - crash_kexec(NULL); + if (!crash_kexec_post_notifiers) + crash_kexec(NULL); ... + if (crash_kexec_post_notifiers) + crash_kexec(NULL); Which in the !crash_kexec_post_notifiers flag case reduces to: crash_kexec(); ... /* NOP */ I.e. to exactly what the kernel was doing without the patch originally. Which is what my patch does. Nothing more, nothing less. There might be other bugs with the patch, I didn't consider that. A revert would be fine as well. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
On Tue, Mar 24, 2015 at 08:11:29AM +0100, Ingo Molnar wrote: > > * Masami Hiramatsu wrote: > > > (2015/03/23 16:19), Ingo Molnar wrote: > > > > > > * Baoquan He wrote: > > > > > >> CC more people ... > > >> > > >> On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote: > > >>> The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced > > >>> "crash_kexec_post_notifiers" kernel boot option, which toggles > > >>> wheather panic() calls crash_kexec() before panic_notifiers and dump > > >>> kmsg or after. > > >>> > > >>> The problem is that the commit overlooks panic_on_oops kernel boot > > >>> option. If it is enabled, crash_kexec() is called directly without > > >>> going through panic() in oops path. > > >>> > > >>> To fix this issue, this patch adds a check to > > >>> "crash_kexec_post_notifiers" in the condition of kexec_should_crash(). > > >>> > > >>> Also, put a comment in kexec_should_crash() to explain not obvious > > >>> things on this patch. > > >>> > > >>> Signed-off-by: HATAYAMA Daisuke > > >>> Acked-by: Baoquan He > > >>> Tested-by: Hidehiro Kawai > > >>> Reviewed-by: Masami Hiramatsu > > >>> --- > > >>> include/linux/kernel.h | 3 +++ > > >>> kernel/kexec.c | 11 +++ > > >>> kernel/panic.c | 2 +- > > >>> 3 files changed, 15 insertions(+), 1 deletion(-) > > > > > > This is hack upon hack, but why was this crap merged in the first > > > place? > > > > > > I see two problems just by cursory review: > > > > > > 1) > > > > > > Firstly, the real bug in: > > > > > > f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option > > > for kdump after panic_notifers") > > > > > > Was that crash_kexec() was called unconditionally after notifiers were > > > called, which should be fixed via the simple patch below (untested). > > > Looks much simpler than your fix. > > > > No, Daisuke's patch is not for that case. [...] > > Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' was > clearly not a no-op in the default case, against expectations. Hi Ingo, I did a quick test and in default case crash_kexec() runs before panic notifiers. So it does look like crash_kexec_post_notifiers is a no-op in default case. What am I missing. Thanks Vivek -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
On Tue, Mar 24, 2015 at 05:27:10AM -0500, Eric W. Biederman wrote: > Ingo Molnar writes: > > > * Masami Hiramatsu wrote: > > > >> > > >> > f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option > >> > for kdump after panic_notifers") > >> > > >> > Was that crash_kexec() was called unconditionally after notifiers were > >> > called, which should be fixed via the simple patch below (untested). > >> > Looks much simpler than your fix. > >> > >> No, Daisuke's patch is not for that case. [...] > > > > Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' was > > clearly not a no-op in the default case, against expectations. > > > > So the first step should be to restore the original behavior (my > > patch), then should any new tweaks be added. > > Honestly I think the proper fix is to simply revert f06e5153f4ae. > > It was clearly not properly tested by the people who wanted it because > they came back quite a while later with additional bleh. > > I think this pretty much counts as hitting the code doesn't work let's > remove it threshold. IMHO, we should give users flexibility of running panic notifiers before crash_kexec(). Different people have been asking for it since last 7-8 years and it is a pretty small code in kernel so no major maintenance headache. Agreed that this might be very unreliable, but if users want to shoot themseleves in the foot, it is their choice. This will not be upstream default and I am hoping that distributions don't make it their default either. Thanks Vivek -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
Ingo Molnar writes: > * Masami Hiramatsu wrote: > >> > >> > f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option >> > for kdump after panic_notifers") >> > >> > Was that crash_kexec() was called unconditionally after notifiers were >> > called, which should be fixed via the simple patch below (untested). >> > Looks much simpler than your fix. >> >> No, Daisuke's patch is not for that case. [...] > > Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' was > clearly not a no-op in the default case, against expectations. > > So the first step should be to restore the original behavior (my > patch), then should any new tweaks be added. Honestly I think the proper fix is to simply revert f06e5153f4ae. It was clearly not properly tested by the people who wanted it because they came back quite a while later with additional bleh. I think this pretty much counts as hitting the code doesn't work let's remove it threshold. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
* Masami Hiramatsu wrote: > (2015/03/23 16:19), Ingo Molnar wrote: > > > > * Baoquan He wrote: > > > >> CC more people ... > >> > >> On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote: > >>> The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced > >>> "crash_kexec_post_notifiers" kernel boot option, which toggles > >>> wheather panic() calls crash_kexec() before panic_notifiers and dump > >>> kmsg or after. > >>> > >>> The problem is that the commit overlooks panic_on_oops kernel boot > >>> option. If it is enabled, crash_kexec() is called directly without > >>> going through panic() in oops path. > >>> > >>> To fix this issue, this patch adds a check to > >>> "crash_kexec_post_notifiers" in the condition of kexec_should_crash(). > >>> > >>> Also, put a comment in kexec_should_crash() to explain not obvious > >>> things on this patch. > >>> > >>> Signed-off-by: HATAYAMA Daisuke > >>> Acked-by: Baoquan He > >>> Tested-by: Hidehiro Kawai > >>> Reviewed-by: Masami Hiramatsu > >>> --- > >>> include/linux/kernel.h | 3 +++ > >>> kernel/kexec.c | 11 +++ > >>> kernel/panic.c | 2 +- > >>> 3 files changed, 15 insertions(+), 1 deletion(-) > > > > This is hack upon hack, but why was this crap merged in the first > > place? > > > > I see two problems just by cursory review: > > > > 1) > > > > Firstly, the real bug in: > > > > f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option > > for kdump after panic_notifers") > > > > Was that crash_kexec() was called unconditionally after notifiers were > > called, which should be fixed via the simple patch below (untested). > > Looks much simpler than your fix. > > No, Daisuke's patch is not for that case. [...] Yet the actual bug is in that commit, 'crash_kexec_post_notifiers' was clearly not a no-op in the default case, against expectations. So the first step should be to restore the original behavior (my patch), then should any new tweaks be added. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
(2015/03/23 23:31), Vivek Goyal wrote: [...] Secondly, and more importantly, the whole premise of commit f06e5153f4ae is broken IMHO: "This can help rare situations where kdump fails because of unstable crashed kernel or hardware failure (memory corruption on critical data/code)" wtf? If the kernel crashed due to a kernel crash, then the kernel booting up in whatever hardware state should be able to do a clean bootup. The fix for those 'rare situations' should be to fix the real bug (for example by making hardware driver init (or deinit) sequences more robust), not to paper it over by ordering around crash-time sequences ... If it crashed due to some hardware failure, there's literally an infinite amount of failure modes that may or may not be impacted by kexec crash-time handling ordering. We don't want to put a zillion such flags into the kernel proper just to allow the perturbation of the kernel. >>> >>> I think one of the motivations behind this patch was call to kmsg_dump(). >>> Some vendors have been wanting to have the capability to save kernel logs >>> to some NVRAM before transition to second kernel happens. Their argument >>> is that kdump does not succeed all the time and if kdump does not succeed >>> then atleast they have something to work with (kernel logs retrieved >>> from pstore interface). >> >> Doesn't pstore attach itself to printk itself? AFAICS it does: >> >> fs/pstore/platform.c: register_console(&pstore_console); >> >> so the printk log leading up to and including the crash should be >> available, regardless of this patch. What am I missing? > > That's a good point. I was not aware of it. I am Ccing Don Zickus as > he has spent some time on this in the past. > > Masami, would you have thougths on this? IIRC, one reason why kmsg_dump() > was written so that one could dump kernel messages to an NVRAM. Of one > could simple register pstore as console, then how kmsg_dump() will > continue to be useful? Yes, actually, kmsg_dump and pstore can help a lot to dump the last message (even though kmsg_dump() is called only when setting crash_kexec_post_notifiers...) However, there are some machines which don't support pstore, but only IPMI. pstore(kmsg) stores messages to a local NVRAM, and IPMI stores messages to BMC(Board Management Controller)'s NVRAM (SEL: System Event Log). Some enterprise servers only have BMC, but no NVRAM. For such kind of servers, we still need to call panic_notifier to store messages via IPMI. And also, using IPMI has another secondary feature, we can notice machine failure from remote machine via IPMI over LAN by monitoring SEL :) You might want to integrate IPMI and pstore. But since IPMI SEL is very limited and very slow, those are very different. >>> Not that I agree fully with this as problem might happen while we >>> try to run panic_notifiers or kmsg_dump hooks and never transition >>> into kdump kernel. >> >> btw., this is the big problem with 'notifiers' in general: they are >> opaque with barely any semantics defined, and a source of constant >> confusion. > > Agreed. That's the reason Eric never liked the idea of letting panic > notifiers run before crash_kexec(). I see. thus I added a notice on documentation. Note that this also increases risks of kdump failure, because some panic notifiers can make the crashed kernel more unstable. I personally don't recommend to use this in usual situation. Only for the machines which is very well configured and tested, this feature can be enabled. >>> And it has been literally years since some developers have been >>> pushing for allowing to run panic notifiers before crash_kexec(). >>> Eric Biederman has been pushing back saying it reduces the >>> reliability of kdump operation so this is not acceptable. >> >> So what do those notifiers do? > > IIRC, two main reasons had come in the past. > > - In a cluster of nodes, people wanted to send some sort of notifications > to main server that a node has crashed and don't fence it off as it > might be saving dump. > > - And saving kernel logs to non volatile store. > > There might be more and I might not be aware about these. Hatayama and > Masami, can you shed more light on this. Yes, as I described above, we'd like to use IPMI to write the log to SEL and that also allow us to monitor the machine remotely. > > BTW, first problem we faced in our clusters too and now it has been fixed. > Basically we send notifications in second kernel in user space to master > server that this node is still saving dump so don't fence it off. Yeah, that's the usual way, I think. In some "mission-critical" use-cases, we can't relay only on the kdump stability. Thank you, -- Masami HIRAMATSU Software Platform Research Dept. Linux Technology Research Center Hitachi, Lt
Re: Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
(2015/03/23 16:19), Ingo Molnar wrote: > > * Baoquan He wrote: > >> CC more people ... >> >> On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote: >>> The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced >>> "crash_kexec_post_notifiers" kernel boot option, which toggles >>> wheather panic() calls crash_kexec() before panic_notifiers and dump >>> kmsg or after. >>> >>> The problem is that the commit overlooks panic_on_oops kernel boot >>> option. If it is enabled, crash_kexec() is called directly without >>> going through panic() in oops path. >>> >>> To fix this issue, this patch adds a check to >>> "crash_kexec_post_notifiers" in the condition of kexec_should_crash(). >>> >>> Also, put a comment in kexec_should_crash() to explain not obvious >>> things on this patch. >>> >>> Signed-off-by: HATAYAMA Daisuke >>> Acked-by: Baoquan He >>> Tested-by: Hidehiro Kawai >>> Reviewed-by: Masami Hiramatsu >>> --- >>> include/linux/kernel.h | 3 +++ >>> kernel/kexec.c | 11 +++ >>> kernel/panic.c | 2 +- >>> 3 files changed, 15 insertions(+), 1 deletion(-) > > This is hack upon hack, but why was this crap merged in the first > place? > > I see two problems just by cursory review: > > 1) > > Firstly, the real bug in: > > f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for > kdump after panic_notifers") > > Was that crash_kexec() was called unconditionally after notifiers were > called, which should be fixed via the simple patch below (untested). > Looks much simpler than your fix. No, Daisuke's patch is not for that case. Since the kdump has a special hook in kernel oops, when both of panic_on_oops and crash_kernel are set, panic() is never called. Please see oops_end@arch/x86/kernel/dumpstack.c void oops_end(unsigned long flags, struct pt_regs *regs, int signr) { if (regs && kexec_should_crash(current)) crash_kexec(regs); Of course crash_kexec() never return except failing kexec unexpectedly. Thus, kexec_should_crash should returns 0 if crash_kexec_post_notifiers is set. (Semantically, it is a bit strange that panic_on_oops doesn't call panic(), but that is another topic.) However, your patch is also needed since the first crash_kexec() can fail in panic() when crash_kexec_post_notifiers is not set. In that case, kernel tries to call notifiers and call the 2nd crash_kexec() again. Actually the 2nd one is useless. So, here is my reviewed-by. Reviewed-by: Masami Hiramatsu I'll be reply the latter part in other mail. Thank you, > > 2) > > Secondly, and more importantly, the whole premise of commit > f06e5153f4ae is broken IMHO: > > "This can help rare situations where kdump fails because of unstable > crashed kernel or hardware failure (memory corruption on critical > data/code)" > > wtf? > > If the kernel crashed due to a kernel crash, then the kernel booting > up in whatever hardware state should be able to do a clean bootup. The > fix for those 'rare situations' should be to fix the real bug (for > example by making hardware driver init (or deinit) sequences more > robust), not to paper it over by ordering around crash-time sequences > ... > > If it crashed due to some hardware failure, there's literally an > infinite amount of failure modes that may or may not be impacted by > kexec crash-time handling ordering. We don't want to put a zillion > such flags into the kernel proper just to allow the perturbation of > the kernel. > > Thanks, > > Ingo > > diff --git a/kernel/panic.c b/kernel/panic.c > index 8136ad76e5fd..774614f72cbd 100644 > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -142,7 +142,8 @@ void panic(const char *fmt, ...) >* Note: since some panic_notifiers can make crashed kernel >* more unstable, it can increase risks of the kdump failure too. >*/ > - crash_kexec(NULL); > + if (crash_kexec_post_notifiers) > + crash_kexec(NULL); > > bust_spinlocks(0); > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- Masami HIRAMATSU Software Platform Research Dept. Linux Technology Research Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu...@hitachi.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
On Mon, Mar 23, 2015 at 10:31:58AM -0400, Vivek Goyal wrote: > > > I think one of the motivations behind this patch was call to kmsg_dump(). > > > Some vendors have been wanting to have the capability to save kernel logs > > > to some NVRAM before transition to second kernel happens. Their argument > > > is that kdump does not succeed all the time and if kdump does not succeed > > > then atleast they have something to work with (kernel logs retrieved > > > from pstore interface). > > > > Doesn't pstore attach itself to printk itself? AFAICS it does: > > > > fs/pstore/platform.c: register_console(&pstore_console); > > > > so the printk log leading up to and including the crash should be > > available, regardless of this patch. What am I missing? > > That's a good point. I was not aware of it. I am Ccing Don Zickus as > he has spent some time on this in the past. Hi, I will throw my two cents in here, though I expect Daisuke to provide better info. A number of years ago when I was helping work through some of the birthing pains of the backend for pstore, we didn't have console support. I don't think it made sense for x86 either because: - lack of space in nvram (for large logs) - you could mark space for deletion, but space was only recovered on reboot - the state machine would be slow to write (though mmap might have been faster) Looking through the history of who introduced register_console, it looks like it was the ARM folks. They might have a better implementation for a backend that does not have the above limitations. I don't know. > > Masami, would you have thougths on this? IIRC, one reason why kmsg_dump() > was written so that one could dump kernel messages to an NVRAM. Of one > could simple register pstore as console, then how kmsg_dump() will > continue to be useful? > > > > > > Not that I agree fully with this as problem might happen while we > > > try to run panic_notifiers or kmsg_dump hooks and never transition > > > into kdump kernel. > > > > btw., this is the big problem with 'notifiers' in general: they are > > opaque with barely any semantics defined, and a source of constant > > confusion. > > Agreed. That's the reason Eric never liked the idea of letting panic > notifiers run before crash_kexec(). > > > > > > And it has been literally years since some developers have been > > > pushing for allowing to run panic notifiers before crash_kexec(). > > > Eric Biederman has been pushing back saying it reduces the > > > reliability of kdump operation so this is not acceptable. > > > > So what do those notifiers do? I think it was just philosophical differences. kexec on panic is a complicated path and a bunch of stuff could go wrong. As insurance in case kexec on panic did not work, some companies wanted to pre-maturely capture info, so they have _something_ to use for a debug analysis. Vivek, Matthew Garret and myself argued to provide us with failure cases and we will fix kexec on panic instead. I think the stability period for kexec on panic was so long that companies still do not trust it. Just my guess. Vivek provided examples of what folks were doing with the notifiers. > > IIRC, two main reasons had come in the past. > > - In a cluster of nodes, people wanted to send some sort of notifications > to main server that a node has crashed and don't fence it off as it > might be saving dump. > > - And saving kernel logs to non volatile store. > > There might be more and I might not be aware about these. Hatayama and > Masami, can you shed more light on this. Cheers, Don -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
On Mon, Mar 23, 2015 at 08:19:43AM +0100, Ingo Molnar wrote: > > * Baoquan He wrote: > > > CC more people ... > > > > On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote: > > > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced > > > "crash_kexec_post_notifiers" kernel boot option, which toggles > > > wheather panic() calls crash_kexec() before panic_notifiers and dump > > > kmsg or after. > > > > > > The problem is that the commit overlooks panic_on_oops kernel boot > > > option. If it is enabled, crash_kexec() is called directly without > > > going through panic() in oops path. > > > > > > To fix this issue, this patch adds a check to > > > "crash_kexec_post_notifiers" in the condition of kexec_should_crash(). > > > > > > Also, put a comment in kexec_should_crash() to explain not obvious > > > things on this patch. > > > > > > Signed-off-by: HATAYAMA Daisuke > > > Acked-by: Baoquan He > > > Tested-by: Hidehiro Kawai > > > Reviewed-by: Masami Hiramatsu > > > --- > > > include/linux/kernel.h | 3 +++ > > > kernel/kexec.c | 11 +++ > > > kernel/panic.c | 2 +- > > > 3 files changed, 15 insertions(+), 1 deletion(-) > > This is hack upon hack, but why was this crap merged in the first > place? > > I see two problems just by cursory review: > > 1) > > Firstly, the real bug in: > > f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for > kdump after panic_notifers") > > Was that crash_kexec() was called unconditionally after notifiers were > called, which should be fixed via the simple patch below (untested). > Looks much simpler than your fix. > > 2) > > Secondly, and more importantly, the whole premise of commit > f06e5153f4ae is broken IMHO: > > "This can help rare situations where kdump fails because of unstable > crashed kernel or hardware failure (memory corruption on critical > data/code)" > > wtf? > > If the kernel crashed due to a kernel crash, then the kernel booting > up in whatever hardware state should be able to do a clean bootup. The > fix for those 'rare situations' should be to fix the real bug (for > example by making hardware driver init (or deinit) sequences more > robust), not to paper it over by ordering around crash-time sequences > ... > > If it crashed due to some hardware failure, there's literally an > infinite amount of failure modes that may or may not be impacted by > kexec crash-time handling ordering. We don't want to put a zillion > such flags into the kernel proper just to allow the perturbation of > the kernel. > > Thanks, > > Ingo > I quickly tested this patch to make sure I can still transition into second kernel when crash_kexec_post_notifiers is specified on command line. I have not tried running any notifiers. So. Tested-by: Vivek Goyal Acked-by: Vivek Goyal This should be a general fix and not a replacement for the patch in question in this mail thread. Thanks Vivek > diff --git a/kernel/panic.c b/kernel/panic.c > index 8136ad76e5fd..774614f72cbd 100644 > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -142,7 +142,8 @@ void panic(const char *fmt, ...) >* Note: since some panic_notifiers can make crashed kernel >* more unstable, it can increase risks of the kdump failure too. >*/ > - crash_kexec(NULL); > + if (crash_kexec_post_notifiers) > + crash_kexec(NULL); > > bust_spinlocks(0); > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
On Mon, Mar 23, 2015 at 02:50:46PM +0100, Ingo Molnar wrote: > > * Vivek Goyal wrote: > > > On Mon, Mar 23, 2015 at 08:19:43AM +0100, Ingo Molnar wrote: > > > > > > * Baoquan He wrote: > > > > > > > CC more people ... > > > > > > > > On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote: > > > > > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced > > > > > "crash_kexec_post_notifiers" kernel boot option, which toggles > > > > > wheather panic() calls crash_kexec() before panic_notifiers and dump > > > > > kmsg or after. > > > > > > > > > > The problem is that the commit overlooks panic_on_oops kernel boot > > > > > option. If it is enabled, crash_kexec() is called directly without > > > > > going through panic() in oops path. > > > > > > > > > > To fix this issue, this patch adds a check to > > > > > "crash_kexec_post_notifiers" in the condition of kexec_should_crash(). > > > > > > > > > > Also, put a comment in kexec_should_crash() to explain not obvious > > > > > things on this patch. > > > > > > > > > > Signed-off-by: HATAYAMA Daisuke > > > > > Acked-by: Baoquan He > > > > > Tested-by: Hidehiro Kawai > > > > > Reviewed-by: Masami Hiramatsu > > > > > --- > > > > > include/linux/kernel.h | 3 +++ > > > > > kernel/kexec.c | 11 +++ > > > > > kernel/panic.c | 2 +- > > > > > 3 files changed, 15 insertions(+), 1 deletion(-) > > > > > > This is hack upon hack, but why was this crap merged in the first > > > place? > > > > > > I see two problems just by cursory review: > > > > > > 1) > > > > > > Firstly, the real bug in: > > > > > > f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option > > > for kdump after panic_notifers") > > > > > > Was that crash_kexec() was called unconditionally after notifiers were > > > called, which should be fixed via the simple patch below (untested). > > > Looks much simpler than your fix. > > > > > > > Hi Ingo, > > > > Agreed. Your patch looks good. > > In case you want that simpler fix and need my SOB: > > Signed-off-by: Ingo Molnar > > (but I have not tested it.) I will quickly test it. So this is a general fix but not a replacement for fix in this patch? Because the problem original patch is trying to fix is that crash_kexec() can be called from outside panic() too (kexec_should_crash()) and in that case panic notifiers will not be called. So this patch is just trying to delay the call to crash_kexec() to make it run much later. > > > > Secondly, and more importantly, the whole premise of commit > > > f06e5153f4ae is broken IMHO: > > > > > > "This can help rare situations where kdump fails because of unstable > > > crashed kernel or hardware failure (memory corruption on critical > > > data/code)" > > > > > > wtf? > > > > > > If the kernel crashed due to a kernel crash, then the kernel booting > > > up in whatever hardware state should be able to do a clean bootup. The > > > fix for those 'rare situations' should be to fix the real bug (for > > > example by making hardware driver init (or deinit) sequences more > > > robust), not to paper it over by ordering around crash-time sequences > > > ... > > > > > > If it crashed due to some hardware failure, there's literally an > > > infinite amount of failure modes that may or may not be impacted by > > > kexec crash-time handling ordering. We don't want to put a zillion > > > such flags into the kernel proper just to allow the perturbation of > > > the kernel. > > > > I think one of the motivations behind this patch was call to kmsg_dump(). > > Some vendors have been wanting to have the capability to save kernel logs > > to some NVRAM before transition to second kernel happens. Their argument > > is that kdump does not succeed all the time and if kdump does not succeed > > then atleast they have something to work with (kernel logs retrieved > > from pstore interface). > > Doesn't pstore attach itself to printk itself? AFAICS it does: > > fs/pstore/platform.c: register_console(&pstore_console); > > so the printk log leading up to and including the crash should be > available, regardless of this patch. What am I missing? That's a good point. I was not aware of it. I am Ccing Don Zickus as he has spent some time on this in the past. Masami, would you have thougths on this? IIRC, one reason why kmsg_dump() was written so that one could dump kernel messages to an NVRAM. Of one could simple register pstore as console, then how kmsg_dump() will continue to be useful? > > > Not that I agree fully with this as problem might happen while we > > try to run panic_notifiers or kmsg_dump hooks and never transition > > into kdump kernel. > > btw., this is the big problem with 'notifiers' in general: they are > opaque with barely any semantics defined, and a source of constant > confusion. Agreed. That's the reason Eric never liked the idea of letting panic notifiers run before crash_kexec(). > > > And it has be
Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
* Vivek Goyal wrote: > On Mon, Mar 23, 2015 at 08:19:43AM +0100, Ingo Molnar wrote: > > > > * Baoquan He wrote: > > > > > CC more people ... > > > > > > On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote: > > > > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced > > > > "crash_kexec_post_notifiers" kernel boot option, which toggles > > > > wheather panic() calls crash_kexec() before panic_notifiers and dump > > > > kmsg or after. > > > > > > > > The problem is that the commit overlooks panic_on_oops kernel boot > > > > option. If it is enabled, crash_kexec() is called directly without > > > > going through panic() in oops path. > > > > > > > > To fix this issue, this patch adds a check to > > > > "crash_kexec_post_notifiers" in the condition of kexec_should_crash(). > > > > > > > > Also, put a comment in kexec_should_crash() to explain not obvious > > > > things on this patch. > > > > > > > > Signed-off-by: HATAYAMA Daisuke > > > > Acked-by: Baoquan He > > > > Tested-by: Hidehiro Kawai > > > > Reviewed-by: Masami Hiramatsu > > > > --- > > > > include/linux/kernel.h | 3 +++ > > > > kernel/kexec.c | 11 +++ > > > > kernel/panic.c | 2 +- > > > > 3 files changed, 15 insertions(+), 1 deletion(-) > > > > This is hack upon hack, but why was this crap merged in the first > > place? > > > > I see two problems just by cursory review: > > > > 1) > > > > Firstly, the real bug in: > > > > f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option > > for kdump after panic_notifers") > > > > Was that crash_kexec() was called unconditionally after notifiers were > > called, which should be fixed via the simple patch below (untested). > > Looks much simpler than your fix. > > > > Hi Ingo, > > Agreed. Your patch looks good. In case you want that simpler fix and need my SOB: Signed-off-by: Ingo Molnar (but I have not tested it.) > > Secondly, and more importantly, the whole premise of commit > > f06e5153f4ae is broken IMHO: > > > > "This can help rare situations where kdump fails because of unstable > > crashed kernel or hardware failure (memory corruption on critical > > data/code)" > > > > wtf? > > > > If the kernel crashed due to a kernel crash, then the kernel booting > > up in whatever hardware state should be able to do a clean bootup. The > > fix for those 'rare situations' should be to fix the real bug (for > > example by making hardware driver init (or deinit) sequences more > > robust), not to paper it over by ordering around crash-time sequences > > ... > > > > If it crashed due to some hardware failure, there's literally an > > infinite amount of failure modes that may or may not be impacted by > > kexec crash-time handling ordering. We don't want to put a zillion > > such flags into the kernel proper just to allow the perturbation of > > the kernel. > > I think one of the motivations behind this patch was call to kmsg_dump(). > Some vendors have been wanting to have the capability to save kernel logs > to some NVRAM before transition to second kernel happens. Their argument > is that kdump does not succeed all the time and if kdump does not succeed > then atleast they have something to work with (kernel logs retrieved > from pstore interface). Doesn't pstore attach itself to printk itself? AFAICS it does: fs/pstore/platform.c: register_console(&pstore_console); so the printk log leading up to and including the crash should be available, regardless of this patch. What am I missing? > Not that I agree fully with this as problem might happen while we > try to run panic_notifiers or kmsg_dump hooks and never transition > into kdump kernel. btw., this is the big problem with 'notifiers' in general: they are opaque with barely any semantics defined, and a source of constant confusion. > And it has been literally years since some developers have been > pushing for allowing to run panic notifiers before crash_kexec(). > Eric Biederman has been pushing back saying it reduces the > reliability of kdump operation so this is not acceptable. So what do those notifiers do? Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
On Mon, Mar 23, 2015 at 08:19:43AM +0100, Ingo Molnar wrote: > > * Baoquan He wrote: > > > CC more people ... > > > > On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote: > > > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced > > > "crash_kexec_post_notifiers" kernel boot option, which toggles > > > wheather panic() calls crash_kexec() before panic_notifiers and dump > > > kmsg or after. > > > > > > The problem is that the commit overlooks panic_on_oops kernel boot > > > option. If it is enabled, crash_kexec() is called directly without > > > going through panic() in oops path. > > > > > > To fix this issue, this patch adds a check to > > > "crash_kexec_post_notifiers" in the condition of kexec_should_crash(). > > > > > > Also, put a comment in kexec_should_crash() to explain not obvious > > > things on this patch. > > > > > > Signed-off-by: HATAYAMA Daisuke > > > Acked-by: Baoquan He > > > Tested-by: Hidehiro Kawai > > > Reviewed-by: Masami Hiramatsu > > > --- > > > include/linux/kernel.h | 3 +++ > > > kernel/kexec.c | 11 +++ > > > kernel/panic.c | 2 +- > > > 3 files changed, 15 insertions(+), 1 deletion(-) > > This is hack upon hack, but why was this crap merged in the first > place? > > I see two problems just by cursory review: > > 1) > > Firstly, the real bug in: > > f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for > kdump after panic_notifers") > > Was that crash_kexec() was called unconditionally after notifiers were > called, which should be fixed via the simple patch below (untested). > Looks much simpler than your fix. > Hi Ingo, Agreed. Your patch looks good. > 2) > > Secondly, and more importantly, the whole premise of commit > f06e5153f4ae is broken IMHO: > > "This can help rare situations where kdump fails because of unstable > crashed kernel or hardware failure (memory corruption on critical > data/code)" > > wtf? > > If the kernel crashed due to a kernel crash, then the kernel booting > up in whatever hardware state should be able to do a clean bootup. The > fix for those 'rare situations' should be to fix the real bug (for > example by making hardware driver init (or deinit) sequences more > robust), not to paper it over by ordering around crash-time sequences > ... > > If it crashed due to some hardware failure, there's literally an > infinite amount of failure modes that may or may not be impacted by > kexec crash-time handling ordering. We don't want to put a zillion > such flags into the kernel proper just to allow the perturbation of > the kernel. I think one of the motivations behind this patch was call to kmsg_dump(). Some vendors have been wanting to have the capability to save kernel logs to some NVRAM before transition to second kernel happens. Their argument is that kdump does not succeed all the time and if kdump does not succeed then atleast they have something to work with (kernel logs retrieved from pstore interface). Not that I agree fully with this as problem might happen while we try to run panic_notifiers or kmsg_dump hooks and never transition into kdump kernel. And it has been literally years since some developers have been pushing for allowing to run panic notifiers before crash_kexec(). Eric Biederman has been pushing back saying it reduces the reliability of kdump operation so this is not acceptable. So while it is very hacky, this command line option was intorduced which allowed to override default crash_kexec() behavior and those who want to do additional things (at their own risk) before transition to second kernel, can specify this parameter. Thanks Vivek > > diff --git a/kernel/panic.c b/kernel/panic.c > index 8136ad76e5fd..774614f72cbd 100644 > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -142,7 +142,8 @@ void panic(const char *fmt, ...) >* Note: since some panic_notifiers can make crashed kernel >* more unstable, it can increase risks of the kdump failure too. >*/ > - crash_kexec(NULL); > + if (crash_kexec_post_notifiers) > + crash_kexec(NULL); > > bust_spinlocks(0); > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
* Baoquan He wrote: > CC more people ... > > On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote: > > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced > > "crash_kexec_post_notifiers" kernel boot option, which toggles > > wheather panic() calls crash_kexec() before panic_notifiers and dump > > kmsg or after. > > > > The problem is that the commit overlooks panic_on_oops kernel boot > > option. If it is enabled, crash_kexec() is called directly without > > going through panic() in oops path. > > > > To fix this issue, this patch adds a check to > > "crash_kexec_post_notifiers" in the condition of kexec_should_crash(). > > > > Also, put a comment in kexec_should_crash() to explain not obvious > > things on this patch. > > > > Signed-off-by: HATAYAMA Daisuke > > Acked-by: Baoquan He > > Tested-by: Hidehiro Kawai > > Reviewed-by: Masami Hiramatsu > > --- > > include/linux/kernel.h | 3 +++ > > kernel/kexec.c | 11 +++ > > kernel/panic.c | 2 +- > > 3 files changed, 15 insertions(+), 1 deletion(-) This is hack upon hack, but why was this crap merged in the first place? I see two problems just by cursory review: 1) Firstly, the real bug in: f06e5153f4ae ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers") Was that crash_kexec() was called unconditionally after notifiers were called, which should be fixed via the simple patch below (untested). Looks much simpler than your fix. 2) Secondly, and more importantly, the whole premise of commit f06e5153f4ae is broken IMHO: "This can help rare situations where kdump fails because of unstable crashed kernel or hardware failure (memory corruption on critical data/code)" wtf? If the kernel crashed due to a kernel crash, then the kernel booting up in whatever hardware state should be able to do a clean bootup. The fix for those 'rare situations' should be to fix the real bug (for example by making hardware driver init (or deinit) sequences more robust), not to paper it over by ordering around crash-time sequences ... If it crashed due to some hardware failure, there's literally an infinite amount of failure modes that may or may not be impacted by kexec crash-time handling ordering. We don't want to put a zillion such flags into the kernel proper just to allow the perturbation of the kernel. Thanks, Ingo diff --git a/kernel/panic.c b/kernel/panic.c index 8136ad76e5fd..774614f72cbd 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -142,7 +142,8 @@ void panic(const char *fmt, ...) * Note: since some panic_notifiers can make crashed kernel * more unstable, it can increase risks of the kdump failure too. */ - crash_kexec(NULL); + if (crash_kexec_post_notifiers) + crash_kexec(NULL); bust_spinlocks(0); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
CC more people ... On 03/07/15 at 01:31am, "Hatayama, Daisuke/畑山 大輔" wrote: > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced > "crash_kexec_post_notifiers" kernel boot option, which toggles > wheather panic() calls crash_kexec() before panic_notifiers and dump > kmsg or after. > > The problem is that the commit overlooks panic_on_oops kernel boot > option. If it is enabled, crash_kexec() is called directly without > going through panic() in oops path. > > To fix this issue, this patch adds a check to > "crash_kexec_post_notifiers" in the condition of kexec_should_crash(). > > Also, put a comment in kexec_should_crash() to explain not obvious > things on this patch. > > Signed-off-by: HATAYAMA Daisuke > Acked-by: Baoquan He > Tested-by: Hidehiro Kawai > Reviewed-by: Masami Hiramatsu > --- > include/linux/kernel.h | 3 +++ > kernel/kexec.c | 11 +++ > kernel/panic.c | 2 +- > 3 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h > index d6d630d..07483c7 100644 > --- a/include/linux/kernel.h > +++ b/include/linux/kernel.h > @@ -426,6 +426,9 @@ extern int panic_on_unrecovered_nmi; > extern int panic_on_io_nmi; > extern int panic_on_warn; > extern int sysctl_panic_on_stackoverflow; > + > +extern bool crash_kexec_post_notifiers; > + > /* > * Only to be used by arch init code. If the user over-wrote the default > * CONFIG_PANIC_TIMEOUT, honor it. > diff --git a/kernel/kexec.c b/kernel/kexec.c > index 38c25b1..5bf6077 100644 > --- a/kernel/kexec.c > +++ b/kernel/kexec.c > @@ -84,6 +84,17 @@ struct resource crashk_low_res = { > > int kexec_should_crash(struct task_struct *p) > { > + /* > + * If crash_kexec_post_notifiers is enabled, don't run > + * crash_kexec() here yet, which must be run after panic > + * notifiers in panic(). > + */ > + if (crash_kexec_post_notifiers) > + return 0; > + /* > + * There are 4 panic() calls in do_exit() path, each of which > + * calls corresponds to each of these 4 conditions. > + */ > if (in_interrupt() || !p->pid || is_global_init(p) || panic_on_oops) > return 1; > return 0; > diff --git a/kernel/panic.c b/kernel/panic.c > index 8136ad7..79ca912 100644 > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -32,7 +32,7 @@ static unsigned long tainted_mask; > static int pause_on_oops; > static int pause_on_oops_flag; > static DEFINE_SPINLOCK(pause_on_oops_lock); > -static bool crash_kexec_post_notifiers; > +bool crash_kexec_post_notifiers; > int panic_on_warn __read_mostly; > > int panic_timeout = CONFIG_PANIC_TIMEOUT; > -- > 1.9.3 > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
On Sat, Mar 07, 2015 at 01:31:01AM +0900, "Hatayama, Daisuke/畑山 大輔" wrote: > The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced > "crash_kexec_post_notifiers" kernel boot option, which toggles > wheather panic() calls crash_kexec() before panic_notifiers and dump > kmsg or after. > > The problem is that the commit overlooks panic_on_oops kernel boot > option. If it is enabled, crash_kexec() is called directly without > going through panic() in oops path. > > To fix this issue, this patch adds a check to > "crash_kexec_post_notifiers" in the condition of kexec_should_crash(). > > Also, put a comment in kexec_should_crash() to explain not obvious > things on this patch. > > Signed-off-by: HATAYAMA Daisuke > Acked-by: Baoquan He > Tested-by: Hidehiro Kawai > Reviewed-by: Masami Hiramatsu Looks good to me. Acked-by: Vivek Goyal Thanks Vivek > --- > include/linux/kernel.h | 3 +++ > kernel/kexec.c | 11 +++ > kernel/panic.c | 2 +- > 3 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h > index d6d630d..07483c7 100644 > --- a/include/linux/kernel.h > +++ b/include/linux/kernel.h > @@ -426,6 +426,9 @@ extern int panic_on_unrecovered_nmi; > extern int panic_on_io_nmi; > extern int panic_on_warn; > extern int sysctl_panic_on_stackoverflow; > + > +extern bool crash_kexec_post_notifiers; > + > /* > * Only to be used by arch init code. If the user over-wrote the default > * CONFIG_PANIC_TIMEOUT, honor it. > diff --git a/kernel/kexec.c b/kernel/kexec.c > index 38c25b1..5bf6077 100644 > --- a/kernel/kexec.c > +++ b/kernel/kexec.c > @@ -84,6 +84,17 @@ struct resource crashk_low_res = { > > int kexec_should_crash(struct task_struct *p) > { > + /* > + * If crash_kexec_post_notifiers is enabled, don't run > + * crash_kexec() here yet, which must be run after panic > + * notifiers in panic(). > + */ > + if (crash_kexec_post_notifiers) > + return 0; > + /* > + * There are 4 panic() calls in do_exit() path, each of which > + * calls corresponds to each of these 4 conditions. > + */ > if (in_interrupt() || !p->pid || is_global_init(p) || panic_on_oops) > return 1; > return 0; > diff --git a/kernel/panic.c b/kernel/panic.c > index 8136ad7..79ca912 100644 > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -32,7 +32,7 @@ static unsigned long tainted_mask; > static int pause_on_oops; > static int pause_on_oops_flag; > static DEFINE_SPINLOCK(pause_on_oops_lock); > -static bool crash_kexec_post_notifiers; > +bool crash_kexec_post_notifiers; > int panic_on_warn __read_mostly; > > int panic_timeout = CONFIG_PANIC_TIMEOUT; > -- > 1.9.3 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] kernel/panic/kexec: fix "crash_kexec_post_notifiers" option issue in oops path
The commit f06e5153f4ae2e2f3b0300f0e260e40cb7fefd45 introduced "crash_kexec_post_notifiers" kernel boot option, which toggles wheather panic() calls crash_kexec() before panic_notifiers and dump kmsg or after. The problem is that the commit overlooks panic_on_oops kernel boot option. If it is enabled, crash_kexec() is called directly without going through panic() in oops path. To fix this issue, this patch adds a check to "crash_kexec_post_notifiers" in the condition of kexec_should_crash(). Also, put a comment in kexec_should_crash() to explain not obvious things on this patch. Signed-off-by: HATAYAMA Daisuke Acked-by: Baoquan He Tested-by: Hidehiro Kawai Reviewed-by: Masami Hiramatsu --- include/linux/kernel.h | 3 +++ kernel/kexec.c | 11 +++ kernel/panic.c | 2 +- 3 files changed, 15 insertions(+), 1 deletion(-) diff --git a/include/linux/kernel.h b/include/linux/kernel.h index d6d630d..07483c7 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -426,6 +426,9 @@ extern int panic_on_unrecovered_nmi; extern int panic_on_io_nmi; extern int panic_on_warn; extern int sysctl_panic_on_stackoverflow; + +extern bool crash_kexec_post_notifiers; + /* * Only to be used by arch init code. If the user over-wrote the default * CONFIG_PANIC_TIMEOUT, honor it. diff --git a/kernel/kexec.c b/kernel/kexec.c index 38c25b1..5bf6077 100644 --- a/kernel/kexec.c +++ b/kernel/kexec.c @@ -84,6 +84,17 @@ struct resource crashk_low_res = { int kexec_should_crash(struct task_struct *p) { + /* +* If crash_kexec_post_notifiers is enabled, don't run +* crash_kexec() here yet, which must be run after panic +* notifiers in panic(). +*/ + if (crash_kexec_post_notifiers) + return 0; + /* +* There are 4 panic() calls in do_exit() path, each of which +* calls corresponds to each of these 4 conditions. +*/ if (in_interrupt() || !p->pid || is_global_init(p) || panic_on_oops) return 1; return 0; diff --git a/kernel/panic.c b/kernel/panic.c index 8136ad7..79ca912 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -32,7 +32,7 @@ static unsigned long tainted_mask; static int pause_on_oops; static int pause_on_oops_flag; static DEFINE_SPINLOCK(pause_on_oops_lock); -static bool crash_kexec_post_notifiers; +bool crash_kexec_post_notifiers; int panic_on_warn __read_mostly; int panic_timeout = CONFIG_PANIC_TIMEOUT; -- 1.9.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/