RE: [PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
> What I'm planning to do in the altera_edac notifier is: > > if (kdump_is_set) > return; Yes. That's what I think should happen. -Tony
Re: [PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
On 17/05/2022 14:02, Luck, Tony wrote: >> Tony / Dinh - can I just *skip* this notifier *if kdump* is set or else >> we run the code as-is? Does that make sense to you? > > The "skip" option sounds like it needs some special flag associated with > an entry on the notifier chain. But there are other notifier chains ... so > that > sounds messy to me. > > Just all the notifiers in priority order. If any want to take different > actions > based on kdump status, change the code. That seems more flexible than > an "all or nothing" approach by skipping. > > -Tony I guess I've expressed myself in a poor way - sorry! What I'm planning to do in the altera_edac notifier is: if (kdump_is_set) return; /* regular code */ In other words: if the kdump is set, this notifier will be effectively a nop (although it's gonna be called). Lemme know your thoughts Tony, if that makes sense. Thanks, Guilherme
RE: [PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
> Tony / Dinh - can I just *skip* this notifier *if kdump* is set or else > we run the code as-is? Does that make sense to you? The "skip" option sounds like it needs some special flag associated with an entry on the notifier chain. But there are other notifier chains ... so that sounds messy to me. Just all the notifiers in priority order. If any want to take different actions based on kdump status, change the code. That seems more flexible than an "all or nothing" approach by skipping. -Tony
Re: [PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
On 17/05/2022 11:11, Petr Mladek wrote: > [...] >>> Then notifiers could make an informed choice on whether to deep dive to >>> get all the possible details (when there is no kdump) or just skim the high >>> level stuff (to maximize chance of getting a successful kdump). >>> >>> -Tony >> >> Good idea Tony! What if I wire a kexec_crash_loaded() in the notifier? > > I like this idea. > > One small problem is that kexec_crash_loaded() has valid result > only under kexec_mutex. On the other hand, it should stay true > once loaded so that the small race window should be innocent. > >> With that, are you/Petr/Dinh OK in moving it for the info list? > > Sounds good to me. > > Best Regards, > Petr Perfect, I'll do that for V2 then =) Tony / Dinh - can I just *skip* this notifier *if kdump* is set or else we run the code as-is? Does that make sense to you? I'll postpone it to run almost in the end of info list (last position is for panic_print). Thanks, Guilherme
Re: [PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
On 16/05/2022 13:18, Luck, Tony wrote: >> [...] > Would it be possible to have some global "kdump is configured + enabled" flag? > > Then notifiers could make an informed choice on whether to deep dive to > get all the possible details (when there is no kdump) or just skim the high > level stuff (to maximize chance of getting a successful kdump). > > -Tony Good idea Tony! What if I wire a kexec_crash_loaded() in the notifier? With that, are you/Petr/Dinh OK in moving it for the info list? Cheers, Guilherme
RE: [PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
> So, my reasoning here is: this notifier should fit the info list, > definitely! But...it's very high risk for kdump. It deep dives into the > regmap API (there are locks in such code) plus there is an (MM)IO write > to the device and an ARM firmware call. So, despite the nature of this > notifier _fits the informational list_, the _code is risky_ so we should > avoid running it before a kdump. > > Now, we indeed have a chicken/egg problem: want to avoid it before > kdump, BUT in case kdump is not set, kmsg_dump() (and console flushing, > after your suggestion Petr) will run before it and not save collected > information from EDAC PoV. Would it be possible to have some global "kdump is configured + enabled" flag? Then notifiers could make an informed choice on whether to deep dive to get all the possible details (when there is no kdump) or just skim the high level stuff (to maximize chance of getting a successful kdump). -Tony
Re: [PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
Thanks again for the review! Comments inline below: On 16/05/2022 11:33, Petr Mladek wrote: > [...] >> --- a/drivers/edac/altera_edac.c >> +++ b/drivers/edac/altera_edac.c >> @@ -2163,7 +2162,7 @@ static int altr_edac_a10_probe(struct platform_device >> *pdev) >> int dberror, err_addr; >> >> edac->panic_notifier.notifier_call = s10_edac_dberr_handler; >> -atomic_notifier_chain_register(&panic_notifier_list, >> +atomic_notifier_chain_register(&panic_pre_reboot_list, > > My understanding is that this notifier first prints info about ECC > errors and then triggers reboot. It might make sense to split it > into two notifiers. I disagree here - looping the maintainers for comments (CCing Dinh / Tony). BTW, sorry for not having you on CC already Dinh, it was my mistake. So, my reasoning here is: this notifier should fit the info list, definitely! But...it's very high risk for kdump. It deep dives into the regmap API (there are locks in such code) plus there is an (MM)IO write to the device and an ARM firmware call. So, despite the nature of this notifier _fits the informational list_, the _code is risky_ so we should avoid running it before a kdump. Now, we indeed have a chicken/egg problem: want to avoid it before kdump, BUT in case kdump is not set, kmsg_dump() (and console flushing, after your suggestion Petr) will run before it and not save collected information from EDAC PoV. My idea: I could call a second kmsg_dump() or at least a panic console flush for within such notifier. Let me know what you think Petr (also Dinh / Tony and all interested parties). > [...] >> --- a/drivers/leds/trigger/ledtrig-panic.c >> +++ b/drivers/leds/trigger/ledtrig-panic.c >> @@ -64,7 +63,7 @@ static long led_panic_blink(int state) >> >> static int __init ledtrig_panic_init(void) >> { >> -atomic_notifier_chain_register(&panic_notifier_list, >> +atomic_notifier_chain_register(&panic_pre_reboot_list, >> &led_trigger_panic_nb); > > Blinking => should go to the last "post_reboot/loop" list. > [...] >> --- a/drivers/misc/ibmasm/heartbeat.c >> +++ b/drivers/misc/ibmasm/heartbeat.c >> @@ -32,20 +31,23 @@ static int suspend_heartbeats = 0; >> static int panic_happened(struct notifier_block *n, unsigned long val, void >> *v) >> { >> suspend_heartbeats = 1; >> -return 0; >> +return NOTIFY_DONE; >> } >> >> -static struct notifier_block panic_notifier = { panic_happened, NULL, 1 }; >> +static struct notifier_block panic_notifier = { >> +.notifier_call = panic_happened, >> +}; >> >> void ibmasm_register_panic_notifier(void) >> { >> -atomic_notifier_chain_register(&panic_notifier_list, &panic_notifier); >> +atomic_notifier_chain_register(&panic_pre_reboot_list, >> +&panic_notifier); > > Same here. Blinking => should go to the last "post_reboot/loop" list. Ack on both. IBMasm is not blinking IIUC, but still fits properly the loop list. This notifier would make a heartbeat mechanism stop, and once it's stopped, service processor is allowed to reboot - that's my understanding. Cheers, Guilherme
Re: [PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
On 29/04/2022 13:04, Max Filippov wrote: > [...] >> arch/xtensa/platforms/iss/setup.c | 4 ++--For xtensa: > > For xtensa: > Acked-by: Max Filippov > Perfect, thanks Max! Cheers, Guilherme
Re: [PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
On Wed, Apr 27, 2022 at 3:55 PM Guilherme G. Piccoli wrote: > > This patch renames the panic_notifier_list to panic_pre_reboot_list; > the idea is that a subsequent patch will refactor the panic path > in order to better split the notifiers, running some of them very > early, some of them not so early [but still before kmsg_dump()] and > finally, the rest should execute late, after kdump. The latter ones > are now in the panic pre-reboot list - the name comes from the idea > that these notifiers execute before panic() attempts rebooting the > machine (if that option is set). > > We also took the opportunity to clean-up useless header inclusions, > improve some notifier block declarations (e.g. in ibmasm/heartbeat.c) > and more important, change some priorities - we hereby set 2 notifiers > to run late in the list [iss_panic_event() and the IPMI panic_event()] > due to the risks they offer (may not return, for example). > Proper documentation is going to be provided in a subsequent patch, > that effectively refactors the panic path. [...] > arch/xtensa/platforms/iss/setup.c | 4 ++--For xtensa: For xtensa: Acked-by: Max Filippov -- Thanks. -- Max
Re: [PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
On 28/04/2022 13:26, Corey Minyard wrote: > [...] > > For the IPMI portion: > > Acked-by: Corey Minyard Thanks Alex and Corey for the ACKs! > > Note that the IPMI panic_event() should always return, but it may take > some time, especially if the IPMI controller is no longer functional. > So the risk of a long delay is there and it makes sense to move it very > late. > Thanks, I agree - the patch moves it to the (latest - 1) position, since some arch code might run as the latest and effectively stops the machine. Cheers, Guilherme
Re: [PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
On Wed, Apr 27, 2022 at 07:49:15PM -0300, Guilherme G. Piccoli wrote: > This patch renames the panic_notifier_list to panic_pre_reboot_list; > the idea is that a subsequent patch will refactor the panic path > in order to better split the notifiers, running some of them very > early, some of them not so early [but still before kmsg_dump()] and > finally, the rest should execute late, after kdump. The latter ones > are now in the panic pre-reboot list - the name comes from the idea > that these notifiers execute before panic() attempts rebooting the > machine (if that option is set). > > We also took the opportunity to clean-up useless header inclusions, > improve some notifier block declarations (e.g. in ibmasm/heartbeat.c) > and more important, change some priorities - we hereby set 2 notifiers > to run late in the list [iss_panic_event() and the IPMI panic_event()] > due to the risks they offer (may not return, for example). > Proper documentation is going to be provided in a subsequent patch, > that effectively refactors the panic path. For the IPMI portion: Acked-by: Corey Minyard Note that the IPMI panic_event() should always return, but it may take some time, especially if the IPMI controller is no longer functional. So the risk of a long delay is there and it makes sense to move it very late. -corey > > Cc: Alex Elder > Cc: Alexander Gordeev > Cc: Anton Ivanov > Cc: Benjamin Herrenschmidt > Cc: Bjorn Andersson > Cc: Boris Ostrovsky > Cc: Chris Zankel > Cc: Christian Borntraeger > Cc: Corey Minyard > Cc: Dexuan Cui > Cc: "H. Peter Anvin" > Cc: Haiyang Zhang > Cc: Heiko Carstens > Cc: Helge Deller > Cc: Ivan Kokshaysky > Cc: "James E.J. Bottomley" > Cc: James Morse > Cc: Johannes Berg > Cc: Juergen Gross > Cc: "K. Y. Srinivasan" > Cc: Mathieu Poirier > Cc: Matt Turner > Cc: Mauro Carvalho Chehab > Cc: Max Filippov > Cc: Michael Ellerman > Cc: Paul Mackerras > Cc: Pavel Machek > Cc: Richard Henderson > Cc: Richard Weinberger > Cc: Robert Richter > Cc: Stefano Stabellini > Cc: Stephen Hemminger > Cc: Sven Schnelle > Cc: Tony Luck > Cc: Vasily Gorbik > Cc: Wei Liu > Signed-off-by: Guilherme G. Piccoli > --- > > Notice that, with this name change, out-of-tree code that relies in the global > exported "panic_notifier_list" will fail to build. We could easily keep the > retro-compatibility by making the old symbol to still exist and point to the > pre_reboot list (or even, keep the old naming). > > But our design choice was to allow the breakage, making users rethink their > notifiers, adding them in the list that fits best. If that wasn't a good > decision, we're open to change it, of course. > Thanks in advance for the review! > > arch/alpha/kernel/setup.c | 4 ++-- > arch/parisc/kernel/pdc_chassis.c | 3 +-- > arch/powerpc/kernel/setup-common.c| 2 +- > arch/s390/kernel/ipl.c| 4 ++-- > arch/um/drivers/mconsole_kern.c | 2 +- > arch/um/kernel/um_arch.c | 2 +- > arch/x86/xen/enlighten.c | 2 +- > arch/xtensa/platforms/iss/setup.c | 4 ++-- > drivers/char/ipmi/ipmi_msghandler.c | 12 +++- > drivers/edac/altera_edac.c| 3 +-- > drivers/hv/vmbus_drv.c| 4 ++-- > drivers/leds/trigger/ledtrig-panic.c | 3 +-- > drivers/misc/ibmasm/heartbeat.c | 16 +--- > drivers/net/ipa/ipa_smp2p.c | 5 ++--- > drivers/parisc/power.c| 4 ++-- > drivers/remoteproc/remoteproc_core.c | 6 -- > drivers/s390/char/con3215.c | 2 +- > drivers/s390/char/con3270.c | 2 +- > drivers/s390/char/sclp_con.c | 2 +- > drivers/s390/char/sclp_vt220.c| 2 +- > drivers/staging/olpc_dcon/olpc_dcon.c | 6 -- > drivers/video/fbdev/hyperv_fb.c | 4 ++-- > include/linux/panic_notifier.h| 2 +- > kernel/panic.c| 9 - > 24 files changed, 54 insertions(+), 51 deletions(-) > > diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c > index d88bdf852753..8ace0d7113b6 100644 > --- a/arch/alpha/kernel/setup.c > +++ b/arch/alpha/kernel/setup.c > @@ -472,8 +472,8 @@ setup_arch(char **cmdline_p) > } > > /* Register a call for panic conditions. */ > - atomic_notifier_chain_register(&panic_notifier_list, > - &alpha_panic_block); > + atomic_notifier_chain_register(&panic_pre_reboot_list, > + &alpha_panic_block); > > #ifndef alpha_using_srm > /* Assume that we've booted from SRM if we haven't booted from MILO. > diff --git a/arch/parisc/kernel/pdc_chassis.c > b/arch/parisc/kernel/pdc_chassis.c > index da154406d368..0fd8d87fb4f9 100644 > --- a/arch/parisc/kernel/pdc_chassis.c > +++ b/arch/parisc/kernel/pdc_chassis.c > @@ -22,7 +22,6 @@ > #include > #include > #include > -#include > #include > #include > #include > @@ -135,7 +134,7 @@ void
Re: [PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
On 4/27/22 5:49 PM, Guilherme G. Piccoli wrote: This patch renames the panic_notifier_list to panic_pre_reboot_list; the idea is that a subsequent patch will refactor the panic path in order to better split the notifiers, running some of them very early, some of them not so early [but still before kmsg_dump()] and finally, the rest should execute late, after kdump. The latter ones are now in the panic pre-reboot list - the name comes from the idea that these notifiers execute before panic() attempts rebooting the machine (if that option is set). We also took the opportunity to clean-up useless header inclusions, improve some notifier block declarations (e.g. in ibmasm/heartbeat.c) and more important, change some priorities - we hereby set 2 notifiers to run late in the list [iss_panic_event() and the IPMI panic_event()] due to the risks they offer (may not return, for example). Proper documentation is going to be provided in a subsequent patch, that effectively refactors the panic path. Cc: Alex Elder For "drivers/net/ipa/ipa_smp2p.c": Acked-by: Alex Elder Cc: Alexander Gordeev Cc: Anton Ivanov Cc: Benjamin Herrenschmidt Cc: Bjorn Andersson Cc: Boris Ostrovsky Cc: Chris Zankel Cc: Christian Borntraeger Cc: Corey Minyard Cc: Dexuan Cui Cc: "H. Peter Anvin" Cc: Haiyang Zhang Cc: Heiko Carstens Cc: Helge Deller Cc: Ivan Kokshaysky Cc: "James E.J. Bottomley" Cc: James Morse Cc: Johannes Berg Cc: Juergen Gross Cc: "K. Y. Srinivasan" Cc: Mathieu Poirier Cc: Matt Turner Cc: Mauro Carvalho Chehab Cc: Max Filippov Cc: Michael Ellerman Cc: Paul Mackerras Cc: Pavel Machek Cc: Richard Henderson Cc: Richard Weinberger Cc: Robert Richter Cc: Stefano Stabellini Cc: Stephen Hemminger Cc: Sven Schnelle Cc: Tony Luck Cc: Vasily Gorbik Cc: Wei Liu Signed-off-by: Guilherme G. Piccoli --- . . .