Re: [PATCH] print kdump kernel loaded status in stack dump
On 01/26/18 at 01:17pm, Petr Mladek wrote: > On Fri 2018-01-26 15:37:24, Dave Young wrote: > > On 01/19/18 at 12:47pm, Dave Young wrote: > > > On 01/18/18 at 01:57pm, Steven Rostedt wrote: > > > > On Thu, 18 Jan 2018 10:02:17 -0800 > > > > Andi Kleen wrote: > > > > > > > > > Dave Young writes: > > > > > > printk("%sHardware name: %s\n", > > > > > >log_lvl, dump_stack_arch_desc_str); > > > > > > + if (kexec_crash_loaded()) > > > > > > + printk("%skdump kernel loaded\n", log_lvl); > > > > > > > > > > Oops/warnings are getting longer and longer, often scrolling away > > > > > from the screen, and if the kernel crashes backscroll does not work > > > > > anymore, so precious information is lost. > > > > > > > > > > Can you merge it with some other line? > > > > > > > > > > Just a [KDUMP] or so somewhere should be good enough. > > > > > > > > Or perhaps we should add it as a TAINT. Not all taints are bad. > > > > > > Hmm, I also thought about this before but It sounds like not match the > > > "tainted" meaning with the assumption that it is bad :( > > > > > > Maybe it would be better to do like Andi said, but print a better word > > > than "KDUMP", eg. "Kdumpable" sounds better. If this is fine I can > > > repost the patch. > > > > I have been not available recently, sorry for late about the update, > > rethinking about this, it is looks good to use "[KDUMP]". Also for > > the tainted flags, I tried but it is not what we want since kdump kernel > > can be unloaded, this is not like "tainted" which can only be added and > > it can not be removed. > > > > How about below version? > > --- > > > > It is useful to print kdump kernel loaded status in dump_stack() > > especially when panic happens so that we can differentiate > > kdump kernel early hang and a normal panic in a bug report. > > > > Signed-off-by: Dave Young > > --- > > [v1 -> v2] merge the status in other line as Andi Kleen suggested > > kernel/printk/printk.c |3 +++ > > --- linux.orig/kernel/printk/printk.c > > +++ linux/kernel/printk/printk.c > > @@ -48,6 +48,7 @@ > > #include > > #include > > #include > > +#include > > > > #include > > #include > > @@ -3118,9 +3119,11 @@ void __init dump_stack_set_arch_desc(con > > */ > > void dump_stack_print_info(const char *log_lvl) > > { > > - printk("%sCPU: %d PID: %d Comm: %.20s %s %s %.*s\n", > > + printk("%sCPU: %d PID: %d Comm: %.20s %s %s %s %.*s\n", > >log_lvl, raw_smp_processor_id(), current->pid, current->comm, > > - print_tainted(), init_utsname()->release, > > + print_tainted(), > > + kexec_crash_loaded() ? "[KDUMP]" : "", > > I am afraid that people might be confused what that exactly means. > For example, if I would wonder if it was already running the crashdump > kernel. > > And two small nits. It always prints the extra space. It does not fit > the style of the line. > > What about the following? > > printk("%sCPU: %d PID: %d Comm: %.20s %s%s %s %.*s\n", > log_lvl, raw_smp_processor_id(), current->pid, current->comm, > kexec_crash_loaded() ? "Kdump: loaded " : "", > print_tainted(), > init_utsname()->release, > (int)strcspn(init_utsname()->version, " "), > init_utsname()->version); > > It would produce something like: > > CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Not tainted 4.14.0-4-default+ #670 That should be a better version, thanks for catching the extra whitespace. Let me do a test and send it out as V2 separately. > > > Best Regards, > Petr
Re: [PATCH] print kdump kernel loaded status in stack dump
On Fri 2018-01-26 15:37:24, Dave Young wrote: > On 01/19/18 at 12:47pm, Dave Young wrote: > > On 01/18/18 at 01:57pm, Steven Rostedt wrote: > > > On Thu, 18 Jan 2018 10:02:17 -0800 > > > Andi Kleen wrote: > > > > > > > Dave Young writes: > > > > > printk("%sHardware name: %s\n", > > > > > log_lvl, dump_stack_arch_desc_str); > > > > > + if (kexec_crash_loaded()) > > > > > + printk("%skdump kernel loaded\n", log_lvl); > > > > > > > > Oops/warnings are getting longer and longer, often scrolling away > > > > from the screen, and if the kernel crashes backscroll does not work > > > > anymore, so precious information is lost. > > > > > > > > Can you merge it with some other line? > > > > > > > > Just a [KDUMP] or so somewhere should be good enough. > > > > > > Or perhaps we should add it as a TAINT. Not all taints are bad. > > > > Hmm, I also thought about this before but It sounds like not match the > > "tainted" meaning with the assumption that it is bad :( > > > > Maybe it would be better to do like Andi said, but print a better word > > than "KDUMP", eg. "Kdumpable" sounds better. If this is fine I can > > repost the patch. > > I have been not available recently, sorry for late about the update, > rethinking about this, it is looks good to use "[KDUMP]". Also for > the tainted flags, I tried but it is not what we want since kdump kernel > can be unloaded, this is not like "tainted" which can only be added and > it can not be removed. > > How about below version? > --- > > It is useful to print kdump kernel loaded status in dump_stack() > especially when panic happens so that we can differentiate > kdump kernel early hang and a normal panic in a bug report. > > Signed-off-by: Dave Young > --- > [v1 -> v2] merge the status in other line as Andi Kleen suggested > kernel/printk/printk.c |3 +++ > --- linux.orig/kernel/printk/printk.c > +++ linux/kernel/printk/printk.c > @@ -48,6 +48,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -3118,9 +3119,11 @@ void __init dump_stack_set_arch_desc(con > */ > void dump_stack_print_info(const char *log_lvl) > { > - printk("%sCPU: %d PID: %d Comm: %.20s %s %s %.*s\n", > + printk("%sCPU: %d PID: %d Comm: %.20s %s %s %s %.*s\n", > log_lvl, raw_smp_processor_id(), current->pid, current->comm, > -print_tainted(), init_utsname()->release, > +print_tainted(), > +kexec_crash_loaded() ? "[KDUMP]" : "", I am afraid that people might be confused what that exactly means. For example, if I would wonder if it was already running the crashdump kernel. And two small nits. It always prints the extra space. It does not fit the style of the line. What about the following? printk("%sCPU: %d PID: %d Comm: %.20s %s%s %s %.*s\n", log_lvl, raw_smp_processor_id(), current->pid, current->comm, kexec_crash_loaded() ? "Kdump: loaded " : "", print_tainted(), init_utsname()->release, (int)strcspn(init_utsname()->version, " "), init_utsname()->version); It would produce something like: CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Not tainted 4.14.0-4-default+ #670 Best Regards, Petr
Re: [PATCH] print kdump kernel loaded status in stack dump
On 01/19/18 at 12:47pm, Dave Young wrote: > On 01/18/18 at 01:57pm, Steven Rostedt wrote: > > On Thu, 18 Jan 2018 10:02:17 -0800 > > Andi Kleen wrote: > > > > > Dave Young writes: > > > > printk("%sHardware name: %s\n", > > > >log_lvl, dump_stack_arch_desc_str); > > > > + if (kexec_crash_loaded()) > > > > + printk("%skdump kernel loaded\n", log_lvl); > > > > > > Oops/warnings are getting longer and longer, often scrolling away > > > from the screen, and if the kernel crashes backscroll does not work > > > anymore, so precious information is lost. > > > > > > Can you merge it with some other line? > > > > > > Just a [KDUMP] or so somewhere should be good enough. > > > > Or perhaps we should add it as a TAINT. Not all taints are bad. > > Hmm, I also thought about this before but It sounds like not match the > "tainted" meaning with the assumption that it is bad :( > > Maybe it would be better to do like Andi said, but print a better word > than "KDUMP", eg. "Kdumpable" sounds better. If this is fine I can > repost the patch. I have been not available recently, sorry for late about the update, rethinking about this, it is looks good to use "[KDUMP]". Also for the tainted flags, I tried but it is not what we want since kdump kernel can be unloaded, this is not like "tainted" which can only be added and it can not be removed. How about below version? --- It is useful to print kdump kernel loaded status in dump_stack() especially when panic happens so that we can differentiate kdump kernel early hang and a normal panic in a bug report. Signed-off-by: Dave Young --- [v1 -> v2] merge the status in other line as Andi Kleen suggested kernel/printk/printk.c |3 +++ --- linux.orig/kernel/printk/printk.c +++ linux/kernel/printk/printk.c @@ -48,6 +48,7 @@ #include #include #include +#include #include #include @@ -3118,9 +3119,11 @@ void __init dump_stack_set_arch_desc(con */ void dump_stack_print_info(const char *log_lvl) { - printk("%sCPU: %d PID: %d Comm: %.20s %s %s %.*s\n", + printk("%sCPU: %d PID: %d Comm: %.20s %s %s %s %.*s\n", log_lvl, raw_smp_processor_id(), current->pid, current->comm, - print_tainted(), init_utsname()->release, + print_tainted(), + kexec_crash_loaded() ? "[KDUMP]" : "", + init_utsname()->release, (int)strcspn(init_utsname()->version, " "), init_utsname()->version);
Re: [PATCH] print kdump kernel loaded status in stack dump
On Fri, Jan 19, 2018 at 02:45:38PM +0900, Sergey Senozhatsky wrote: > On (01/18/18 10:02), Andi Kleen wrote: > > Dave Young writes: > > > printk("%sHardware name: %s\n", > > > log_lvl, dump_stack_arch_desc_str); > > > + if (kexec_crash_loaded()) > > > + printk("%skdump kernel loaded\n", log_lvl); > > > > Oops/warnings are getting longer and longer, often scrolling away > > from the screen, and if the kernel crashes backscroll does not work > > anymore, so precious information is lost. > > true. I even ended up having a console_reflush_on_panic() function. it > simply re-prints with a delay [so I can at least read the oops] logbuf > entries every once in a while, staring with the first oops_in_progress > record. It would be better to make scrollback work even after panic (e.g. with a polled keyboard driver) -Andi
Re: [PATCH] print kdump kernel loaded status in stack dump
On (01/19/18 16:42), Dave Young wrote: [..] > > [I'm not entirely sure I see why do we have printk_delay() in > >vprintk_emit()... I mean I probably can see some reasoning behind > >it, but at the same it makes sense to slow down console_unlock() > >as well] > > Looks like I am the guy who added the code :) LOL :) > Actually no special reason, just did not thinking about the performance > issue at all at that time.. it's quite reasonable to have it in vprintk_emit(), actually. the thing is that it limits the rate at which new messages appear in the kernel log buffer, which does not necessarily correspond to the rate at which new messages appear on the consoles. printk has a "direct path" printk -> console_unlock() -> consoles, and "indirect path" - when it fails to acquire console semaphore, because it's locked by someone else. that someone else, a console_sem owner, might be scheduled out under console_sem; all printks in the meantime will just log_store() messages [after printk_delay()]. once the console_sem owner will be back to running it will resume console_unlock() printing loop and print out all pending messages immediately [modulo preemption]. so there are ways for messages to bypass printk_delay() and appear on the consoles with no visible delay. additionally printk_delay() does touch_nmi_watchdog() so we, technically, are fine with moving it to console_unlock(). -ss
Re: [PATCH] print kdump kernel loaded status in stack dump
On 01/19/18 at 05:28pm, Sergey Senozhatsky wrote: > On (01/19/18 16:16), Dave Young wrote: > > On 01/19/18 at 02:45pm, Sergey Senozhatsky wrote: > > > On (01/18/18 10:02), Andi Kleen wrote: > > > > Dave Young writes: > > > > > printk("%sHardware name: %s\n", > > > > > log_lvl, dump_stack_arch_desc_str); > > > > > + if (kexec_crash_loaded()) > > > > > + printk("%skdump kernel loaded\n", log_lvl); > > > > > > > > Oops/warnings are getting longer and longer, often scrolling away > > > > from the screen, and if the kernel crashes backscroll does not work > > > > anymore, so precious information is lost. > > > > > > true. I even ended up having a console_reflush_on_panic() function. it > > > simply re-prints with a delay [so I can at least read the oops] logbuf > > > entries every once in a while, staring with the first oops_in_progress > > > record. > > > > > > > If too many messages printed on screen, then the next flush will > > still scroll up. > > right. but it re-prints Oops with a new console_unlock_delay() delay > which gives me enough time to either read it as many times as I want, > or take a picture, etc. it's not as fast as the normal oops print out. > > [I'm not entirely sure I see why do we have printk_delay() in >vprintk_emit()... I mean I probably can see some reasoning behind >it, but at the same it makes sense to slow down console_unlock() >as well] Looks like I am the guy who added the code :) Actually no special reason, just did not thinking about the performance issue at all at that time.. > > -ss
Re: [PATCH] print kdump kernel loaded status in stack dump
On (01/19/18 16:16), Dave Young wrote: > I thought about adding an option to improve printk_delay so it can > delay every n lines, eg. 25 rows. Maybe that idea works for this > issue? /* sent the message too soon */ printk_delay() has no effect there. it limits only printk()->vprintk_emit() and we don't have any new printk()-s once the system has panic-ed. it's console_unlock() that prints out the oops. -ss
Re: [PATCH] print kdump kernel loaded status in stack dump
On (01/19/18 16:16), Dave Young wrote: > On 01/19/18 at 02:45pm, Sergey Senozhatsky wrote: > > On (01/18/18 10:02), Andi Kleen wrote: > > > Dave Young writes: > > > > printk("%sHardware name: %s\n", > > > >log_lvl, dump_stack_arch_desc_str); > > > > + if (kexec_crash_loaded()) > > > > + printk("%skdump kernel loaded\n", log_lvl); > > > > > > Oops/warnings are getting longer and longer, often scrolling away > > > from the screen, and if the kernel crashes backscroll does not work > > > anymore, so precious information is lost. > > > > true. I even ended up having a console_reflush_on_panic() function. it > > simply re-prints with a delay [so I can at least read the oops] logbuf > > entries every once in a while, staring with the first oops_in_progress > > record. > > > > If too many messages printed on screen, then the next flush will > still scroll up. right. but it re-prints Oops with a new console_unlock_delay() delay which gives me enough time to either read it as many times as I want, or take a picture, etc. it's not as fast as the normal oops print out. [I'm not entirely sure I see why do we have printk_delay() in vprintk_emit()... I mean I probably can see some reasoning behind it, but at the same it makes sense to slow down console_unlock() as well] -ss
Re: [PATCH] print kdump kernel loaded status in stack dump
On 01/19/18 at 02:45pm, Sergey Senozhatsky wrote: > On (01/18/18 10:02), Andi Kleen wrote: > > Dave Young writes: > > > printk("%sHardware name: %s\n", > > > log_lvl, dump_stack_arch_desc_str); > > > + if (kexec_crash_loaded()) > > > + printk("%skdump kernel loaded\n", log_lvl); > > > > Oops/warnings are getting longer and longer, often scrolling away > > from the screen, and if the kernel crashes backscroll does not work > > anymore, so precious information is lost. > > true. I even ended up having a console_reflush_on_panic() function. it > simply re-prints with a delay [so I can at least read the oops] logbuf > entries every once in a while, staring with the first oops_in_progress > record. > If too many messages printed on screen, then the next flush will still scroll up. I thought about adding an option to improve printk_delay so it can delay every n lines, eg. 25 rows. Maybe that idea works for this issue? BTW, if kdump is used then we should avoid adding extra stuff in panic path. > something like below [it's completely hacked up, but at least gives > an idea] > > --- > > include/linux/console.h | 1 + > kernel/panic.c | 7 +++ > kernel/printk/printk.c | 39 ++- > 3 files changed, 46 insertions(+), 1 deletion(-) > > diff --git a/include/linux/console.h b/include/linux/console.h > index b8920a031a3e..502e3f539448 100644 > --- a/include/linux/console.h > +++ b/include/linux/console.h > @@ -168,6 +168,7 @@ extern void console_unlock(void); > extern void console_conditional_schedule(void); > extern void console_unblank(void); > extern void console_flush_on_panic(void); > +extern void console_reflush_on_panic(void); > extern struct tty_driver *console_device(int *); > extern void console_stop(struct console *); > extern void console_start(struct console *); > diff --git a/kernel/panic.c b/kernel/panic.c > index 2cfef408fec9..39cd59bbfaab 100644 > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -137,6 +137,7 @@ void panic(const char *fmt, ...) > va_list args; > long i, i_next = 0; > int state = 0; > + int reflush_tick = 0; > int old_cpu, this_cpu; > bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers; > > @@ -298,6 +299,12 @@ void panic(const char *fmt, ...) > i_next = i + 3600 / PANIC_BLINK_SPD; > } > mdelay(PANIC_TIMER_STEP); > + > + reflush_tick++; > + if (reflush_tick == 32) { /* don't reflush too often */ > + console_reflush_on_panic(); > + reflush_tick = 0; > + } > } > } > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index 9cb943c90d98..ef3f28d4c741 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -426,6 +426,10 @@ static u32 log_next_idx; > static u64 console_seq; > static u32 console_idx; > > +/* index and sequence number of the record which started the oops print out > */ > +static u64 log_oops_seq; > +static u32 log_oops_idx; > + > /* the next printk record to read after the last 'clear' command */ > static u64 clear_seq; > static u32 clear_idx; > @@ -1736,6 +1740,15 @@ static inline void printk_delay(void) > } > } > > +/* > + * Why do we have printk_delay() in vprintk_emit() > + * and not in console_unlock()? > + */ > +static inline void console_unlock_delay(void) > +{ > + printk_delay(); > +} > + > /* > * Continuation lines are buffered, and not committed to the record buffer > * until the line is complete, or a race forces it. The line fragments > @@ -1849,6 +1862,7 @@ asmlinkage int vprintk_emit(int facility, int level, > > /* This stops the holder of console_sem just where we want him */ > logbuf_lock_irqsave(flags); > + > /* >* The printf needs to come first; we need the syslog >* prefix which might be passed-in as a parameter. > @@ -1890,7 +1904,11 @@ asmlinkage int vprintk_emit(int facility, int level, > lflags |= LOG_PREFIX|LOG_NEWLINE; > > printed_len = log_output(facility, level, lflags, dict, dictlen, text, > text_len); > - > + /* Oops... */ > + if (oops_in_progress && !log_oops_seq) { > + log_oops_seq = log_next_seq; > + log_oops_idx = log_next_idx; > + } > logbuf_unlock_irqrestore(flags); > > /* If called from the scheduler, we can not call up(). */ > @@ -2396,6 +2414,7 @@ void console_unlock(void) > > stop_critical_timings();/* don't trace print latency */ > call_console_drivers(ext_text, ext_len, text, len); > + console_unlock_delay(); > start_critical_timings(); > > if (console_lock_spinning_disable_and_check()) { > @@ -2495,6 +2514,24 @@ void console_flush_on_panic(void) > console_unlock(); > } > > +/** > + * console_reflush_on_pan
Re: [PATCH] print kdump kernel loaded status in stack dump
On (01/18/18 10:02), Andi Kleen wrote: > Dave Young writes: > > printk("%sHardware name: %s\n", > >log_lvl, dump_stack_arch_desc_str); > > + if (kexec_crash_loaded()) > > + printk("%skdump kernel loaded\n", log_lvl); > > Oops/warnings are getting longer and longer, often scrolling away > from the screen, and if the kernel crashes backscroll does not work > anymore, so precious information is lost. true. I even ended up having a console_reflush_on_panic() function. it simply re-prints with a delay [so I can at least read the oops] logbuf entries every once in a while, staring with the first oops_in_progress record. something like below [it's completely hacked up, but at least gives an idea] --- include/linux/console.h | 1 + kernel/panic.c | 7 +++ kernel/printk/printk.c | 39 ++- 3 files changed, 46 insertions(+), 1 deletion(-) diff --git a/include/linux/console.h b/include/linux/console.h index b8920a031a3e..502e3f539448 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -168,6 +168,7 @@ extern void console_unlock(void); extern void console_conditional_schedule(void); extern void console_unblank(void); extern void console_flush_on_panic(void); +extern void console_reflush_on_panic(void); extern struct tty_driver *console_device(int *); extern void console_stop(struct console *); extern void console_start(struct console *); diff --git a/kernel/panic.c b/kernel/panic.c index 2cfef408fec9..39cd59bbfaab 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -137,6 +137,7 @@ void panic(const char *fmt, ...) va_list args; long i, i_next = 0; int state = 0; + int reflush_tick = 0; int old_cpu, this_cpu; bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers; @@ -298,6 +299,12 @@ void panic(const char *fmt, ...) i_next = i + 3600 / PANIC_BLINK_SPD; } mdelay(PANIC_TIMER_STEP); + + reflush_tick++; + if (reflush_tick == 32) { /* don't reflush too often */ + console_reflush_on_panic(); + reflush_tick = 0; + } } } diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 9cb943c90d98..ef3f28d4c741 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -426,6 +426,10 @@ static u32 log_next_idx; static u64 console_seq; static u32 console_idx; +/* index and sequence number of the record which started the oops print out */ +static u64 log_oops_seq; +static u32 log_oops_idx; + /* the next printk record to read after the last 'clear' command */ static u64 clear_seq; static u32 clear_idx; @@ -1736,6 +1740,15 @@ static inline void printk_delay(void) } } +/* + * Why do we have printk_delay() in vprintk_emit() + * and not in console_unlock()? + */ +static inline void console_unlock_delay(void) +{ + printk_delay(); +} + /* * Continuation lines are buffered, and not committed to the record buffer * until the line is complete, or a race forces it. The line fragments @@ -1849,6 +1862,7 @@ asmlinkage int vprintk_emit(int facility, int level, /* This stops the holder of console_sem just where we want him */ logbuf_lock_irqsave(flags); + /* * The printf needs to come first; we need the syslog * prefix which might be passed-in as a parameter. @@ -1890,7 +1904,11 @@ asmlinkage int vprintk_emit(int facility, int level, lflags |= LOG_PREFIX|LOG_NEWLINE; printed_len = log_output(facility, level, lflags, dict, dictlen, text, text_len); - + /* Oops... */ + if (oops_in_progress && !log_oops_seq) { + log_oops_seq = log_next_seq; + log_oops_idx = log_next_idx; + } logbuf_unlock_irqrestore(flags); /* If called from the scheduler, we can not call up(). */ @@ -2396,6 +2414,7 @@ void console_unlock(void) stop_critical_timings();/* don't trace print latency */ call_console_drivers(ext_text, ext_len, text, len); + console_unlock_delay(); start_critical_timings(); if (console_lock_spinning_disable_and_check()) { @@ -2495,6 +2514,24 @@ void console_flush_on_panic(void) console_unlock(); } +/** + * console_reflush_on_panic - re-flush console content starting from the + * first oops_in_progress record + */ +void console_reflush_on_panic(void) +{ + unsigned long flags; + + logbuf_lock_irqsave(flags); + console_seq = log_oops_seq; + console_idx = log_oops_idx; + logbuf_unlock_irqrestore(flags); + + if (!printk_delay_msec) + printk_delay_msec = 273; /* I can't read any faster */ + console_flush_on_panic(); +} + /* * Return the console tty driver structure and its associated index */ -- 2
Re: [PATCH] print kdump kernel loaded status in stack dump
On 01/18/18 at 01:57pm, Steven Rostedt wrote: > On Thu, 18 Jan 2018 10:02:17 -0800 > Andi Kleen wrote: > > > Dave Young writes: > > > printk("%sHardware name: %s\n", > > > log_lvl, dump_stack_arch_desc_str); > > > + if (kexec_crash_loaded()) > > > + printk("%skdump kernel loaded\n", log_lvl); > > > > Oops/warnings are getting longer and longer, often scrolling away > > from the screen, and if the kernel crashes backscroll does not work > > anymore, so precious information is lost. > > > > Can you merge it with some other line? > > > > Just a [KDUMP] or so somewhere should be good enough. > > Or perhaps we should add it as a TAINT. Not all taints are bad. Hmm, I also thought about this before but It sounds like not match the "tainted" meaning with the assumption that it is bad :( Maybe it would be better to do like Andi said, but print a better word than "KDUMP", eg. "Kdumpable" sounds better. If this is fine I can repost the patch. > > -- Steve Thanks Dave
Re: [PATCH] print kdump kernel loaded status in stack dump
On Thu, 18 Jan 2018 10:02:17 -0800 Andi Kleen wrote: > Dave Young writes: > > printk("%sHardware name: %s\n", > >log_lvl, dump_stack_arch_desc_str); > > + if (kexec_crash_loaded()) > > + printk("%skdump kernel loaded\n", log_lvl); > > Oops/warnings are getting longer and longer, often scrolling away > from the screen, and if the kernel crashes backscroll does not work > anymore, so precious information is lost. > > Can you merge it with some other line? > > Just a [KDUMP] or so somewhere should be good enough. Or perhaps we should add it as a TAINT. Not all taints are bad. -- Steve
Re: [PATCH] print kdump kernel loaded status in stack dump
Dave Young writes: > printk("%sHardware name: %s\n", > log_lvl, dump_stack_arch_desc_str); > + if (kexec_crash_loaded()) > + printk("%skdump kernel loaded\n", log_lvl); Oops/warnings are getting longer and longer, often scrolling away from the screen, and if the kernel crashes backscroll does not work anymore, so precious information is lost. Can you merge it with some other line? Just a [KDUMP] or so somewhere should be good enough. -Andi
Re: [PATCH] print kdump kernel loaded status in stack dump
On 01/17/18 at 02:42pm, Petr Mladek wrote: > On Wed 2018-01-17 20:32:44, Dave Young wrote: > > Hi, > > > > Thanks for your comments. > > On 01/17/18 at 09:57am, Petr Mladek wrote: > > > On Wed 2018-01-17 12:50:57, Dave Young wrote: > > > > It is useful to print kdump kernel loaded status in dump_stack() > > > > especially when panic happens so that we can differenciate > > > > kdump kernel early hang and a normal panic in a bug report. > > > > > > > > Signed-off-by: Dave Young > > > > --- > > > > kernel/printk/printk.c |3 +++ > > > > 1 file changed, 3 insertions(+) > > > > > > > > --- linux-x86.orig/kernel/printk/printk.c > > > > +++ linux-x86/kernel/printk/printk.c > > > > @@ -48,6 +48,7 @@ > > > > #include > > > > #include > > > > #include > > > > +#include > > > > > > > > #include > > > > #include > > > > @@ -3127,6 +3128,8 @@ void dump_stack_print_info(const char *l > > > > if (dump_stack_arch_desc_str[0] != '\0') > > > > printk("%sHardware name: %s\n", > > > >log_lvl, dump_stack_arch_desc_str); > > > > + if (kexec_crash_loaded()) > > > > + printk("%skdump kernel loaded\n", log_lvl); > > > > > > IMHO, it would be better to do it like for the workqueues. > > > I mean to call printk_kexec_info(log_lv1, current) here > > > that would be impletemented in kexec sources. > > > Then it could be maintained by kexec people. > > > > > > Anyway, I wonder if the info about kexec_crash_loaded() is > > > enough. I am not much familiar with kexec. AFAIK, > > > the image might be loaded long time before it > > > is acutally used. > > > > kexec_crash_loaded is enough, we only care if kdump kernel being > > loaded or not, nothing else, no matter how long it has been loaded. > > In Fedora/RHEL a kdump service takes care of loading the kernel but > > it runs after networking is ready. If people want to save > > the vmcore to nfs/ssh then we need detect network and build the > > initramfs. In the nfs/ssh case if some networking code panicked it > > is possible that kdump service has not started, but sometimes bug > > can not be easily reproduced thus nobody can know if kdump is active > > or not. > > I see. > > > Since kexec_crash_loaded() is already in kexec souce code, and it > > is the only thing need to know, do you think it is really necessary > > to add a printk_kexec_info()? I can do it if you strongly suggest > > to do so. > > No, the original approach is fine if it is really that simple ;-) > > > > > > > Finally, the style of the other lines is: > > > > > > Name: details > > > > > > I would suggest to print something like: > > > > > > Kexec: details > > > > > > , where the details might be whether the image is loaded, > > > whether the loaded kernel is being executed, and > > > other kexec-related flags. > > > > Will do, it can be something like: > > Kexec: kdump kernel loaded > > Looks good to me. With this message, I could give this > patch even > > Reviewed-by: Petr Mladek > > I could update the string when pushing into printk.git. > I am just going to wait a bit for more feedback if any. Cool, thank you! > > Best Regards, > Petr Thanks Dave
Re: [PATCH] print kdump kernel loaded status in stack dump
On Wed, 17 Jan 2018 14:42:17 +0100 Petr Mladek wrote: > > kexec_crash_loaded is enough, we only care if kdump kernel being > > loaded or not, nothing else, no matter how long it has been loaded. > > In Fedora/RHEL a kdump service takes care of loading the kernel but > > it runs after networking is ready. If people want to save > > the vmcore to nfs/ssh then we need detect network and build the > > initramfs. In the nfs/ssh case if some networking code panicked it > > is possible that kdump service has not started, but sometimes bug > > can not be easily reproduced thus nobody can know if kdump is active > > or not. > > I see. > > > Since kexec_crash_loaded() is already in kexec souce code, and it > > is the only thing need to know, do you think it is really necessary > > to add a printk_kexec_info()? I can do it if you strongly suggest > > to do so. > > No, the original approach is fine if it is really that simple ;-) I have to ask. Why is dump_stack_print_info() in printk to begin with. It seems to be an odd place to put it, and not something I would think should be in the printk() subsystem anyway. Why is it not in lib/dump_stack.c? > > > > > > > Finally, the style of the other lines is: > > > > > > Name: details > > > > > > I would suggest to print something like: > > > > > > Kexec: details > > > > > > , where the details might be whether the image is loaded, > > > whether the loaded kernel is being executed, and > > > other kexec-related flags. > > > > Will do, it can be something like: > > Kexec: kdump kernel loaded > > Looks good to me. With this message, I could give this > patch even > > Reviewed-by: Petr Mladek > > I could update the string when pushing into printk.git. > I am just going to wait a bit for more feedback if any. The patch looks fine to me. Reviewed-by: Steven Rostedt (VMware) -- Steve
Re: [PATCH] print kdump kernel loaded status in stack dump
On Wed 2018-01-17 20:32:44, Dave Young wrote: > Hi, > > Thanks for your comments. > On 01/17/18 at 09:57am, Petr Mladek wrote: > > On Wed 2018-01-17 12:50:57, Dave Young wrote: > > > It is useful to print kdump kernel loaded status in dump_stack() > > > especially when panic happens so that we can differenciate > > > kdump kernel early hang and a normal panic in a bug report. > > > > > > Signed-off-by: Dave Young > > > --- > > > kernel/printk/printk.c |3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > --- linux-x86.orig/kernel/printk/printk.c > > > +++ linux-x86/kernel/printk/printk.c > > > @@ -48,6 +48,7 @@ > > > #include > > > #include > > > #include > > > +#include > > > > > > #include > > > #include > > > @@ -3127,6 +3128,8 @@ void dump_stack_print_info(const char *l > > > if (dump_stack_arch_desc_str[0] != '\0') > > > printk("%sHardware name: %s\n", > > > log_lvl, dump_stack_arch_desc_str); > > > + if (kexec_crash_loaded()) > > > + printk("%skdump kernel loaded\n", log_lvl); > > > > IMHO, it would be better to do it like for the workqueues. > > I mean to call printk_kexec_info(log_lv1, current) here > > that would be impletemented in kexec sources. > > Then it could be maintained by kexec people. > > > > Anyway, I wonder if the info about kexec_crash_loaded() is > > enough. I am not much familiar with kexec. AFAIK, > > the image might be loaded long time before it > > is acutally used. > > kexec_crash_loaded is enough, we only care if kdump kernel being > loaded or not, nothing else, no matter how long it has been loaded. > In Fedora/RHEL a kdump service takes care of loading the kernel but > it runs after networking is ready. If people want to save > the vmcore to nfs/ssh then we need detect network and build the > initramfs. In the nfs/ssh case if some networking code panicked it > is possible that kdump service has not started, but sometimes bug > can not be easily reproduced thus nobody can know if kdump is active > or not. I see. > Since kexec_crash_loaded() is already in kexec souce code, and it > is the only thing need to know, do you think it is really necessary > to add a printk_kexec_info()? I can do it if you strongly suggest > to do so. No, the original approach is fine if it is really that simple ;-) > > > > Finally, the style of the other lines is: > > > > Name: details > > > > I would suggest to print something like: > > > > Kexec: details > > > > , where the details might be whether the image is loaded, > > whether the loaded kernel is being executed, and > > other kexec-related flags. > > Will do, it can be something like: > Kexec: kdump kernel loaded Looks good to me. With this message, I could give this patch even Reviewed-by: Petr Mladek I could update the string when pushing into printk.git. I am just going to wait a bit for more feedback if any. Best Regards, Petr
Re: [PATCH] print kdump kernel loaded status in stack dump
Hi, Thanks for your comments. On 01/17/18 at 09:57am, Petr Mladek wrote: > On Wed 2018-01-17 12:50:57, Dave Young wrote: > > It is useful to print kdump kernel loaded status in dump_stack() > > especially when panic happens so that we can differenciate > > kdump kernel early hang and a normal panic in a bug report. > > > > Signed-off-by: Dave Young > > --- > > kernel/printk/printk.c |3 +++ > > 1 file changed, 3 insertions(+) > > > > --- linux-x86.orig/kernel/printk/printk.c > > +++ linux-x86/kernel/printk/printk.c > > @@ -48,6 +48,7 @@ > > #include > > #include > > #include > > +#include > > > > #include > > #include > > @@ -3127,6 +3128,8 @@ void dump_stack_print_info(const char *l > > if (dump_stack_arch_desc_str[0] != '\0') > > printk("%sHardware name: %s\n", > >log_lvl, dump_stack_arch_desc_str); > > + if (kexec_crash_loaded()) > > + printk("%skdump kernel loaded\n", log_lvl); > > IMHO, it would be better to do it like for the workqueues. > I mean to call printk_kexec_info(log_lv1, current) here > that would be impletemented in kexec sources. > Then it could be maintained by kexec people. > > Anyway, I wonder if the info about kexec_crash_loaded() is > enough. I am not much familiar with kexec. AFAIK, > the image might be loaded long time before it > is acutally used. kexec_crash_loaded is enough, we only care if kdump kernel being loaded or not, nothing else, no matter how long it has been loaded. In Fedora/RHEL a kdump service takes care of loading the kernel but it runs after networking is ready. If people want to save the vmcore to nfs/ssh then we need detect network and build the initramfs. In the nfs/ssh case if some networking code panicked it is possible that kdump service has not started, but sometimes bug can not be easily reproduced thus nobody can know if kdump is active or not. Since kexec_crash_loaded() is already in kexec souce code, and it is the only thing need to know, do you think it is really necessary to add a printk_kexec_info()? I can do it if you strongly suggest to do so. > > Finally, the style of the other lines is: > > Name: details > > I would suggest to print something like: > > Kexec: details > > , where the details might be whether the image is loaded, > whether the loaded kernel is being executed, and > other kexec-related flags. Will do, it can be something like: Kexec: kdump kernel loaded > > How does that sound? > > > print_worker_info(log_lvl, current); > > } > > Best Regards, > Petr Thanks Dave
Re: [PATCH] print kdump kernel loaded status in stack dump
On Wed 2018-01-17 12:50:57, Dave Young wrote: > It is useful to print kdump kernel loaded status in dump_stack() > especially when panic happens so that we can differenciate > kdump kernel early hang and a normal panic in a bug report. > > Signed-off-by: Dave Young > --- > kernel/printk/printk.c |3 +++ > 1 file changed, 3 insertions(+) > > --- linux-x86.orig/kernel/printk/printk.c > +++ linux-x86/kernel/printk/printk.c > @@ -48,6 +48,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -3127,6 +3128,8 @@ void dump_stack_print_info(const char *l > if (dump_stack_arch_desc_str[0] != '\0') > printk("%sHardware name: %s\n", > log_lvl, dump_stack_arch_desc_str); > + if (kexec_crash_loaded()) > + printk("%skdump kernel loaded\n", log_lvl); IMHO, it would be better to do it like for the workqueues. I mean to call printk_kexec_info(log_lv1, current) here that would be impletemented in kexec sources. Then it could be maintained by kexec people. Anyway, I wonder if the info about kexec_crash_loaded() is enough. I am not much familiar with kexec. AFAIK, the image might be loaded long time before it is acutally used. Finally, the style of the other lines is: Name: details I would suggest to print something like: Kexec: details , where the details might be whether the image is loaded, whether the loaded kernel is being executed, and other kexec-related flags. How does that sound? > print_worker_info(log_lvl, current); > } Best Regards, Petr