Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
On Thu 2015-10-22 14:19:26, David Howells wrote: > Petr Mladek wrote: > > > I would expect that the first few messages are printed to the console > > before the buffer is wrapped. IMHO, in many cases, you are interested > > into the final messages that describe why the system went down. > > The last message might tell you that the machine panicked because the NMI > handler triggered due to a spinlocked section taking too long or something. > This doesn't help if the oops that caused the spinlock to remain held or > whatever gets discarded from the buffer due to several intervening complaints > that result secondarily from the initial oops. There might be flood of messages if you enable debugging or so. In this case, the interesting messages would be at the end of the buffer. IMHO, you want both the beginning of the flood and the end when machine goes down. You do not want the repeated blob in the middle. Do you really miss the first few lines on the serial console, please? Did you consider the panic on Oops kernel parameter? > > If there is no time to print them, you want to have them in the crash dump > > (ring buffer) at least. > > But not at the expense of discarding the first oops report. *That* one is the > most important. But your patch discards the initial messages as well once they are printed. > Perhaps things could be arranged such that messages *can* be discarded from > the front of the buffer *provided* they are not oops messages. Another possibility would be to decide this by the importance level of the message. For example, we might start ignoring less important messages when the buffer is getting full. But this should be optional. It makes sense only when someone print/logs the messages somewhere. Anyway, be warned that any additional complexity must have very good reasons. The printk code already is bloated and there is a strong resistance to make it worse. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
On Thu 2015-10-22 14:19:26, David Howells wrote: > Petr Mladekwrote: > > > I would expect that the first few messages are printed to the console > > before the buffer is wrapped. IMHO, in many cases, you are interested > > into the final messages that describe why the system went down. > > The last message might tell you that the machine panicked because the NMI > handler triggered due to a spinlocked section taking too long or something. > This doesn't help if the oops that caused the spinlock to remain held or > whatever gets discarded from the buffer due to several intervening complaints > that result secondarily from the initial oops. There might be flood of messages if you enable debugging or so. In this case, the interesting messages would be at the end of the buffer. IMHO, you want both the beginning of the flood and the end when machine goes down. You do not want the repeated blob in the middle. Do you really miss the first few lines on the serial console, please? Did you consider the panic on Oops kernel parameter? > > If there is no time to print them, you want to have them in the crash dump > > (ring buffer) at least. > > But not at the expense of discarding the first oops report. *That* one is the > most important. But your patch discards the initial messages as well once they are printed. > Perhaps things could be arranged such that messages *can* be discarded from > the front of the buffer *provided* they are not oops messages. Another possibility would be to decide this by the importance level of the message. For example, we might start ignoring less important messages when the buffer is getting full. But this should be optional. It makes sense only when someone print/logs the messages somewhere. Anyway, be warned that any additional complexity must have very good reasons. The printk code already is bloated and there is a strong resistance to make it worse. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
On Thu, 2015-10-22 at 14:19 +0100, David Howells wrote: > Petr Mladek wrote: > > > I would expect that the first few messages are printed to the console > > before the buffer is wrapped. IMHO, in many cases, you are interested > > into the final messages that describe why the system went down. > > The last message might tell you that the machine panicked because the NMI > handler triggered due to a spinlocked section taking too long or something. > This doesn't help if the oops that caused the spinlock to remain held or > whatever gets discarded from the buffer due to several intervening complaints > that result secondarily from the initial oops. > > > If there is no time to print them, you want to have them in the crash dump > > (ring buffer) at least. > > But not at the expense of discarding the first oops report. *That* one is the > most important. > > Perhaps things could be arranged such that messages *can* be discarded from > the front of the buffer *provided* they are not oops messages. Of course, all this is a symptom of the fact that printk is now lossy. If you had an option to make it reliable again, you wouldn't need to argue over *which* messages to lose. -- dwmw2 smime.p7s Description: S/MIME cryptographic signature
Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
Petr Mladek wrote: > I would expect that the first few messages are printed to the console > before the buffer is wrapped. IMHO, in many cases, you are interested > into the final messages that describe why the system went down. The last message might tell you that the machine panicked because the NMI handler triggered due to a spinlocked section taking too long or something. This doesn't help if the oops that caused the spinlock to remain held or whatever gets discarded from the buffer due to several intervening complaints that result secondarily from the initial oops. > If there is no time to print them, you want to have them in the crash dump > (ring buffer) at least. But not at the expense of discarding the first oops report. *That* one is the most important. Perhaps things could be arranged such that messages *can* be discarded from the front of the buffer *provided* they are not oops messages. David -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
On Thu, 2015-10-22 at 14:18 +0200, Jan Kara wrote: > > > > > What happened to it? And how do we fix it? > > Hard to fix since you'd easily get RCU stalls and softlockup messages on > systems with lots of CPUs and heavy printk traffic... If we have to explicitly opt in to synchronous output, then such things should be tolerable. -- dwmw2 smime.p7s Description: S/MIME cryptographic signature
Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
Added Andrew into CC who maintains printk() code. On Thu 2015-10-22 11:16:50, David Howells wrote: > printk() currently discards earlier messages to make space for new messages > arriving. This has the distinct downside that if the kernel starts > churning out messages because of some initial incident, the report of the > initial incident is likely to be lost under a blizzard of: > > ** NNN printk messages dropped ** > > messages from console_unlock(). > > The first message generated (typically an oops) is usually the most > important - the one you want to solve first - so we really want to see > that. oops=panic or panic_on_warn kernel parameters might be more useful in this situation. I would expect that the first few messages are printed to the console before the buffer is wrapped. IMHO, in many cases, you are interested into the final messages that describe why the system went down. If there is no time to print them, you want to have them in the crash dump (ring buffer) at least. > To this end, change log_store() to only write a message into the buffer if > there is sufficient space to hold that message. The message may be > truncated if it will then fit. > > This patch could be improved by noting that some messages got discarded > when next there is space to do so. > > Signed-off-by: David Howells > --- [...] > + if (CIRC_SPACE_TO_END(log_next_idx, log_first_idx, log_buf_len) >= > wsize) > + goto have_space_no_wrap; > + > + if (CIRC_SPACE(log_next_idx, log_first_idx, log_buf_len) >= size) > + goto have_space_wrap; > + > + /* Try to truncate the message. */ > + size = truncate_msg(_len, _msg_len, _len, _len); > + wsize = size + sizeof(struct printk_log); truncate_msg() currently works only for messages that are bigger than half of the buffer. Also if the flood of messages is faster than printing, you might end up with the buffer full of "" strings with a minimum of useful information. > + if (CIRC_SPACE_TO_END(log_next_idx, log_first_idx, log_buf_len) >= > wsize) > + goto have_space_no_wrap; > + > + if (CIRC_SPACE(log_next_idx, log_first_idx, log_buf_len) < size) > + return 0; Please, try to avoid copy code. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
On Thu 22-10-15 11:28:47, David Woodhouse wrote: > On Thu, 2015-10-22 at 11:16 +0100, David Howells wrote: > > printk() currently discards earlier messages to make space for new messages > > arriving. This has the distinct downside that if the kernel starts > > churning out messages because of some initial incident, the report of the > > initial incident is likely to be lost under a blizzard of: > > > > ** NNN printk messages dropped ** > > > > messages from console_unlock(). > > > > The first message generated (typically an oops) is usually the most > > important - the one you want to solve first - so we really want to see > > that. > > But wait... didn't I watch you muttering on IRC about the actual bug > you were trying to catch here... and didn't you have a *serial* console > hooked up? > > What broke such that serial console stopped giving you *every* message? > > Serial console was always *synchronous*. Not for last 10 years or even more... When there is only a single CPU calling printk, you are right. But when two CPUs happen to enter printk code, the first will do the printing and the second printk will return immediately after appending message to the kernel log buffer and the message will appear on console when the first CPU gets to it. > We could do stuff like... > >printk("Going to do foo...\n"); >outb(foo, baz); >printk("Did foo and the machine didn't catch fire! Now bar\n"); >outb(bar, baz); >printk("Done\n"); > > And with a serial console I could know *precisely* the point at which > the machine locked up. > > And I could enable the silly debugging levels on things like JFFS2 and > be sure that with a serial console I could catch *every* printk > reliably — which led to a number of cases where people would reproduce > a bug with a serial console and debugging, mail me a huge log file, and > get a patch back in reply. > > We *need* to have a mode where serial console is actually *reliable*, > and we can know that the message has been sent out the port before the > printk() call returns. > > What happened to it? And how do we fix it? Hard to fix since you'd easily get RCU stalls and softlockup messages on systems with lots of CPUs and heavy printk traffic... Honza -- Jan Kara SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
On Thu, 2015-10-22 at 11:16 +0100, David Howells wrote: > printk() currently discards earlier messages to make space for new messages > arriving. This has the distinct downside that if the kernel starts > churning out messages because of some initial incident, the report of the > initial incident is likely to be lost under a blizzard of: > > ** NNN printk messages dropped ** > > messages from console_unlock(). > > The first message generated (typically an oops) is usually the most > important - the one you want to solve first - so we really want to see > that. But wait... didn't I watch you muttering on IRC about the actual bug you were trying to catch here... and didn't you have a *serial* console hooked up? What broke such that serial console stopped giving you *every* message? Serial console was always *synchronous*. We could do stuff like... printk("Going to do foo...\n"); outb(foo, baz); printk("Did foo and the machine didn't catch fire! Now bar\n"); outb(bar, baz); printk("Done\n"); And with a serial console I could know *precisely* the point at which the machine locked up. And I could enable the silly debugging levels on things like JFFS2 and be sure that with a serial console I could catch *every* printk reliably — which led to a number of cases where people would reproduce a bug with a serial console and debugging, mail me a huge log file, and get a patch back in reply. We *need* to have a mode where serial console is actually *reliable*, and we can know that the message has been sent out the port before the printk() call returns. What happened to it? And how do we fix it? -- dwmw2 smime.p7s Description: S/MIME cryptographic signature
[PATCH] printk: Don't discard earlier unprinted messages to make space
printk() currently discards earlier messages to make space for new messages arriving. This has the distinct downside that if the kernel starts churning out messages because of some initial incident, the report of the initial incident is likely to be lost under a blizzard of: ** NNN printk messages dropped ** messages from console_unlock(). The first message generated (typically an oops) is usually the most important - the one you want to solve first - so we really want to see that. To this end, change log_store() to only write a message into the buffer if there is sufficient space to hold that message. The message may be truncated if it will then fit. This patch could be improved by noting that some messages got discarded when next there is space to do so. Signed-off-by: David Howells --- kernel/printk/printk.c | 96 ++-- 1 file changed, 35 insertions(+), 61 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 8f0324ef72ab..c2099ede0bc8 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -46,6 +46,7 @@ #include #include #include +#include #include @@ -336,48 +337,6 @@ static u32 log_next(u32 idx) return idx + msg->len; } -/* - * Check whether there is enough free space for the given message. - * - * The same values of first_idx and next_idx mean that the buffer - * is either empty or full. - * - * If the buffer is empty, we must respect the position of the indexes. - * They cannot be reset to the beginning of the buffer. - */ -static int logbuf_has_space(u32 msg_size, bool empty) -{ - u32 free; - - if (log_next_idx > log_first_idx || empty) - free = max(log_buf_len - log_next_idx, log_first_idx); - else - free = log_first_idx - log_next_idx; - - /* -* We need space also for an empty header that signalizes wrapping -* of the buffer. -*/ - return free >= msg_size + sizeof(struct printk_log); -} - -static int log_make_free_space(u32 msg_size) -{ - while (log_first_seq < log_next_seq) { - if (logbuf_has_space(msg_size, false)) - return 0; - /* drop old messages until we have enough contiguous space */ - log_first_idx = log_next(log_first_idx); - log_first_seq++; - } - - /* sequence numbers are equal, so the log buffer is empty */ - if (logbuf_has_space(msg_size, true)) - return 0; - - return -ENOMEM; -} - /* compute the message size including the padding bytes */ static u32 msg_used_size(u16 text_len, u16 dict_len, u32 *pad_len) { @@ -423,32 +382,47 @@ static int log_store(int facility, int level, const char *text, u16 text_len) { struct printk_log *msg; - u32 size, pad_len; + u32 size, wsize, pad_len; u16 trunc_msg_len = 0; /* number of '\0' padding bytes to next message */ size = msg_used_size(text_len, dict_len, _len); - if (log_make_free_space(size)) { - /* truncate the message if it is too long for empty buffer */ - size = truncate_msg(_len, _msg_len, - _len, _len); - /* survive when the log buffer is too small for trunc_msg */ - if (log_make_free_space(size)) - return 0; - } + /* See if we have sufficient space to insert a message. Note we also +* need to keep a gap right at the end of the buffer to insert a 'wrap +* here' marker. +*/ + wsize = size + sizeof(struct printk_log); - if (log_next_idx + size + sizeof(struct printk_log) > log_buf_len) { - /* -* This message + an additional empty header does not fit -* at the end of the buffer. Add an empty header with len == 0 -* to signify a wrap around. -*/ - memset(log_buf + log_next_idx, 0, sizeof(struct printk_log)); - log_next_idx = 0; - } + if (log_first_seq != log_next_seq && log_first_idx == log_next_idx) + return 0; /* Buffer is completely full */ + + if (CIRC_SPACE_TO_END(log_next_idx, log_first_idx, log_buf_len) >= wsize) + goto have_space_no_wrap; + + if (CIRC_SPACE(log_next_idx, log_first_idx, log_buf_len) >= size) + goto have_space_wrap; + + /* Try to truncate the message. */ + size = truncate_msg(_len, _msg_len, _len, _len); + wsize = size + sizeof(struct printk_log); + + if (CIRC_SPACE_TO_END(log_next_idx, log_first_idx, log_buf_len) >= wsize) + goto have_space_no_wrap; + + if (CIRC_SPACE(log_next_idx, log_first_idx, log_buf_len) < size) + return 0; + +have_space_wrap: + /* This message plus an additional empty header does not fit at the end
Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
On Thu, 2015-10-22 at 11:16 +0100, David Howells wrote: > printk() currently discards earlier messages to make space for new messages > arriving. This has the distinct downside that if the kernel starts > churning out messages because of some initial incident, the report of the > initial incident is likely to be lost under a blizzard of: > > ** NNN printk messages dropped ** > > messages from console_unlock(). > > The first message generated (typically an oops) is usually the most > important - the one you want to solve first - so we really want to see > that. But wait... didn't I watch you muttering on IRC about the actual bug you were trying to catch here... and didn't you have a *serial* console hooked up? What broke such that serial console stopped giving you *every* message? Serial console was always *synchronous*. We could do stuff like... printk("Going to do foo...\n"); outb(foo, baz); printk("Did foo and the machine didn't catch fire! Now bar\n"); outb(bar, baz); printk("Done\n"); And with a serial console I could know *precisely* the point at which the machine locked up. And I could enable the silly debugging levels on things like JFFS2 and be sure that with a serial console I could catch *every* printk reliably — which led to a number of cases where people would reproduce a bug with a serial console and debugging, mail me a huge log file, and get a patch back in reply. We *need* to have a mode where serial console is actually *reliable*, and we can know that the message has been sent out the port before the printk() call returns. What happened to it? And how do we fix it? -- dwmw2 smime.p7s Description: S/MIME cryptographic signature
Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
On Thu 22-10-15 11:28:47, David Woodhouse wrote: > On Thu, 2015-10-22 at 11:16 +0100, David Howells wrote: > > printk() currently discards earlier messages to make space for new messages > > arriving. This has the distinct downside that if the kernel starts > > churning out messages because of some initial incident, the report of the > > initial incident is likely to be lost under a blizzard of: > > > > ** NNN printk messages dropped ** > > > > messages from console_unlock(). > > > > The first message generated (typically an oops) is usually the most > > important - the one you want to solve first - so we really want to see > > that. > > But wait... didn't I watch you muttering on IRC about the actual bug > you were trying to catch here... and didn't you have a *serial* console > hooked up? > > What broke such that serial console stopped giving you *every* message? > > Serial console was always *synchronous*. Not for last 10 years or even more... When there is only a single CPU calling printk, you are right. But when two CPUs happen to enter printk code, the first will do the printing and the second printk will return immediately after appending message to the kernel log buffer and the message will appear on console when the first CPU gets to it. > We could do stuff like... > >printk("Going to do foo...\n"); >outb(foo, baz); >printk("Did foo and the machine didn't catch fire! Now bar\n"); >outb(bar, baz); >printk("Done\n"); > > And with a serial console I could know *precisely* the point at which > the machine locked up. > > And I could enable the silly debugging levels on things like JFFS2 and > be sure that with a serial console I could catch *every* printk > reliably — which led to a number of cases where people would reproduce > a bug with a serial console and debugging, mail me a huge log file, and > get a patch back in reply. > > We *need* to have a mode where serial console is actually *reliable*, > and we can know that the message has been sent out the port before the > printk() call returns. > > What happened to it? And how do we fix it? Hard to fix since you'd easily get RCU stalls and softlockup messages on systems with lots of CPUs and heavy printk traffic... Honza -- Jan KaraSUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] printk: Don't discard earlier unprinted messages to make space
printk() currently discards earlier messages to make space for new messages arriving. This has the distinct downside that if the kernel starts churning out messages because of some initial incident, the report of the initial incident is likely to be lost under a blizzard of: ** NNN printk messages dropped ** messages from console_unlock(). The first message generated (typically an oops) is usually the most important - the one you want to solve first - so we really want to see that. To this end, change log_store() to only write a message into the buffer if there is sufficient space to hold that message. The message may be truncated if it will then fit. This patch could be improved by noting that some messages got discarded when next there is space to do so. Signed-off-by: David Howells--- kernel/printk/printk.c | 96 ++-- 1 file changed, 35 insertions(+), 61 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 8f0324ef72ab..c2099ede0bc8 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -46,6 +46,7 @@ #include #include #include +#include #include @@ -336,48 +337,6 @@ static u32 log_next(u32 idx) return idx + msg->len; } -/* - * Check whether there is enough free space for the given message. - * - * The same values of first_idx and next_idx mean that the buffer - * is either empty or full. - * - * If the buffer is empty, we must respect the position of the indexes. - * They cannot be reset to the beginning of the buffer. - */ -static int logbuf_has_space(u32 msg_size, bool empty) -{ - u32 free; - - if (log_next_idx > log_first_idx || empty) - free = max(log_buf_len - log_next_idx, log_first_idx); - else - free = log_first_idx - log_next_idx; - - /* -* We need space also for an empty header that signalizes wrapping -* of the buffer. -*/ - return free >= msg_size + sizeof(struct printk_log); -} - -static int log_make_free_space(u32 msg_size) -{ - while (log_first_seq < log_next_seq) { - if (logbuf_has_space(msg_size, false)) - return 0; - /* drop old messages until we have enough contiguous space */ - log_first_idx = log_next(log_first_idx); - log_first_seq++; - } - - /* sequence numbers are equal, so the log buffer is empty */ - if (logbuf_has_space(msg_size, true)) - return 0; - - return -ENOMEM; -} - /* compute the message size including the padding bytes */ static u32 msg_used_size(u16 text_len, u16 dict_len, u32 *pad_len) { @@ -423,32 +382,47 @@ static int log_store(int facility, int level, const char *text, u16 text_len) { struct printk_log *msg; - u32 size, pad_len; + u32 size, wsize, pad_len; u16 trunc_msg_len = 0; /* number of '\0' padding bytes to next message */ size = msg_used_size(text_len, dict_len, _len); - if (log_make_free_space(size)) { - /* truncate the message if it is too long for empty buffer */ - size = truncate_msg(_len, _msg_len, - _len, _len); - /* survive when the log buffer is too small for trunc_msg */ - if (log_make_free_space(size)) - return 0; - } + /* See if we have sufficient space to insert a message. Note we also +* need to keep a gap right at the end of the buffer to insert a 'wrap +* here' marker. +*/ + wsize = size + sizeof(struct printk_log); - if (log_next_idx + size + sizeof(struct printk_log) > log_buf_len) { - /* -* This message + an additional empty header does not fit -* at the end of the buffer. Add an empty header with len == 0 -* to signify a wrap around. -*/ - memset(log_buf + log_next_idx, 0, sizeof(struct printk_log)); - log_next_idx = 0; - } + if (log_first_seq != log_next_seq && log_first_idx == log_next_idx) + return 0; /* Buffer is completely full */ + + if (CIRC_SPACE_TO_END(log_next_idx, log_first_idx, log_buf_len) >= wsize) + goto have_space_no_wrap; + + if (CIRC_SPACE(log_next_idx, log_first_idx, log_buf_len) >= size) + goto have_space_wrap; + + /* Try to truncate the message. */ + size = truncate_msg(_len, _msg_len, _len, _len); + wsize = size + sizeof(struct printk_log); + + if (CIRC_SPACE_TO_END(log_next_idx, log_first_idx, log_buf_len) >= wsize) + goto have_space_no_wrap; + + if (CIRC_SPACE(log_next_idx, log_first_idx, log_buf_len) < size) + return 0; + +have_space_wrap: + /* This message plus an additional empty header
Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
On Thu, 2015-10-22 at 14:18 +0200, Jan Kara wrote: > > > > > What happened to it? And how do we fix it? > > Hard to fix since you'd easily get RCU stalls and softlockup messages on > systems with lots of CPUs and heavy printk traffic... If we have to explicitly opt in to synchronous output, then such things should be tolerable. -- dwmw2 smime.p7s Description: S/MIME cryptographic signature
Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
On Thu, 2015-10-22 at 14:19 +0100, David Howells wrote: > Petr Mladekwrote: > > > I would expect that the first few messages are printed to the console > > before the buffer is wrapped. IMHO, in many cases, you are interested > > into the final messages that describe why the system went down. > > The last message might tell you that the machine panicked because the NMI > handler triggered due to a spinlocked section taking too long or something. > This doesn't help if the oops that caused the spinlock to remain held or > whatever gets discarded from the buffer due to several intervening complaints > that result secondarily from the initial oops. > > > If there is no time to print them, you want to have them in the crash dump > > (ring buffer) at least. > > But not at the expense of discarding the first oops report. *That* one is the > most important. > > Perhaps things could be arranged such that messages *can* be discarded from > the front of the buffer *provided* they are not oops messages. Of course, all this is a symptom of the fact that printk is now lossy. If you had an option to make it reliable again, you wouldn't need to argue over *which* messages to lose. -- dwmw2 smime.p7s Description: S/MIME cryptographic signature
Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
Added Andrew into CC who maintains printk() code. On Thu 2015-10-22 11:16:50, David Howells wrote: > printk() currently discards earlier messages to make space for new messages > arriving. This has the distinct downside that if the kernel starts > churning out messages because of some initial incident, the report of the > initial incident is likely to be lost under a blizzard of: > > ** NNN printk messages dropped ** > > messages from console_unlock(). > > The first message generated (typically an oops) is usually the most > important - the one you want to solve first - so we really want to see > that. oops=panic or panic_on_warn kernel parameters might be more useful in this situation. I would expect that the first few messages are printed to the console before the buffer is wrapped. IMHO, in many cases, you are interested into the final messages that describe why the system went down. If there is no time to print them, you want to have them in the crash dump (ring buffer) at least. > To this end, change log_store() to only write a message into the buffer if > there is sufficient space to hold that message. The message may be > truncated if it will then fit. > > This patch could be improved by noting that some messages got discarded > when next there is space to do so. > > Signed-off-by: David Howells> --- [...] > + if (CIRC_SPACE_TO_END(log_next_idx, log_first_idx, log_buf_len) >= > wsize) > + goto have_space_no_wrap; > + > + if (CIRC_SPACE(log_next_idx, log_first_idx, log_buf_len) >= size) > + goto have_space_wrap; > + > + /* Try to truncate the message. */ > + size = truncate_msg(_len, _msg_len, _len, _len); > + wsize = size + sizeof(struct printk_log); truncate_msg() currently works only for messages that are bigger than half of the buffer. Also if the flood of messages is faster than printing, you might end up with the buffer full of "" strings with a minimum of useful information. > + if (CIRC_SPACE_TO_END(log_next_idx, log_first_idx, log_buf_len) >= > wsize) > + goto have_space_no_wrap; > + > + if (CIRC_SPACE(log_next_idx, log_first_idx, log_buf_len) < size) > + return 0; Please, try to avoid copy code. Best Regards, Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] printk: Don't discard earlier unprinted messages to make space
Petr Mladekwrote: > I would expect that the first few messages are printed to the console > before the buffer is wrapped. IMHO, in many cases, you are interested > into the final messages that describe why the system went down. The last message might tell you that the machine panicked because the NMI handler triggered due to a spinlocked section taking too long or something. This doesn't help if the oops that caused the spinlock to remain held or whatever gets discarded from the buffer due to several intervening complaints that result secondarily from the initial oops. > If there is no time to print them, you want to have them in the crash dump > (ring buffer) at least. But not at the expense of discarding the first oops report. *That* one is the most important. Perhaps things could be arranged such that messages *can* be discarded from the front of the buffer *provided* they are not oops messages. David -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/