On Wed, 14 Aug 2013 15:28:27 +0200 Jan Kara <j...@suse.cz> wrote: > A CPU can be caught in console_unlock() for a long time (tens of seconds > are reported by our customers) when other CPUs are using printk heavily > and serial console makes printing slow. Despite serial console drivers > are calling touch_nmi_watchdog() this triggers softlockup warnings > because interrupts are disabled for the whole time console_unlock() runs > (e.g. vprintk() calls console_unlock() with interrupts disabled). Thus > IPIs cannot be processed and other CPUs get stuck spinning in calls like > smp_call_function_many(). Also RCU eventually starts reporting lockups. > > In my artifical testing I can also easily trigger a situation when disk > disappears from the system apparently because commands to / from it > could not be delivered for long enough. This is why just silencing > watchdogs isn't a reliable solution to the problem and we simply have to > avoid spending too long in console_unlock() with interrupts disabled. > > The solution this patch works toward is to postpone printing to a later > moment / different CPU when we already printed over 1000 characters in > current console_unlock() invocation. This is a crude heuristic but > measuring time we spent printing doesn't seem to be really viable - we > cannot rely on high resolution time being available and with interrupts > disabled jiffies are not updated. The value 1000 was chosen so that > things are still bearable with 9600 baud serial console and OTOH it > shouldn't cause offloading of printing in common cases. > > Signed-off-by: Jan Kara <j...@suse.cz> > --- > kernel/printk/printk.c | 62 > ++++++++++++++++++++++++++++++++++++++++++-------- > 1 file changed, 52 insertions(+), 10 deletions(-) > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index 9208e17..28c220a 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -2026,31 +2026,49 @@ out: > raw_spin_unlock_irqrestore(&logbuf_lock, flags); > } > > +/* > + * How much characters can we print in one call of printk before offloading > + * printing work > + */ > +#define MAX_PRINTK_CHARS 1000
This really needs to be configurable. 1000 may be good for you, but may not be good for others. Heck, we should be able to disable this offloading too. This should probably be added to the printk /proc/sys/kernel/... files. If it is set to zero, then disable any offloading. -- Steve > + > +/* Should we stop printing on this cpu? */ > +static bool cpu_stop_printing(int printed_chars) > +{ > + /* Oops? Print everything now to maximize chances user will see it */ > + if (oops_in_progress) > + return false; > + return printed_chars > MAX_PRINTK_CHARS; > +} > + > /** -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/