Hello, On (07/25/17 06:57), Greg KH wrote: > On Tue, Jul 25, 2017 at 10:28:16AM +0200, Michael Wang wrote: > > Hi, greg k-h > > > > During our testing with 4.4.73 we got soft lockup like: > > > > NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [systemd-udevd:856] > > ... > > Call Trace: > > [<ffffffff8109d139>] vprintk_emit+0x319/0x4a0 > > [<ffffffff8112182d>] printk_emit+0x33/0x3b > > [<ffffffff812f9e9c>] ? simple_strtoull+0x2c/0x50 > > [<ffffffff8109d39a>] devkmsg_write+0xaa/0x100 > > [<ffffffff8109d2f0>] ? vprintk+0x30/0x30 > > [<ffffffff811915f2>] do_readv_writev+0x1c2/0x270 > > [<ffffffff8117899d>] ? kmem_cache_free+0x7d/0x1a0 > > [<ffffffff81191729>] vfs_writev+0x39/0x50 > > [<ffffffff8119240a>] SyS_writev+0x4a/0xd0 > > [<ffffffff8158bc97>] entry_SYSCALL_64_fastpath+0x12/0x6a > > > > Currently in 4.4 the console_unlock() called by vprintk_emit() is with > > preemption disabled, so the cond_resched is not working, and soft lockup > > appear if it take too much time on writing data into every console. > > > > We found the upstream patch: > > commit 6b97a20d3a79 printk: set may_schedule for some of > > console_trylock() callers > > > > which should have addressed this issue, but not included in the latest > > 4.4.78 stable > > yet, is there any plan on backport it in future? > > That commit was in the 4.6 kernel release, so odds are, it's not on a > list of patches to backport any time soon :) > > But, you can ask for it to be included, if you cc: the stable@vger list, > like I did here, and add the authors of the patch to the cc: asking if > they have any objection for it being backported (also added here...) > > Sergey, any objection for me backporting this?
the uncomfortable truth is that the patch will not address all of the scenarios that can lead to lockups. printk() and console_unlock() are ugly beasts and we are still working on a proper solution; it's a bit harder that I though, but we are close. (for more details https://lwn.net/Articles/705938/) there are still tons of context and cases when printk() _can_ end up in console_unlock() and spend there uncontrollable amount of time (as long as other CPUs will simultaneously append new messages to the logbuf): spin_lock() printk() -> console_unlock() spin_unlock() spin_lock_irqsave() printk() -> console_unlock() spin_unlock_irqrestore() preempt_disable() printk() -> console_unlock() preempt_enable() local_irq_disable() printk() -> console_unlock() local_irq_restore() rcu_read_lock() printk() -> console_unlock() rcu_read_unlock() IRQ-> printk() -> console_unlock() printk_deferred() IRQ-> printk() -> console_unlock() and so on. in all those cases, printing CPU does not return from console_unlock() as long as there are pending messages left; if other CPUs printk() a lot, then printing CPU loops in console_unlock(), regardless the context and circumstances. which is often times result in lockups, panics, etc. returning back to the patch, there are two things related to it I can quickly think of: 1) the patch needs a fix-up commit 257ab443118bffc ("printk: Correctly handle preemption in console_unlock()") 2) it may affect users this is a bit weird... but, in fact, console_unlock() must always run with disabled preemption. otherwise, we schedule with the console_sem locked, which is risky and pointless. executing console_unlock() with preemption disabled is, obviously, dangerous, that's why consle_unlock() should stop doing the endless loop and instead must offload (after some time) the printing duty to another kthread/CPU. more closer to the point, we've got reports that in some cases due to additional points of scheduling in console_unlock() doing massive print outs (like showing backtraces of all tasks, or OOM dump) now takes more time (it really depends on the conditions). so the patch addresses some of trivial lockup cases and in general not that bad, but at the same time it's not a complete solution and it has some side effects. -ss