Hello,

On (07/25/17 06:57), Greg KH wrote:
> On Tue, Jul 25, 2017 at 10:28:16AM +0200, Michael Wang wrote:
> > Hi, greg k-h
> > 
> > During our testing with 4.4.73 we got soft lockup like:
> > 
> >  NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [systemd-udevd:856]
> >  ...
> >  Call Trace:
> >   [<ffffffff8109d139>] vprintk_emit+0x319/0x4a0
> >   [<ffffffff8112182d>] printk_emit+0x33/0x3b
> >   [<ffffffff812f9e9c>] ? simple_strtoull+0x2c/0x50 
> >   [<ffffffff8109d39a>] devkmsg_write+0xaa/0x100
> >   [<ffffffff8109d2f0>] ? vprintk+0x30/0x30
> >   [<ffffffff811915f2>] do_readv_writev+0x1c2/0x270 
> >   [<ffffffff8117899d>] ? kmem_cache_free+0x7d/0x1a0
> >   [<ffffffff81191729>] vfs_writev+0x39/0x50
> >   [<ffffffff8119240a>] SyS_writev+0x4a/0xd0
> >   [<ffffffff8158bc97>] entry_SYSCALL_64_fastpath+0x12/0x6a
> > 
> > Currently in 4.4 the console_unlock() called by vprintk_emit() is with
> > preemption disabled, so the cond_resched is not working, and soft lockup
> > appear if it take too much time on writing data into every console.
> > 
> > We found the upstream patch:
> >   commit 6b97a20d3a79 printk: set may_schedule for some of 
> > console_trylock() callers
> > 
> > which should have addressed this issue, but not included in the latest 
> > 4.4.78 stable
> > yet, is there any plan on backport it in future?
> 
> That commit was in the 4.6 kernel release, so odds are, it's not on a
> list of patches to backport any time soon :)
> 
> But, you can ask for it to be included, if you cc: the stable@vger list,
> like I did here, and add the authors of the patch to the cc: asking if
> they have any objection for it being backported (also added here...)
> 
> Sergey, any objection for me backporting this?

the uncomfortable truth is that the patch will not address all of the
scenarios that can lead to lockups. printk() and console_unlock() are
ugly beasts and we are still working on a proper solution; it's a bit
harder that I though, but we are close.
(for more details https://lwn.net/Articles/705938/)


there are still tons of context and cases when printk() _can_ end up
in console_unlock() and spend there uncontrollable amount of time
(as long as other CPUs will simultaneously append new messages to the
logbuf):

spin_lock()
printk() -> console_unlock()
spin_unlock()

spin_lock_irqsave()
printk() -> console_unlock()
spin_unlock_irqrestore()

preempt_disable()
printk() -> console_unlock()
preempt_enable()

local_irq_disable()
printk() -> console_unlock()
local_irq_restore()

rcu_read_lock()
printk() -> console_unlock()
rcu_read_unlock()

IRQ->
        printk() -> console_unlock()

printk_deferred()
        IRQ->
                printk() -> console_unlock()


and so on. in all those cases, printing CPU does not return from
console_unlock() as long as there are pending messages left; if other
CPUs printk() a lot, then printing CPU loops in console_unlock(),
regardless the context and circumstances. which is often times
result in lockups, panics, etc.


returning back to the patch,

there are two things related to it I can quickly think of:

1) the patch needs a fix-up commit 257ab443118bffc ("printk: Correctly
   handle preemption in console_unlock()")

2) it may affect users
   this is a bit weird... but, in fact, console_unlock() must always
   run with disabled preemption. otherwise, we schedule with the
   console_sem locked, which is risky and pointless. executing
   console_unlock() with preemption disabled is, obviously, dangerous,
   that's why consle_unlock() should stop doing the endless loop and
   instead must offload (after some time) the printing duty to another
   kthread/CPU.

   more closer to the point,
   we've got reports that in some cases due to additional points of
   scheduling in console_unlock() doing massive print outs (like showing
   backtraces of all tasks, or OOM dump) now takes more time (it really
   depends on the conditions).


so the patch addresses some of trivial lockup cases and in general not
that bad, but at the same time it's not a complete solution and it has
some side effects.

        -ss

Reply via email to