Hello,

On (10/27/16 20:30), Linus Torvalds wrote:
> On Thu, Oct 27, 2016 at 8:49 AM, Sergey Senozhatsky
> <sergey.senozhat...@gmail.com> wrote:
> >
> >         RFC
> >
> >         This patch set extends a lock-less NMI per-cpu buffers idea to
> > handle recursive printk() calls. The basic mechanism is pretty much the
> > same -- at the beginning of a deadlock-prone section we switch to lock-less
> > printk callback, and return back to a default printk implementation at the
> > end; the messages are getting flushed to a logbuf buffer from a safer
> > context.
> 
> This looks very reasonable to me.
> 
> Does this also obviate the need for "printk_deferred()" that the
> scheduler and the clock code uses?  Because that would be a lovely
> thing to look at if it doesn't..

I wish I could say that we can retire printk_deferred(), but no, we still
need it. it's rather simple to fix printk recursion (that's what the patch
set is doing), but printk deadlocks are much harder to handle. anything that
starts somewhere else but somehow is related printk will deadlock (in the
worst case). I use this backtrace as an example:

 SyS_ioctl
  do_vfs_ioctl
   tty_ioctl
    n_tty_ioctl
     tty_mode_ioctl
      set_termios
       tty_set_termios
        uart_set_termios
         uart_change_speed
          FOO_serial_set_termios
           spin_lock_irqsave(&port->lock)     // lock the output port
           ....
           !! WARN() or pr_err() or printk()
               vprintk_emit()
                /* console_trylock() */
                console_unlock()
                 call_console_drivers()
                  FOO_write()
                   spin_lock_irqsave(&port->lock)     // already locked

with the current printk we can't tell for sure how many locks will
be acquired -- printk() can succeed in locking the console_sem and
start invoking console drivers (if any) from console_unlock(), or
it can fail thus we will acquire only logbuf spin_lock and console_sem
spin_lock.

the things can change *a bit* once we switch to async_printk. because
instead of doing console_unlock()->call_console_drivers(), printk()
will just wake_up() the printk_kthread. but still, it won't be enough
to remove printk_deferred()   :(

       vprintk_emit()
        wake_up()
         spin_lock rq lock
          printk

will be safe. but

      wake_up()
       spin_lock rq lock
        printk
         vprintk_emit()
          wake_up()
           spin_lock rq lock

will deadlock.

we can't even tell for sure what locks are "important" to printk().
a small and reasonable code refactoring somewhere in clock code/etc.
can accidentally change the whole picture by introducing "unsafe"
WARN_ON() or adding yet another lock to the printing path.

need to think more.


p.s.
we are plannig to discuss printk related issues next week in Santa Fe.

        -ss

Reply via email to