On Mon 2017-03-06 21:45:52, Sergey Senozhatsky wrote:
> Offload printing of printk_deferred() messages from IRQ context
> to a schedulable printing kthread, when possible (the same way
> we do it in vprintk_emit()). Otherwise, console_unlock() can
> force the printing CPU to spend unbound amount of time flushing
> kernel messages from IRQ context.
>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhat...@gmail.com>
> ---
>  kernel/printk/printk.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 1c4232ca2e6a..6e00073a7331 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2735,9 +2735,16 @@ static void wake_up_klogd_work_func(struct irq_work *irq_work)
>  	int pending = __this_cpu_xchg(printk_pending, 0);
>
>  	if (pending & PRINTK_PENDING_OUTPUT) {
> -		/* If trylock fails, someone else is doing the printing */
> -		if (console_trylock())
> -			console_unlock();
> +		if (printk_kthread_enabled()) {
> +			wake_up_process(printk_kthread);

I have just noticed a possible race. printk_deferred() does not set
printk_kthread_need_flush_console, so a pending job might be left
behind:

CPU0                                      CPU1

printk_kthread_func()
  printk_kthread_need_flush_console = false;
  console_lock()
  console_unlock()
                                          printk_deferred()
                                            vprintk_emit()
                                            irq_work_queue()

                                          <IRQ>
                                          wake_up_klogd_work_func()
                                            if (printk_kthread_enabled())
                                              wake_up_process(printk_kthread);

  set_current_state(TASK_INTERRUPTIBLE);
  if (!printk_kthread_need_flush_console)
    schedule();

Result: printk_kthread goes to sleep even though there is a pending
job.

A solution might be to rename the variable to something like
printk_pending_output, always set it in vprintk_emit(), and clear
it in console_unlock() when there are no pending messages.
I think that we have already discussed this in the past.

This solution would also remove one extra cycle when more messages
are handled by one console_unlock() call:

CPU0                                      CPU1

printk()
  vprintk_emit()
    printk_kthread_need_flush_console = true;
    wake_up_process(printk_kthread)

                                          <printk_kthread>
                                          printk_kthread_need_flush_console = false;
                                          console_lock()

printk()
  vprintk_emit()
    printk_kthread_need_flush_console = true;
    wake_up_process(printk_kthread)

                                          console_unlock()
                                          set_current_state(TASK_INTERRUPTIBLE);
                                          if (!printk_kthread_need_flush_console)
                                            <fail>
                                          __set_current_state(TASK_RUNNING);
                                          console_lock()
                                          console_unlock()

Result: The second console_unlock() has nothing to do.

If I remember correctly, you were not very happy with this solution
because it spread the logic. I think that you did not believe that
it was worth fixing the second problem. But fixing the race might
need to spread the logic as well.

I see it the following way. vprintk_emit() is a producer,
console_unlock() is a consumer, and printk_kthread is a room that
allows the consumer to do its job. The consumer has more rooms
available. The state variable is a flag showing that there is
a pending job, that the consumer is looking for a room, and that
printk_kthread should offer it.

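To make the producer/consumer idea concrete, here is a minimal
userspace sketch. It is not kernel code and not the proposed patch:
struct printk_model and all model_*() helpers are hypothetical names
invented for illustration, and the model is single-threaded, so it
only replays the orderings from the diagrams above and says nothing
about locking or memory barriers.

```c
/* Userspace model only; none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>

struct printk_model {
	int  queued;          /* messages stored but not yet printed */
	int  printed;         /* messages already pushed to the console */
	int  unlock_calls;    /* how many times the consumer ran */
	bool pending_output;  /* stands in for "printk_pending_output" */
};

/* Producer: models vprintk_emit(), including the printk_deferred() path. */
static void model_emit(struct printk_model *m)
{
	m->queued++;
	m->pending_output = true;	/* always set, whoever the caller is */
}

/* Consumer: models console_unlock() draining all stored messages. */
static void model_console_unlock(struct printk_model *m)
{
	assert(m->queued >= 0);
	m->unlock_calls++;
	while (m->queued > 0) {
		m->queued--;
		m->printed++;
	}
	m->pending_output = false;	/* cleared only when nothing is left */
}

/* Models the check done after set_current_state(TASK_INTERRUPTIBLE). */
static bool model_kthread_may_sleep(const struct printk_model *m)
{
	return !m->pending_output;
}

/*
 * Replays the first diagram: the kthread finishes printing, a deferred
 * message arrives, the wake-up is a no-op because the kthread is still
 * running, and then the kthread checks whether it may sleep.
 */
static bool model_deferred_race(struct printk_model *m)
{
	model_emit(m);
	model_console_unlock(m);	/* kthread prints the first message */
	model_emit(m);			/* printk_deferred() on another CPU */
	return model_kthread_may_sleep(m);
}
```

In this model model_deferred_race() returns false, because the
deferred message set the flag, so the kthread would not call
schedule(). Two model_emit() calls followed by a single
model_console_unlock() leave the flag clear, which corresponds to
avoiding the extra cycle in the second diagram.
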
Of course, it is possible that you will find a better solution.

Best Regards,
Petr