On Thu 2019-07-18 18:49:34, Sergey Senozhatsky wrote: > On (07/18/19 10:36), Petr Mladek wrote: > > On Wed 2019-07-17 18:56:15, Sergey Senozhatsky wrote: > > > On (07/16/19 09:28), Petr Mladek wrote: > > > > Kernel tries hard to store and show printk messages when panicking. Even > > > > logbuf_lock gets re-initialized when only one CPU is running after > > > > smp_send_stop(). > > > > > > > > Unfortunately, smp_send_stop() might fail on architectures that do not > > > > use NMI as a fallback. Then printk log buffer might stay locked and > > > > a deadlock is almost inevitable. > > > > > > I'd say that deadlock is still almost inevitable. > > > > > > panic-CPU syncs with the printing-CPU before it attempts to SMP_STOP. > > > If there is an active printing-CPU, which is looping in console_unlock(), > > > taking logbuf_lock in order to msg_print_text() and stuff, then panic-CPU > > > will spin on console_owner waiting for that printing-CPU to handover > > > printing duties. > > > > > > pr_emerg("Kernel panic - not syncing"); > > > smp_send_stop(); > > > > Good point. I forgot the handover logic. Well, it is enabled only > > around call_console_drivers(). Therefore it is not under > > lockbuf_lock. > > > > I had in mind some infinite loop or deadlock in vprintk_store(). > > There was at least one long time ago (warning triggered > > by leap second). > > > > > > > If printing-CPU goes nuts under logbuf_lock, has corrupted IDT or anything > > > else, then we will not progress with panic(). panic-CPU will deadlock. If > > > not on > > > pr_emerg("Kernel panic - not syncing") > > > > > > then on another pr_emerg(), right before the NMI-fallback. > > > > Nested printk() should not be problem thanks to printk_safe. > > Where do nested printk()-s come from? Which one of the following > scenarios you cover in commit message: > > scenario 1 > > - we have CPUB which holds logbuf_lock > - we have CPUA which panic()-s the system, but can't bring CPUB down, > so logbuf_lock stays locked on remote CPU
No, this scenario is not affected by this patch. It would always lead to a deadlock. > scenario 2 > > - we have CPUA which holds logbuf_lock > - we have panic() on CPUA, but it cannot bring down some other CPUB > so logbuf_lock stays locked on local CPU, and it cannot re-init > logbuf. This scenario should get better handled by this patch. The difference will be when smp_send_stop() is not able to stop all CPUs: + Before: + printk_safe_flush_on_panic() will keep logbuf_lock locked and do nothing. + kmsg_dump(), console_unblank(), or console_flush_on_panic() will deadlock when they try to get logbuf_lock(). They will not be able to process any single line. + After: + printk_bust_lock_safe() will keep logbuf_lock locked + All functions using logbuf_lock will not get called. We will not see the messages (as previously) but the system will not deadlock. But there is one more scenario 3: - we have CPUB which loops or is deadlocked in IRQ context - we have CPUA which panic()-s the system, but can't bring CPUB down, so logbuf_lock might be takes and release from time to time by CPUB Hmm, this scenario might be handled a bit _worse_ by this patch: + Before: + printk_safe_flush_on_panic() will not touch logbuf_lock The messages will get flushed according to the state of logbuf_lock at the moment when it is being checked. + kmsg_dump(), console_unblank(), or console_flush_on_panic() will be able to do their job. + After: + printk_safe_flush_on_panic(), kmsg_dump(), console_unblank(), and console_flush_on_panic() could finish the job. But they will get called _only_ when logbuf_lock is released at the moment when it is being checked by printk_bust_lock_safe(). Resume: >From my POV, the 3rd scenario is the most likely one. Therefore this patch would make more bad than good. It might be possible to somehow detect if lockbuf_lock is released from time the time on the non-stopped CPU. But it would be hairy. IMHO, it is not worth it. Thanks a lot for helping me to sort the ideas. I suggest to forget this patch and work on lockless ringbuffer. Best Regards, Petr