Hello, On (01/20/16 13:31), Petr Mladek wrote: [..] > > console_may_schedule = !oops_in_progress && preemptible() && > > !rcu_preempt_depth(); > > I though about it from some other side and I wonder if we need all > the console_may_schedule stuff at all.
interesting, thanks. > printk_emit() has the following code: > > /* This stops the holder of console_sem just where we want him */ > local_irq_save(flags); > [...] > lockdep_off(); > raw_spin_lock(&logbuf_lock); > logbuf_cpu = this_cpu; > > [...] > > logbuf_cpu = UINT_MAX; > raw_spin_unlock(&logbuf_lock); > lockdep_on(); > local_irq_restore(flags); > > /* If called from the scheduler, we can not call up(). */ > if (!in_sched) { > lockdep_off(); > /* > * Disable preemption to avoid being preempted while holding > * console_sem which would prevent anyone from printing to > * console > */ > preempt_disable(); > > /* > * Try to acquire and then immediately release the console > * semaphore. The release will print out buffers and wake up > * /dev/kmsg and syslog() users. > */ > if (console_trylock_for_printk()) > console_unlock(); > preempt_enable(); > > > First, the message "This stops the holder of console_sem just where we > want him" is suspitious. this comment is irrelevant, as of today. it was, a long time ago, because the entire thing was a bit different (linux-2.4.21 kernel/printk.c) /* This stops the holder of console_sem just where we want him */ spin_lock_irqsave(&logbuf_lock, flags); logbuf_lock does stop the holder, local_irq_save() does not, you are right. > It is there sice the initial git commit on 2005-04-16. I do not > understand how this could block the console holder on another CPU. > I think that it rather allows to make the recursion detection without > the lockbuf lock. But this is not > that important. > > More interesting is the counter part: > > raw_spin_unlock(&logbuf_lock); > lockdep_on(); > local_irq_restore(flags); > > raw_spin_unlock() calls preempt_enable() that calls > preempt_schedule(). It does nothing here because IRQs > are still disabled. > > But if we do not need the lockdep_on() hack, we could use > raw_spin_unlock_irqrestore(&lockbuf_lock, flags). It means > that we do not call cond_resched here "only by chance". > > In each case, preemtible kernel seems to be free to reschedule > after we enable the inrerrupts. By other words, preemptible kernel > seems to be able to reschedule in printk() already these days. > Why it might work? > > I think that it might work because cond_resched() or rather > preempt_schedule() are clever enough. They both check > preempt_count and do nothing if preemtion is disabled. is cond_resched() clever enough for vprintk_emit()->onsole_unlock()? I think it can schedule from rcu read critical section, for example. rcu_read_lock printk vprintk_emit console_unlock cond_resched << will schedule rcu_read_unlock because _cond_resched()->should_resched(0) test unlikely(preempt_count() == preempt_offset && tif_need_resched()); it does care about preempt_disable (if the kernel is CONFIG_PREEMPT_COUNT, though) and IRQ count, but no rcu preempt count check. and, I think, we also don't want to schedule when we are in oops_in_progress. so we can (probably...) get rid of console_may_schedule, but I don't think we can unconditionally cond_resched() from console_unlock(). > As a result, I think that we do not need the extra checks > for the save context in printk(). IMHO, it is safe to remove > all the console_may_schedule stuff and also remove the extra > preempt_disable/preempt_enable() in vprintk_emit(). > > Or did I miss anything? hm... I suspect the reason we have console_may_schedule is console_conditional_schedule() - console_sem owner may want to have an internal logic to re-schedule [fwiw], while still holding the console_sem. tty/vt/vt.c or video/console/fbcon.c for example. (in 2.4 kernel: video/fbcon.c and char/console.c). cond_resched() helps in console_unlock(); console_conditional_schedule() is called after console_lock() and _before_ console_unlock().... static void fbcon_redraw_move(struct vc_data *vc, struct display *p, int line, int count, int dy) { unsigned short *s = (unsigned short *) (vc->vc_origin + vc->vc_size_row * line); while (count--) { unsigned short *start = s; unsigned short *le = advance_row(s, 1); unsigned short c; int x = 0; unsigned short attr = 1; do { c = scr_readw(s); if (attr != (c & 0xff00)) { attr = c & 0xff00; if (s > start) { fbcon_putcs(vc, start, s - start, dy, x); x += s - start; start = s; } } console_conditional_schedule(); s++; } while (s < le); if (s > start) fbcon_putcs(vc, start, s - start, dy, x); console_conditional_schedule(); dy++; } } dunno... for non-preempt kernels, perhaps? -ss > Best Regards, > Petr >