Re: [PATCH printk v3 00/40] reduce console_lock scope
On 2022-11-07 09:15, John Ogness wrote: [...] The base commit for this series is from Paul McKenney's RCU tree and provides an NMI-safe SRCU implementation [1]. Without the NMI-safe SRCU implementation, this series is not less safe than mainline. But we will need the NMI-safe SRCU implementation for atomic consoles anyway, so we might as well get it in now. Especially since it _does_ increase the reliability for mainline in the panic path. So, your email got me to review the SRCU nmi-safe series: [1] https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/log/?h=srcunmisafe.2022.10.21a Especially this commit: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/commit/?h=srcunmisafe.2022.10.21a&id=5d0f5953b60f5f7a278085b55ddc73e2932f4c33 I disagree with the overall approach taken there, which is to create yet another SRCU flavor, this time with explicit "nmi-safe" read-locks. This adds complexity to the kernel APIs and I think we can be clever about this and make SRCU nmi-safe without requiring a whole new incompatible API. You can find the basic idea needed to achieve this in the libside RCU user-space implementation. I needed to introduce a split-counter concept to support rseq vs atomics to keep track of per-cpu grace period counters. The "rseq" counter is the fast-path, but if rseq fails, the abort handler uses the atomic counter instead. https://github.com/compudj/side/blob/main/src/rcu.h#L23 struct side_rcu_percpu_count { uintptr_t begin; uintptr_t rseq_begin; uintptr_t end; uintptr_t rseq_end; } __attribute__((__aligned__(SIDE_CACHE_LINE_SIZE))); The idea is to "split" each percpu counter into two counters, one for rseq, and the other for atomics. When a grace period wants to observe the value of a percpu counter, it simply sums the two counters: https://github.com/compudj/side/blob/main/src/rcu.c#L112 The same idea can be applied to SRCU in the kernel: one counter for percpu ops, and the other counter for nmi context, so basically: srcu_read_lock() if (likely(!in_nmi())) increment the percpu-ops lock counter else increment the atomic lock counter srcu_read_unlock() if (likely(!in_nmi())) increment the percpu-ops unlock counter else increment the atomic unlock counter Then in the grace period sum the percpu-ops and the atomic values whenever each counter value is read. This would allow SRCU to be NMI-safe without requiring the callers to explicitly state whether they need to be nmi-safe or not, and would only take the overhead of the atomics in the NMI handlers rather than for all users which happen to use SRCU read locks shared with nmi handlers. Thoughts ? Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. https://www.efficios.com
[PATCH printk v3 00/40] reduce console_lock scope
This is v3 of a series to prepare for threaded/atomic printing. v2 is here [0]. This series focuses on reducing the scope of the BKL console_lock. It achieves this by switching to SRCU and a dedicated mutex for console list iteration and modification, respectively. The console_lock will no longer offer this protection and is completely removed from (un)register_console() and console_stop/start() code. Also, during the review of v2 it came to our attention that many console drivers are checking CON_ENABLED to see if they are registered. Because this flag can change without unregistering and because this flag does not represent an atomic point when an (un)registration process is complete, a new console_is_registered() function is introduced. This function uses the console_list_lock to synchronize with the (un)registration process to provide a reliable status. All users of the console_lock for list iteration have been modified. For the call sites where the console_lock is still needed (because of other reasons), comments are added to explain exactly why the console_lock was needed. All users of CON_ENABLED for registration status have been modified to use console_is_registered(). Note that there are still users of CON_ENABLED, but this is for legitimate purposes about a registered console being able to print. The base commit for this series is from Paul McKenney's RCU tree and provides an NMI-safe SRCU implementation [1]. Without the NMI-safe SRCU implementation, this series is not less safe than mainline. But we will need the NMI-safe SRCU implementation for atomic consoles anyway, so we might as well get it in now. Especially since it _does_ increase the reliability for mainline in the panic path. Changes since v3: general: - introduce a synchronized console_is_registered() to query if a console is registered, meant to replace CON_ENABLED (mis)use for this purpose - directly read console->flags for registered consoles if it is race-free (and document that it is so) - replace uart_console_enabled() with a new uart_console_registered() based on console_is_registered() - change comments about why console_lock is used to synchronize console->device() by providing an example registration check fixups: - the following drivers were modified to use the new console_is_registered() instead of CON_ENABLED checks - arch/m68k/emu/nfcon.c - drivers/firmware/efi/earlycon.c - drivers/net/netconsole.c - drivers/tty/hvc/hvc_console.c - drivers/tty/serial/8250/8250_core.c - drivers/tty/serial/earlycon.c - drivers/tty/serial/pic32_uart.c - drivers/tty/serial/samsung_tty.c - drivers/tty/serial/serial_core.c - drivers/tty/serial/xilinx_uartps.c - drivers/usb/early/xhci-dbc.c um: kmsg_dumper: - change stdout dump criteria to match original intention kgdb/kdb: - in configure_kgdboc(), take console_list_lock to synchronize tty_find_polling_driver() against register_console() - add comments explaining why calling console->write() without locking might work tty: sh-sci: - use a setup() callback to setup the early console fbdev: xen: - implement a cleaner approach for console_force_preferred_locked() rcu: - implement debug_lockdep_rcu_enabled() for !CONFIG_DEBUG_LOCK_ALLOC printk: - check CONFIG_DEBUG_LOCK_ALLOC for srcu_read_lock_held() availability - for console_lock/_trylock/_unlock, replace "lock the console system" language with "block the console subsystem from printing" - use WRITE_ONCE() for updating console->flags of registered consoles - expand comments of synchronize_srcu() calls to explain why they are needed, and also expand comments to explain when it is not needed - change CON_BOOT consoles to always begin at earliest message - for non-BOOT/non-PRINTBUFFER consoles, initialize @seq to the minimal @seq of any of the enabled boot consoles - add comments and lockdep assertion to unregister_console_locked() because it is not clear from the name which lock is implied - dropped patches that caused unnecessary churn in the series John Ogness [0] https://lore.kernel.org/lkml/20221019145600.1282823-1-john.ogn...@linutronix.de [1] https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/log/?h=srcunmisafe.2022.10.21a John Ogness (38): rcu: implement lockdep_rcu_enabled for !CONFIG_DEBUG_LOCK_ALLOC printk: Prepare for SRCU console list protection printk: fix setting first seq for consoles um: kmsg_dump: only dump when no output console available console: introduce console_is_enabled() wrapper printk: use console_is_enabled() um: kmsg_dump: use console_is_enabled() kdb: kdb_io: use console_is_enabled() um: kmsg_dumper: use srcu console list iterator tty: serial: kgdboc: document console_lock usage tty: tty_io: document console_lock usage proc: consoles: document console_lock usage kdb: use srcu console list iterator printk: console_flush_all: use srcu console list iterator printk: c