Folks! While working on various preempt count related things, I stumbled (again) over the inconsistency of our preempt count handling.
The handling of preempt_count() is inconsistent accross kernel configurations. On kernels which have PREEMPT_COUNT=n preempt_disable/enable() and the lock/unlock functions are not affecting the preempt count, only local_bh_disable/enable() and _bh variants of locking, soft interrupt delivery, hard interrupt and NMI context affect it. It's therefore impossible to have a consistent set of checks which provide information about the context in which a function is called. In many cases it makes sense to have seperate functions for seperate contexts, but there are valid reasons to avoid that and handle different calling contexts conditionally. The lack of such indicators which work on all kernel configuratios is a constant source of trouble because developers either do not understand the implications or try to work around this inconsistency in weird ways. Neither seem these issues be catched by reviewers and testing. Recently merged code does: gfp = preemptible() ? GFP_KERNEL : GFP_ATOMIC; Looks obviously correct, except for the fact that preemptible() is unconditionally false for CONFIF_PREEMPT_COUNT=n, i.e. all allocations in that code use GFP_ATOMIC on such kernels. Attempts to make preempt count unconditional and consistent have been rejected in the past with handwaving performance arguments. Freshly conducted benchmarks did not reveal any measurable impact from enabling preempt count unconditionally. On kernels with CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY the preempt count is only incremented and decremented but the result of the decrement is not tested. Contrary to that enabling CONFIG_PREEMPT which tests the result has a small but measurable impact due to the conditional branch/call. It's about time to make essential functionality of the kernel consistent accross the various preemption models. The series is also available from git: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git preempt That's the first part of a larger effort related to preempt count: 1) The analysis of the usage sites of in_interrupt(), in_atomic(), in_softirq() is still ongoing, but so far the number of buggy users is clearly the vast majority. There will be seperate patch series (currently 46 and counting) to address these issues once the analysis is complete in the next days. 2) The long discussed state tracking of local irq disable in preempt count which accounts interrupt disabled sections as atomic and avoids issuing costly instructions (sti, cli, popf or their non X86 counterparts) when the state does not change, i.e. nested irq_save() or irq_restore(). I have this working on X86 already and contrary to my earlier attempts this was reasonably straight forward due to the recent entry/exit code consolidation. What I've not done yet is to optimize the preempt count handling of the [un]lock_irq* operations so they handle the interrupt disabled state and the preempt count modification in one go. That's an obvious add on, but correctness first ... 3) Lazy interrupt disabling as a straight forward extension to #2. This avoids the actual disabling at the CPU level completely and catches an incoming interrupt in the low level entry code, modifies the interrupt disabled state on the return stack, notes the interrupt as pending in software and raises it again when interrupts are reenabled. This has still a few issues which I'm hunting down (cpuidle is unhappy ...) Thanks, tglx --- arch/arm/include/asm/assembler.h | 11 -- arch/arm/kernel/iwmmxt.S | 2 arch/arm/mach-ep93xx/crunch-bits.S | 2 arch/xtensa/kernel/entry.S | 2 drivers/gpu/drm/i915/Kconfig.debug | 1 drivers/gpu/drm/i915/i915_utils.h | 3 include/linux/bit_spinlock.h | 4 - include/linux/lockdep.h | 6 - include/linux/pagemap.h | 4 - include/linux/preempt.h | 37 +--------- include/linux/uaccess.h | 6 - kernel/Kconfig.preempt | 4 - kernel/sched/core.c | 6 - lib/Kconfig.debug | 3 lib/Kconfig.debug.rej | 14 +-- tools/testing/selftests/rcutorture/configs/rcu/SRCU-t | 1 tools/testing/selftests/rcutorture/configs/rcu/SRCU-u | 1 tools/testing/selftests/rcutorture/configs/rcu/TINY01 | 1 tools/testing/selftests/rcutorture/doc/TINY_RCU.txt | 5 - tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt | 1 tools/testing/selftests/rcutorture/formal/srcu-cbmc/src/config.h | 1 21 files changed, 23 insertions(+), 92 deletions(-) _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx