On 3/30/21 6:42 AM, Daniel Thompson wrote:
On Mon, Mar 29, 2021 at 03:32:35PM -0400, Waiman Long wrote:
The handling of sysrq keys should normally be done in a user context
except when MAGIC_SYSRQ_SERIAL is set and the magic sequence is typed
on a serial console.
This seems to be a poor summary of the typical calling context for
handle_sysrq() except in the trivial case of using
/proc/sysrq-trigger.
For example, on my system the backtrace when I do sysrq-h on a USB
keyboard shows us running from a softirq handler with interrupts
locked. Note also that the interrupt lock is present even on systems that
handle keyboard input from a kthread, due to the interrupt lock in
report_input_key().
I will reword this part of the patch. I don't have a deep understanding
of how the different ways of keyboard input work; thanks for showing me
that there are other ways of getting keyboard input.
Currently in print_cpu() of kernel/sched/debug.c, sched_debug_lock is taken
with interrupts disabled for the whole duration of the calls to print_*_stats()
and print_rq(), which could last for quite some time if the information dump
happens on the serial console.
If the system has many cpus and the sched_debug_lock is somehow busy
(e.g. parallel sysrq-t), the system may hit a hard lockup panic, like
<snip>
The purpose of sched_debug_lock is to serialize the use of the global
cgroup_path[] buffer in print_cpu(). The rest of the printk() calls
don't need serialization from sched_debug_lock.
Calling printk() with interrupts disabled can still be
problematic. Allocating a stack buffer of PATH_MAX bytes is not
feasible. So a compromise solution is used where a small stack buffer
is allocated for the pathname. If the actual pathname is short enough, it
is copied to the stack buffer, with sched_debug_lock released afterward
before printk(). Otherwise, the global cgroup_path[] buffer will be
used with sched_debug_lock held until after printk().
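
As a rough illustration of the compromise described above, here is a
minimal userspace sketch. It is not the patch itself: a pthread mutex
stands in for sched_debug_lock (there is no interrupt disabling in
userspace), fprintf() stands in for printk(), and the names
print_path(), fetch_path(), SMALL_BUF_LEN and GLOBAL_BUF_LEN are
hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define SMALL_BUF_LEN  128   /* hypothetical size of the on-stack buffer */
#define GLOBAL_BUF_LEN 4096  /* stands in for PATH_MAX */

/* Stands in for sched_debug_lock (a spinlock taken with irqs off in the kernel). */
static pthread_mutex_t debug_lock = PTHREAD_MUTEX_INITIALIZER;
/* Stands in for the global cgroup_path[] buffer that the lock protects. */
static char global_path[GLOBAL_BUF_LEN];

/* Hypothetical helper: copy src into buf and return the untruncated length. */
static size_t fetch_path(const char *src, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%s", src);

	return n < 0 ? 0 : (size_t)n;
}

/*
 * Print a path using the short-path/long-path split.
 * Returns 1 if the path was printed from the stack copy after the lock
 * was dropped, 0 if it was printed from the global buffer under the lock.
 */
static int print_path(const char *src, FILE *out)
{
	char small[SMALL_BUF_LEN];
	size_t n;

	assert(src != NULL);

	pthread_mutex_lock(&debug_lock);
	n = fetch_path(src, global_path, sizeof(global_path));
	if (n < sizeof(small)) {
		/* Short path: copy it out, drop the lock, then do the slow print. */
		memcpy(small, global_path, n + 1);
		pthread_mutex_unlock(&debug_lock);
		fprintf(out, "%s\n", small);
		return 1;
	}

	/* Long path: the global buffer must stay locked while it is printed. */
	fprintf(out, "%s\n", global_path);
	pthread_mutex_unlock(&debug_lock);
	return 0;
}
```

In this sketch only paths of SMALL_BUF_LEN bytes or more keep the lock
held across the slow output call, so the common short-path case no
longer holds the lock while printing.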
Does this actually fix the problem in any circumstance except when the
sysrq is triggered using /proc/sysrq-trigger?
I have a reproducer that generates a hard lockup panic when there are
multiple instances of sysrq-t via /proc/sysrq-trigger. This is probably
less of a problem on the console, as I don't think we can do multiple
simultaneous sysrq-t there. Anyway, my goal is to limit the amount of
time that interrupts are disabled. Doing a printk() can take a while,
depending on whether there is contention in the underlying locks or
resources. Even if I limit the critical sections to just those printk()
calls that output the cgroup path, I can still cause the panic.
The approach used by this patch should minimize the chance of a panic
happening. However, if there are many tasks with very long cgroup paths,
I suppose that panic may still happen under some extreme conditions. So
I won't say this will completely fix the problem until the printk()
rework that makes printk work more like printk_deferred() is merged.
Cheers,
Longman