Re: [PATCH v2] sched/debug: Use sched_debug_lock to serialize use of cgroup_path[] only

Waiman Long Tue, 30 Mar 2021 10:44:38 -0700

On 3/30/21 6:42 AM, Daniel Thompson wrote:

On Mon, Mar 29, 2021 at 03:32:35PM -0400, Waiman Long wrote:

The handling of sysrq keys should normally be done in a user context
except when MAGIC_SYSRQ_SERIAL is set and the magic sequence is typed
in a serial console.

This seems to be a poor summary of the typical calling context for
handle_sysrq() except in the trivial case of using
/proc/sysrq-trigger.


For example, on my system the backtrace when I do sysrq-h on a USB
keyboard shows us running from a softirq handler with interrupts
locked. Note also that the interrupt lock is present even on systems that
handle keyboard input from a kthread, due to the interrupt lock in
report_input_key().

I will reword this part of the patch. I don't have a deep understanding
of how the different ways of keyboard input work, and thanks for showing
me that there are other ways of getting keyboard input.

Currently in print_cpu() of kernel/sched/debug.c, sched_debug_lock is taken
with interrupts disabled for the whole duration of the calls to print_*_stats()
and print_rq(), which could last for quite some time if the information dump
happens on the serial console.

If the system has many cpus and the sched_debug_lock is somehow busy
(e.g. parallel sysrq-t), the system may hit a hard lockup panic, like

<snip>

The purpose of sched_debug_lock is to serialize the use of the global
cgroup_path[] buffer in print_cpu(). The rest of the printk() calls
don't need serialization from sched_debug_lock.

Calling printk() with interrupts disabled can still be
problematic. Allocating a stack buffer of PATH_MAX bytes is not
feasible. So a compromise solution is used where a small stack buffer
is allocated for the pathname. If the actual pathname is short enough, it
is copied to the stack buffer, with sched_debug_lock released afterward
before printk().  Otherwise, the global cgroup_path[] buffer will be
used with sched_debug_lock held until after printk().
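
The short-path/long-path split described above can be sketched in
userspace as follows. This is only an illustration, not the code from
the patch: a pthread mutex stands in for sched_debug_lock, fprintf()
stands in for printk(), and the names path_lock, shared_path,
SMALL_BUF_LEN and print_group_path() are hypothetical. The 128-byte
size is an assumption chosen for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define SHARED_PATH_MAX 4096 /* stands in for PATH_MAX */
#define SMALL_BUF_LEN   128  /* hypothetical size of the small stack buffer */

/* Stand-ins for sched_debug_lock and the global cgroup_path[] buffer. */
static pthread_mutex_t path_lock = PTHREAD_MUTEX_INITIALIZER;
static char shared_path[SHARED_PATH_MAX];

/*
 * Print one group path. Returns 1 if the path was short enough to be
 * printed from the stack copy with the lock already released, and 0 if
 * it had to be printed from the shared buffer with the lock still held.
 */
static int print_group_path(const char *path, FILE *out)
{
    char small[SMALL_BUF_LEN];
    int fits;

    assert(path != NULL);

    pthread_mutex_lock(&path_lock);
    /* Fill the shared buffer, as a path lookup would in the real code. */
    snprintf(shared_path, sizeof(shared_path), "%s", path);

    fits = strlen(shared_path) < sizeof(small);
    if (fits) {
        /* Short path: copy out, drop the lock, then do the slow print. */
        strcpy(small, shared_path);
        pthread_mutex_unlock(&path_lock);
        fprintf(out, "group %s\n", small);
    } else {
        /* Long path: the shared buffer must stay stable while printing. */
        fprintf(out, "group %s\n", shared_path);
        pthread_mutex_unlock(&path_lock);
    }
    return fits;
}
```

Calling print_group_path("/system.slice/foo", stdout) takes the first
branch, so the slow output call runs without the lock. Only a path of
SMALL_BUF_LEN bytes or more keeps the lock held across the print, which
is the case that can still be slow.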

Does this actually fix the problem in any circumstance except when the
sysrq is triggered using /proc/sysrq-trigger?

I have a reproducer that generates a hard lockup panic when there are
multiple instances of sysrq-t via /proc/sysrq-trigger. This is probably
less of a problem on the console, as I don't think we can do multiple
simultaneous sysrq-t there. Anyway, my goal is to limit the amount of
time that irq is disabled. Doing a printk can take a while depending on
whether there is contention in the underlying locks or resources. Even
if I limit the critical sections to just those printk() calls that output
the cgroup path, I can still cause the panic.


Cheers,
Longman

The approach used by this patch should minimize the chance of a panic
happening. However, if there are many tasks with very long cgroup paths,
I suppose that panic may still happen under some extreme conditions. So
I won't say this will completely fix the problem until the printk()
rework that makes printk work more like printk_deferred() is merged.
