On Thu, Jun 02, 2016 at 01:51:41PM -0300, Daniel Bristot de Oliveira wrote:
> It is not always easy to define the cause of an RCU stall just by
> analysing the RCU stall messages, mainly when the problem is caused
> by the indirect starvation of rcu threads. For example, when preempt_rcu
> is not awakened due to the starvation of a timer softirq.
>
> We have been hard coding panic() in the RCU stall functions for
> some time while testing the kernel-rt. But this is not possible in
> some scenarios, like when supporting customers.
>
> This patch implements the sysctl kernel.panic_on_rcu_stall. If
> set to 1, the system will panic() when an RCU stall takes place,
> enabling the capture of a vmcore. The vmcore provides a way to analyze
> all kernel/tasks states, helping out to point to the culprit and the
> solution for the stall.
>
> The kernel.panic_on_rcu_stall sysctl is disabled by default.
>
> Changes from v1:
> - Fixed a typo in the git log
> - The if(sysctl_panic_on_rcu_stall) panic() is in a static function
> - Fixed the CONFIG_TINY_RCU compilation issue
> - The var sysctl_panic_on_rcu_stall is now __read_mostly
>
> Cc: Jonathan Corbet <cor...@lwn.net>
> Cc: "Paul E. McKenney" <paul...@linux.vnet.ibm.com>
> Cc: Josh Triplett <j...@joshtriplett.org>
> Cc: Steven Rostedt <rost...@goodmis.org>
> Cc: Mathieu Desnoyers <mathieu.desnoy...@efficios.com>
> Cc: Lai Jiangshan <jiangshan...@gmail.com>
> Acked-by: Christian Borntraeger <borntrae...@de.ibm.com>
> Reviewed-by: Josh Triplett <j...@joshtriplett.org>
> Reviewed-by: Arnaldo Carvalho de Melo <a...@kernel.org>
> Tested-by: "Luis Claudio R. Goncalves" <lgonc...@redhat.com>
> Signed-off-by: Daniel Bristot de Oliveira <bris...@redhat.com>

Queued for testing and further review.

							Thanx, Paul

> ---
>  Documentation/sysctl/kernel.txt | 12 ++++++++++++
>  include/linux/kernel.h          |  1 +
>  kernel/rcu/tree.c               | 12 ++++++++++++
>  kernel/sysctl.c                 | 11 +++++++++++
>  4 files changed, 36 insertions(+)
>
> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
> index a3683ce..3320460 100644
> --- a/Documentation/sysctl/kernel.txt
> +++ b/Documentation/sysctl/kernel.txt
> @@ -58,6 +58,7 @@ show up in /proc/sys/kernel:
>  - panic_on_stackoverflow
>  - panic_on_unrecovered_nmi
>  - panic_on_warn
> +- panic_on_rcu_stall
>  - perf_cpu_time_max_percent
>  - perf_event_paranoid
>  - perf_event_max_stack
> @@ -618,6 +619,17 @@ a kernel rebuild when attempting to kdump at the location of a WARN().
>
>  ==============================================================
>
> +panic_on_rcu_stall:
> +
> +When set to 1, calls panic() after RCU stall detection messages. This
> +is useful to define the root cause of RCU stalls using a vmcore.
> +
> +0: do not panic() when RCU stall takes place, default behavior.
> +
> +1: panic() after printing RCU stall messages.
> +
> +==============================================================
> +
>  perf_cpu_time_max_percent:
>
>  Hints to the kernel how much CPU time it should be allowed to
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 94aa10f..c420821 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -451,6 +451,7 @@ extern int panic_on_oops;
>  extern int panic_on_unrecovered_nmi;
>  extern int panic_on_io_nmi;
>  extern int panic_on_warn;
> +extern int sysctl_panic_on_rcu_stall;
>  extern int sysctl_panic_on_stackoverflow;
>
>  extern bool crash_kexec_post_notifiers;
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index c7f1bc4..d531988 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -125,6 +125,8 @@ int rcu_num_lvls __read_mostly = RCU_NUM_LVLS;
>  /* Number of rcu_nodes at specified level. */
>  static int num_rcu_lvl[] = NUM_RCU_LVL_INIT;
>  int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */
> +/* panic() on RCU Stall sysctl. */
> +int sysctl_panic_on_rcu_stall __read_mostly;
>
>  /*
>   * The rcu_scheduler_active variable transitions from zero to one just
> @@ -1311,6 +1313,12 @@ static void rcu_stall_kick_kthreads(struct rcu_state *rsp)
>  	}
>  }
>
> +static inline void panic_on_rcu_stall(void)
> +{
> +	if (sysctl_panic_on_rcu_stall)
> +		panic("RCU Stall\n");
> +}
> +
>  static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
>  {
>  	int cpu;
> @@ -1390,6 +1398,8 @@ static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
>
>  	rcu_check_gp_kthread_starvation(rsp);
>
> +	panic_on_rcu_stall();
> +
>  	force_quiescent_state(rsp);  /* Kick them all. */
>  }
>
> @@ -1430,6 +1440,8 @@ static void print_cpu_stall(struct rcu_state *rsp)
>  			   jiffies + 3 * rcu_jiffies_till_stall_check() + 3);
>  	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
>
> +	panic_on_rcu_stall();
> +
>  	/*
>  	 * Attempt to revive the RCU machinery by forcing a context switch.
>  	 *
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 87b2fc3..35f0dcb 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -1205,6 +1205,17 @@ static struct ctl_table kern_table[] = {
>  		.extra2		= &one,
>  	},
>  #endif
> +#if defined(CONFIG_TREE_RCU) || defined(CONFIG_PREEMPT_RCU)
> +	{
> +		.procname	= "panic_on_rcu_stall",
> +		.data		= &sysctl_panic_on_rcu_stall,
> +		.maxlen		= sizeof(sysctl_panic_on_rcu_stall),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= &zero,
> +		.extra2		= &one,
> +	},
> +#endif
>  	{ }
>  };
>
> --
> 2.5.5
>
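
For readers following the thread, here is a rough userspace sketch of the
behavior the patch wires up: a 0/1 knob, writes outside that range rejected
(as I understand proc_dointvec_minmax to do with extra1 = &zero and
extra2 = &one), and a gate that decides whether the stall path would panic.
This is not kernel code. set_panic_on_rcu_stall() and
would_panic_on_rcu_stall() are hypothetical names made up for illustration,
and the variable is only a stand-in for the kernel one of the same name.

```c
#include <assert.h>
#include <errno.h>

/* Userspace stand-in for the kernel variable; 0 (disabled) by default. */
static int sysctl_panic_on_rcu_stall;

/*
 * Hypothetical model of a write to kernel.panic_on_rcu_stall.
 * Assumption: values outside [0, 1] are refused with -EINVAL and the
 * stored value is left unchanged.
 */
int set_panic_on_rcu_stall(int val)
{
	if (val < 0 || val > 1)
		return -EINVAL;
	sysctl_panic_on_rcu_stall = val;
	return 0;
}

/*
 * Hypothetical model of the panic_on_rcu_stall() gate from the patch:
 * returns 1 where the kernel would call panic("RCU Stall\n") after the
 * stall messages have been printed, 0 where it would carry on.
 */
int would_panic_on_rcu_stall(void)
{
	return sysctl_panic_on_rcu_stall != 0;
}
```

On a kernel carrying the patch, the real knob would be set from userspace
with something like "sysctl -w kernel.panic_on_rcu_stall=1" (or by writing
to /proc/sys/kernel/panic_on_rcu_stall); a configured kdump is what turns
the resulting panic into a vmcore.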