Hi Paul, On Fri, Apr 27, 2018 at 8:57 AM, Paul E. McKenney <[email protected]> wrote: > On Thu, Apr 26, 2018 at 09:26:56PM -0700, Joel Fernandes wrote: >> In recent tests with IRQ on/off tracepoints, a large performance >> overhead ~10% is noticed when running hackbench. This is root caused to >> calls to rcu_irq_enter_irqson and rcu_irq_exit_irqson from the >> tracepoint code. Following a long discussion on the list [1] about this, >> we concluded that srcu is a better alternative for use during rcu idle. >> Although it does involve extra barriers, its lighter than the sched-rcu >> version which has to do additional RCU calls to notify RCU idle about >> entry into RCU sections. >> >> In this patch, we change the underlying implementation of the >> trace_*_rcuidle API to use SRCU. This has shown to improve performance >> alot for the high frequency irq enable/disable tracepoints. >> >> In the future, we can add a new may_sleep API which can use this >> infrastructure for callbacks that actually can sleep which will support >> Mathieu's usecase of blocking probes. >> >> Test: Tested idle and preempt/irq tracepoints. > > Looks good overall! One question and a few comments below. > > Thanx, Paul > >> [1] https://patchwork.kernel.org/patch/10344297/ >> >> Cc: Steven Rostedt <[email protected]> >> Cc: Peter Zilstra <[email protected]> >> Cc: Ingo Molnar <[email protected]> >> Cc: Mathieu Desnoyers <[email protected]> >> Cc: Tom Zanussi <[email protected]> >> Cc: Namhyung Kim <[email protected]> >> Cc: Thomas Glexiner <[email protected]> >> Cc: Boqun Feng <[email protected]> >> Cc: Paul McKenney <[email protected]> >> Cc: Frederic Weisbecker <[email protected]> >> Cc: Randy Dunlap <[email protected]> >> Cc: Masami Hiramatsu <[email protected]> >> Cc: Fenguang Wu <[email protected]> >> Cc: Baohong Liu <[email protected]> >> Cc: Vedang Patel <[email protected]> >> Cc: [email protected] >> Signed-off-by: Joel Fernandes <[email protected]> >> --- >> include/linux/tracepoint.h | 37 +++++++++++++++++++++++++++++-------- >> kernel/tracepoint.c | 10 +++++++++- >> 2 files changed, 38 insertions(+), 9 deletions(-) >> >> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h >> index c94f466d57ef..a1c1987de423 100644 >> --- a/include/linux/tracepoint.h >> +++ b/include/linux/tracepoint.h >> @@ -15,6 +15,7 @@ >> */ >> >> #include <linux/smp.h> >> +#include <linux/srcu.h> >> #include <linux/errno.h> >> #include <linux/types.h> >> #include <linux/cpumask.h> >> @@ -33,6 +34,8 @@ struct trace_eval_map { >> >> #define TRACEPOINT_DEFAULT_PRIO 10 >> >> +extern struct srcu_struct tracepoint_srcu; >> + >> extern int >> tracepoint_probe_register(struct tracepoint *tp, void *probe, void *data); >> extern int >> @@ -77,6 +80,7 @@ int unregister_tracepoint_module_notifier(struct >> notifier_block *nb) >> */ >> static inline void tracepoint_synchronize_unregister(void) >> { >> + synchronize_srcu(&tracepoint_srcu); >> synchronize_sched(); >> } >> >> @@ -129,18 +133,26 @@ extern void syscall_unregfunc(void); >> * as "(void *, void)". The DECLARE_TRACE_NOARGS() will pass in just >> * "void *data", where as the DECLARE_TRACE() will pass in "void *data, >> proto". >> */ >> -#define __DO_TRACE(tp, proto, args, cond, rcucheck) \ >> +#define __DO_TRACE(tp, proto, args, cond, preempt_on) >> \ >> do { \ >> struct tracepoint_func *it_func_ptr; \ >> void *it_func; \ >> void *__data; \ >> + int __maybe_unused idx = 0; \ >> \ >> if (!(cond)) \ >> return; \ >> - if (rcucheck) \ >> - rcu_irq_enter_irqson(); \ >> - rcu_read_lock_sched_notrace(); \ >> - it_func_ptr = rcu_dereference_sched((tp)->funcs); \ >> + if (preempt_on) { \ >> + WARN_ON_ONCE(in_nmi()); /* no srcu from nmi */ \ > > Very good on this check, thank you!
Sure thing :-) > >> + idx = srcu_read_lock(&tracepoint_srcu); \ > > Hmmm... Do I need to create a _notrace variant of srcu_read_lock() > and srcu_read_unlock()? That shouldn't be needed. For the rcu_read_lock_sched case, there is a preempt_disable which needs to be a notrace, but for the srcu one, since we don't do that, I think it should be fine. Thanks! - Joel

