Re: [PATCH] context_tracking: Add comments on interface and internals
2013/1/16 Namhyung Kim :
> Hi Frederic,
>
> On Wed, 16 Jan 2013 13:32:57 +0100, Frederic Weisbecker wrote:
>> This subsystem lacks many explanations on its purpose and
>> design. Add these missing comments.
>>
>> v3: Fix the "hook" based naming as per Ingo's suggestion
> [snip]
>> +/**
>> + * context_tracking_task_switch - context switch the syscall callbacks
>
> To be more kernel-doc-friendly, it'd be better to add descriptions for
> the arguments too:
>
> @prev: the task that is being switched out
> @next: the task we are going to switch to

Right, I'm fixing that.

Thanks.
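For reference, the kernel-doc header with Namhyung's suggested argument descriptions folded in would look roughly like this (the comment text is taken from the v3 patch above; this is an illustration, not the final committed form):

/**
 * context_tracking_task_switch - context switch the syscall callbacks
 * @prev: the task that is being switched out
 * @next: the task we are going to switch to
 *
 * The context tracking uses the syscall slow path to implement its user-kernel
 * boundaries probes on syscalls. This way it doesn't impact the syscall fast
 * path on CPUs that don't do context tracking.
 *
 * But we need to clear the flag on the previous task because it may later
 * migrate to some CPU that doesn't do the context tracking. As such the TIF
 * flag may not be desired there.
 */
void context_tracking_task_switch(struct task_struct *prev,
				  struct task_struct *next);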
Re: [PATCH] context_tracking: Add comments on interface and internals
Hi Frederic,

On Wed, 16 Jan 2013 13:32:57 +0100, Frederic Weisbecker wrote:
> This subsystem lacks many explanations on its purpose and
> design. Add these missing comments.
>
> v3: Fix the "hook" based naming as per Ingo's suggestion
[snip]
> +/**
> + * context_tracking_task_switch - context switch the syscall callbacks

To be more kernel-doc-friendly, it'd be better to add descriptions for
the arguments too:

@prev: the task that is being switched out
@next: the task we are going to switch to

Thanks,
Namhyung

> + *
> + * The context tracking uses the syscall slow path to implement its user-kernel
> + * boundaries probes on syscalls. This way it doesn't impact the syscall fast
> + * path on CPUs that don't do context tracking.
> + *
> + * But we need to clear the flag on the previous task because it may later
> + * migrate to some CPU that doesn't do the context tracking. As such the TIF
> + * flag may not be desired there.
> + */
> void context_tracking_task_switch(struct task_struct *prev,
>				     struct task_struct *next)
> {
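The body of context_tracking_task_switch() is cut off in the quote above. As context for the comment about clearing the flag on the previous task, a minimal sketch of the hand-off it describes (as it would sit inside kernel/context_tracking.c next to the per-cpu context_tracking variable) might look like the following; TIF_NOHZ and the guard on context_tracking.active are assumptions here, not taken from the quoted patch:

/* Sketch only: move the context tracking TIF flag from the task being
 * scheduled out to the task being scheduled in, so the syscall slow path
 * keeps firing for whatever runs next on this CPU and stops firing for
 * prev, which may migrate to a CPU that doesn't do context tracking. */
void context_tracking_task_switch(struct task_struct *prev,
				  struct task_struct *next)
{
	if (__this_cpu_read(context_tracking.active)) {
		clear_tsk_thread_flag(prev, TIF_NOHZ);	/* assumed flag name */
		set_tsk_thread_flag(next, TIF_NOHZ);
	}
}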
[PATCH] context_tracking: Add comments on interface and internals
This subsystem lacks many explanations on its purpose and design. Add these missing comments. v3: Fix the "hook" based naming as per Ingo's suggestion Reported-by: Andrew Morton Signed-off-by: Frederic Weisbecker Cc: Alessio Igor Bogani Cc: Andrew Morton Cc: Chris Metcalf Cc: Christoph Lameter Cc: Geoff Levand Cc: Gilad Ben Yossef Cc: Hakan Akkan Cc: Ingo Molnar Cc: Li Zhong Cc: Namhyung Kim Cc: Paul E. McKenney Cc: Paul Gortmaker Cc: Peter Zijlstra Cc: Steven Rostedt Cc: Thomas Gleixner --- kernel/context_tracking.c | 73 ++-- 1 files changed, 63 insertions(+), 10 deletions(-) diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c index e0e07fd..4b360b4 100644 --- a/kernel/context_tracking.c +++ b/kernel/context_tracking.c @@ -1,3 +1,19 @@ +/* + * Context tracking: Probe on high level context boundaries such as kernel + * and userspace. This includes syscalls and exceptions entry/exit. + * + * This is used by RCU to remove its dependency on the timer tick while a CPU + * runs in userspace. + * + * Started by Frederic Weisbecker: + * + * Copyright (C) 2012 Red Hat, Inc., Frederic Weisbecker + * + * Many thanks to Gilad Ben-Yossef, Paul McKenney, Ingo Molnar, Andrew Morton, + * Steven Rostedt, Peter Zijlstra for suggestions and improvements. + * + */ + #include #include #include @@ -6,8 +22,8 @@ struct context_tracking { /* -* When active is false, hooks are not set to -* minimize overhead: TIF flags are cleared +* When active is false, probes are unset in order +* to minimize overhead: TIF flags are cleared * and calls to user_enter/exit are ignored. This * may be further optimized using static keys. */ @@ -24,6 +40,15 @@ static DEFINE_PER_CPU(struct context_tracking, context_tracking) = { #endif }; +/** + * user_enter - Inform the context tracking that the CPU is going to + * enter userspace mode. + * + * This function must be called right before we switch from the kernel + * to userspace, when it's guaranteed the remaining kernel instructions + * to execute won't use any RCU read side critical section because this + * function sets RCU in extended quiescent state. + */ void user_enter(void) { unsigned long flags; @@ -39,40 +64,68 @@ void user_enter(void) if (in_interrupt()) return; + /* Kernel threads aren't supposed to go to userspace */ WARN_ON_ONCE(!current->mm); local_irq_save(flags); if (__this_cpu_read(context_tracking.active) && __this_cpu_read(context_tracking.state) != IN_USER) { __this_cpu_write(context_tracking.state, IN_USER); + /* +* At this stage, only low level arch entry code remains and +* then we'll run in userspace. We can assume there won't be +* any RCU read-side critical section until the next call to +* user_exit() or rcu_irq_enter(). Let's remove RCU's dependency +* on the tick. +*/ rcu_user_enter(); } local_irq_restore(flags); } + +/** + * user_exit - Inform the context tracking that the CPU is + * exiting userspace mode and entering the kernel. + * + * This function must be called after we entered the kernel from userspace + * before any use of RCU read side critical section. This potentially include + * any high level kernel code like syscalls, exceptions, signal handling, etc... + * + * This call supports re-entrancy. This way it can be called from any exception + * handler without needing to know if we came from userspace or not. 
+ */ void user_exit(void) { unsigned long flags; - /* -* Some contexts may involve an exception occuring in an irq, -* leading to that nesting: -* rcu_irq_enter() rcu_user_exit() rcu_user_exit() rcu_irq_exit() -* This would mess up the dyntick_nesting count though. And rcu_irq_*() -* helpers are enough to protect RCU uses inside the exception. So -* just return immediately if we detect we are in an IRQ. -*/ if (in_interrupt()) return; local_irq_save(flags); if (__this_cpu_read(context_tracking.state) == IN_USER) { __this_cpu_write(context_tracking.state, IN_KERNEL); + /* +* We are going to run code that may use RCU. Inform +* RCU core about that (ie: we may need the tick again). +*/ rcu_user_exit(); } local_irq_restore(flags); } + +/** + * context_tracking_task_switch - context switch the syscall callbacks + * + * The context tracking uses the syscall slow path to implement its user-kernel + * boundaries probes on syscalls. This way it doesn't impact the syscall fast + * path on CPUs that don't do context tra
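To make the intended call sites of the two probes concrete, here is a hedged sketch of how an architecture's syscall slow path could bracket its work with user_exit()/user_enter(); the arch_syscall_trace_enter()/arch_syscall_trace_leave() names are invented for illustration and are not part of the patch:

#include <linux/context_tracking.h>
#include <linux/ptrace.h>

/* Hypothetical arch glue: the syscall slow path calls user_exit() as soon
 * as it enters the kernel and user_enter() right before resuming userspace,
 * leaving the syscall fast path untouched. */
void arch_syscall_trace_enter(struct pt_regs *regs)
{
	user_exit();	/* we just left userspace: RCU may be used again */
	/* ... existing entry work: tracing, audit, seccomp, ... */
}

void arch_syscall_trace_leave(struct pt_regs *regs)
{
	/* ... existing exit work ... */
	user_enter();	/* about to resume userspace: enter RCU's extended QS */
}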
Re: [PATCH] context_tracking: Add comments on interface and internals
2012/12/14 Andrew Morton :
> On Thu, 13 Dec 2012 23:50:23 +0100
> Frederic Weisbecker wrote:
>
>> >
>> >> + * This call supports re-entrancy.
>> >
>> > Presumably the explanation for user_exit() applies here.
>>
>> Not sure what you mean here.
>
> It's unclear what it means to say "user_enter() supports reentrancy".
> I mean, zillions of kernel functions are surely reentrant - so what?
> It appears that you had something in mind when pointing this out, but
> what was it? The comment over user_exit() appears to tell us.

Ah ok. Yeah indeed, the fact user_exit() is reentrant is very important
because I have precise usecases in mind. For user_enter() I don't, so
probably I don't need to inform about it.

>
>> > It's mainly this bit which makes me wonder why the code is in lib/. Is
>> > there any conceivable prospect that any other subsystem will use this
>> > code for anything?
>>
>> So that's because of that cputime accounting on dynticks CPUs which
>> will need to know about user/kernel transitions. I'm preparing that
>> for the 3.9 merge window.
>
> Oh. That's really the entire reason for the patch and should have been
> in the changelog!

I mentioned it in the changelog: commit 91d1aa43d30505b0b825db8898ffc80a8eca96c7
"context_tracking: New context tracking susbsystem"

"""
We need to pull this up from RCU into this new level of indirection
because this tracking is also going to be used to implement an "on
demand" generic virtual cputime accounting. A necessary step to
shutdown the tick while still accounting the cputime.
"""

Another reason, more implicit this time, was to avoid having RCU handle
those reentrancy things and context tracking all around by itself.
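The "precise usecases" for user_exit() re-entrancy can be illustrated with a hypothetical trap handler (the handler and its name are made up for this sketch; user_mode() is the usual arch helper):

#include <linux/context_tracking.h>
#include <linux/ptrace.h>

/* Hypothetical: because user_exit() is re-entrant and a no-op when the CPU
 * is already marked IN_KERNEL, an exception handler can call it without
 * first checking whether it interrupted userspace or the kernel. */
void arch_do_some_trap(struct pt_regs *regs)
{
	user_exit();			/* safe in either case */

	/* ... handle the trap; RCU read-side sections are legal here ... */

	if (user_mode(regs))		/* exception came from userspace? */
		user_enter();		/* then re-enter the extended QS on exit */
}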
Re: [PATCH] context_tracking: Add comments on interface and internals
On Thu, 13 Dec 2012 23:50:23 +0100
Frederic Weisbecker wrote:

> >
> >> + * This call supports re-entrancy.
> >
> > Presumably the explanation for user_exit() applies here.
>
> Not sure what you mean here.

It's unclear what it means to say "user_enter() supports reentrancy".
I mean, zillions of kernel functions are surely reentrant - so what?
It appears that you had something in mind when pointing this out, but
what was it? The comment over user_exit() appears to tell us.

> > It's mainly this bit which makes me wonder why the code is in lib/. Is
> > there any conceivable prospect that any other subsystem will use this
> > code for anything?
>
> So that's because of that cputime accounting on dynticks CPUs which
> will need to know about user/kernel transitions. I'm preparing that
> for the 3.9 merge window.

Oh. That's really the entire reason for the patch and should have been
in the changelog!
Re: [PATCH] context_tracking: Add comments on interface and internals
2012/12/13 Andrew Morton : > On Thu, 13 Dec 2012 21:57:05 +0100 > Frederic Weisbecker wrote: > >> This subsystem lacks many explanations on its purpose and >> design. Add these missing comments. > > Thanks, it helps. > >> --- a/kernel/context_tracking.c >> +++ b/kernel/context_tracking.c >> @@ -1,3 +1,19 @@ >> +/* >> + * Context tracking: Probe on high level context boundaries such as kernel >> + * and userspace. This includes syscalls and exceptions entry/exit. >> + * >> + * This is used by RCU to remove its dependency to the timer tick while a >> CPU >> + * runs in userspace. > > "on the timer tick" Oops, will fix, along with the other spelling issues you reported. > >> + * >> + * Started by Frederic Weisbecker: >> + * >> + * Copyright (C) 2012 Red Hat, Inc., Frederic Weisbecker >> >> + * >> + * Many thanks to Gilad Ben-Yossef, Paul McKenney, Ingo Molnar, Andrew >> Morton, >> + * Steven Rostedt, Peter Zijlstra for suggestions and improvements. >> + * >> + */ >> + >> #include >> #include >> #include >> @@ -6,8 +22,8 @@ >> >> struct context_tracking { >> /* >> - * When active is false, hooks are not set to >> - * minimize overhead: TIF flags are cleared >> + * When active is false, hooks are unset in order >> + * to minimize overhead: TIF flags are cleared >>* and calls to user_enter/exit are ignored. This >>* may be further optimized using static keys. >>*/ >> @@ -24,6 +40,16 @@ static DEFINE_PER_CPU(struct context_tracking, >> context_tracking) = { >> #endif >> }; >> >> +/** >> + * user_enter - Inform the context tracking that the CPU is going to >> + * enter in userspace mode. > > s/in // > >> + * >> + * This function must be called right before we switch from the kernel >> + * to the user space, when the last remaining kernel instructions to execute > > s/the user space/userspace/ > >> + * are low level arch code that perform the resuming to userspace. > > This is a bit vague - what is "right before"? What happens if this is > done a few instructions early? I mean, what exactly is the requirement > here? Might it be something like "after the last rcu_foo operation"? > > IOW, if the call to user_enter() were moved earlier and earlier, at > what point would the kernel gain a bug? What caused that bug? That's indeed too vague. So as long as RCU is the only user of this, the only rule is: call user_enter() when you're about to resume in userspace and you're sure there will be no use of RCU until we return to the kernel. Here the precision on when to call that wrt. kernel -> user transition step doesn't matter much. This is only about RCU usage correctness. Now this context tracking will soon be used by the cputime subsystem in order to implement a generic tickless cputime accounting. The precision induced by the probe location in kernel/user transition will have an effect on cputime accounting precision. But even there this shouldn't matter much because this will have a per-jiffies granularity. This may evolve in the future with a nanosec granularity but then it will be up to archs to place the probes closer to the real kernel/user boundaries. Anyway, I'll comment on the RCU requirement for now and extend the comments to explain the cputime precision issue when I'll add the cputime bits. > >> + * This call supports re-entrancy. > > Presumably the explanation for user_exit() applies here. Not sure what you mean here. 
> >> + */ >> void user_enter(void) >> { >> unsigned long flags; >> @@ -39,40 +65,68 @@ void user_enter(void) >> if (in_interrupt()) >> return; >> >> + /* Kernel thread aren't supposed to go to userspace */ > > s/thread/threads/ > >> WARN_ON_ONCE(!current->mm); >> >> local_irq_save(flags); >> if (__this_cpu_read(context_tracking.active) && >> __this_cpu_read(context_tracking.state) != IN_USER) { >> __this_cpu_write(context_tracking.state, IN_USER); >> + /* >> + * At this stage, only low level arch entry code remains and >> + * then we'll run in userspace. We can assume there won't we > > s/we/be/ > >> + * any RCU read-side critical section until the next call to >> + * user_exit() or rcu_irq_enter(). Let's remove RCU's >> dependency >> + * on the tick. >> + */ >> rcu_user_enter(); >> } >> local_irq_restore(flags); >> } >> >> + >> +/** >> + * user_exit - Inform the context tracking that the CPU is >> + * exiting userspace mode and entering the kernel. >> + * >> + * This function must be called right before we run any high level kernel >> + * code (ie: anything that is not low level arch entry code) after we >> entered >> + * the kernel from userspace. > > Also a very vague spec. You're right, as for user_enter(), I'll insist on the RCU and cputime requirements. [...] >> +/** >> + * context_tracking_task_
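Frederic's rule above ("call user_enter() when you're about to resume in userspace and you're sure there will be no use of RCU until we return to the kernel") can be sketched as follows; every helper except user_enter(), rcu_read_lock() and rcu_read_unlock() is hypothetical:

#include <linux/context_tracking.h>
#include <linux/rcupdate.h>

static void do_last_rcu_protected_work(void);	/* hypothetical */
static void arch_low_level_resume_user(void);	/* hypothetical */

/* The exact placement of user_enter() on the return path doesn't matter
 * much, as long as no RCU read-side critical section runs after it. */
static void hypothetical_return_to_user(void)
{
	rcu_read_lock();
	do_last_rcu_protected_work();	/* last RCU-protected kernel work */
	rcu_read_unlock();

	user_enter();			/* RCU enters its extended quiescent state */

	arch_low_level_resume_user();	/* must not use RCU from here on */
}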
Re: [PATCH] context_tracking: Add comments on interface and internals
On Thu, 13 Dec 2012 21:57:05 +0100 Frederic Weisbecker wrote: > This subsystem lacks many explanations on its purpose and > design. Add these missing comments. Thanks, it helps. > --- a/kernel/context_tracking.c > +++ b/kernel/context_tracking.c > @@ -1,3 +1,19 @@ > +/* > + * Context tracking: Probe on high level context boundaries such as kernel > + * and userspace. This includes syscalls and exceptions entry/exit. > + * > + * This is used by RCU to remove its dependency to the timer tick while a CPU > + * runs in userspace. "on the timer tick" > + * > + * Started by Frederic Weisbecker: > + * > + * Copyright (C) 2012 Red Hat, Inc., Frederic Weisbecker > > + * > + * Many thanks to Gilad Ben-Yossef, Paul McKenney, Ingo Molnar, Andrew > Morton, > + * Steven Rostedt, Peter Zijlstra for suggestions and improvements. > + * > + */ > + > #include > #include > #include > @@ -6,8 +22,8 @@ > > struct context_tracking { > /* > - * When active is false, hooks are not set to > - * minimize overhead: TIF flags are cleared > + * When active is false, hooks are unset in order > + * to minimize overhead: TIF flags are cleared >* and calls to user_enter/exit are ignored. This >* may be further optimized using static keys. >*/ > @@ -24,6 +40,16 @@ static DEFINE_PER_CPU(struct context_tracking, > context_tracking) = { > #endif > }; > > +/** > + * user_enter - Inform the context tracking that the CPU is going to > + * enter in userspace mode. s/in // > + * > + * This function must be called right before we switch from the kernel > + * to the user space, when the last remaining kernel instructions to execute s/the user space/userspace/ > + * are low level arch code that perform the resuming to userspace. This is a bit vague - what is "right before"? What happens if this is done a few instructions early? I mean, what exactly is the requirement here? Might it be something like "after the last rcu_foo operation"? IOW, if the call to user_enter() were moved earlier and earlier, at what point would the kernel gain a bug? What caused that bug? > + * This call supports re-entrancy. Presumably the explanation for user_exit() applies here. > + */ > void user_enter(void) > { > unsigned long flags; > @@ -39,40 +65,68 @@ void user_enter(void) > if (in_interrupt()) > return; > > + /* Kernel thread aren't supposed to go to userspace */ s/thread/threads/ > WARN_ON_ONCE(!current->mm); > > local_irq_save(flags); > if (__this_cpu_read(context_tracking.active) && > __this_cpu_read(context_tracking.state) != IN_USER) { > __this_cpu_write(context_tracking.state, IN_USER); > + /* > + * At this stage, only low level arch entry code remains and > + * then we'll run in userspace. We can assume there won't we s/we/be/ > + * any RCU read-side critical section until the next call to > + * user_exit() or rcu_irq_enter(). Let's remove RCU's dependency > + * on the tick. > + */ > rcu_user_enter(); > } > local_irq_restore(flags); > } > > + > +/** > + * user_exit - Inform the context tracking that the CPU is > + * exiting userspace mode and entering the kernel. > + * > + * This function must be called right before we run any high level kernel > + * code (ie: anything that is not low level arch entry code) after we entered > + * the kernel from userspace. Also a very vague spec. > + * This call supports re-entrancy. This way it can be called from any > exception > + * handler without bothering to know if we come from userspace or not. 
s/bothering/needing/ s/come/came/ > + */ > void user_exit(void) > { > unsigned long flags; > > - /* > - * Some contexts may involve an exception occuring in an irq, > - * leading to that nesting: > - * rcu_irq_enter() rcu_user_exit() rcu_user_exit() rcu_irq_exit() > - * This would mess up the dyntick_nesting count though. And rcu_irq_*() > - * helpers are enough to protect RCU uses inside the exception. So > - * just return immediately if we detect we are in an IRQ. > - */ > if (in_interrupt()) > return; > > local_irq_save(flags); > if (__this_cpu_read(context_tracking.state) == IN_USER) { > __this_cpu_write(context_tracking.state, IN_KERNEL); > + /* > + * We are going to run code that may use RCU. Inform > + * RCU core about that (ie: we may need the tick again). > + */ > rcu_user_exit(); > } > local_irq_restore(flags); > } > > + > +/** > + * context_tracking_task_switch - context switch the syscall hooks > + * > + * The context tracking uses the syscall slow path to implement its > user-kernel > + * boundaries hooks on syscalls. This way it
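Andrew's question above (at what point would moving user_enter() earlier introduce a bug?) can be answered with a hypothetical counter-example: any RCU read-side critical section that ends up running after user_enter() executes inside an extended quiescent state and is no longer protected.

#include <linux/context_tracking.h>
#include <linux/rcupdate.h>

static void some_rcu_protected_work(void);	/* hypothetical */

/* Hypothetical, deliberately broken ordering. */
static void broken_return_to_user(void)
{
	user_enter();			/* RCU now treats this CPU as quiescent */

	rcu_read_lock();		/* BUG: read side inside an extended QS; */
	some_rcu_protected_work();	/* lockdep-RCU is expected to complain  */
	rcu_read_unlock();
}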
[PATCH] context_tracking: Add comments on interface and internals
This subsystem lacks many explanations on its purpose and design. Add these missing comments. Reported-by: Andrew Morton Signed-off-by: Frederic Weisbecker Cc: Gilad Ben-Yossef Cc: Thomas Gleixner Cc: Andrew Morton Cc: Paul E. McKenney Cc: Ingo Molnar Cc: Steven Rostedt Cc: Peter Zijlstra Cc: Li Zhong --- kernel/context_tracking.c | 74 ++-- 1 files changed, 64 insertions(+), 10 deletions(-) diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c index e0e07fd..f146b27 100644 --- a/kernel/context_tracking.c +++ b/kernel/context_tracking.c @@ -1,3 +1,19 @@ +/* + * Context tracking: Probe on high level context boundaries such as kernel + * and userspace. This includes syscalls and exceptions entry/exit. + * + * This is used by RCU to remove its dependency to the timer tick while a CPU + * runs in userspace. + * + * Started by Frederic Weisbecker: + * + * Copyright (C) 2012 Red Hat, Inc., Frederic Weisbecker + * + * Many thanks to Gilad Ben-Yossef, Paul McKenney, Ingo Molnar, Andrew Morton, + * Steven Rostedt, Peter Zijlstra for suggestions and improvements. + * + */ + #include #include #include @@ -6,8 +22,8 @@ struct context_tracking { /* -* When active is false, hooks are not set to -* minimize overhead: TIF flags are cleared +* When active is false, hooks are unset in order +* to minimize overhead: TIF flags are cleared * and calls to user_enter/exit are ignored. This * may be further optimized using static keys. */ @@ -24,6 +40,16 @@ static DEFINE_PER_CPU(struct context_tracking, context_tracking) = { #endif }; +/** + * user_enter - Inform the context tracking that the CPU is going to + * enter in userspace mode. + * + * This function must be called right before we switch from the kernel + * to the user space, when the last remaining kernel instructions to execute + * are low level arch code that perform the resuming to userspace. + * + * This call supports re-entrancy. + */ void user_enter(void) { unsigned long flags; @@ -39,40 +65,68 @@ void user_enter(void) if (in_interrupt()) return; + /* Kernel thread aren't supposed to go to userspace */ WARN_ON_ONCE(!current->mm); local_irq_save(flags); if (__this_cpu_read(context_tracking.active) && __this_cpu_read(context_tracking.state) != IN_USER) { __this_cpu_write(context_tracking.state, IN_USER); + /* +* At this stage, only low level arch entry code remains and +* then we'll run in userspace. We can assume there won't we +* any RCU read-side critical section until the next call to +* user_exit() or rcu_irq_enter(). Let's remove RCU's dependency +* on the tick. +*/ rcu_user_enter(); } local_irq_restore(flags); } + +/** + * user_exit - Inform the context tracking that the CPU is + * exiting userspace mode and entering the kernel. + * + * This function must be called right before we run any high level kernel + * code (ie: anything that is not low level arch entry code) after we entered + * the kernel from userspace. + * + * This call supports re-entrancy. This way it can be called from any exception + * handler without bothering to know if we come from userspace or not. + */ void user_exit(void) { unsigned long flags; - /* -* Some contexts may involve an exception occuring in an irq, -* leading to that nesting: -* rcu_irq_enter() rcu_user_exit() rcu_user_exit() rcu_irq_exit() -* This would mess up the dyntick_nesting count though. And rcu_irq_*() -* helpers are enough to protect RCU uses inside the exception. So -* just return immediately if we detect we are in an IRQ. 
-*/ if (in_interrupt()) return; local_irq_save(flags); if (__this_cpu_read(context_tracking.state) == IN_USER) { __this_cpu_write(context_tracking.state, IN_KERNEL); + /* +* We are going to run code that may use RCU. Inform +* RCU core about that (ie: we may need the tick again). +*/ rcu_user_exit(); } local_irq_restore(flags); } + +/** + * context_tracking_task_switch - context switch the syscall hooks + * + * The context tracking uses the syscall slow path to implement its user-kernel + * boundaries hooks on syscalls. This way it doesn't impact the syscall fast + * path on CPUs that don't do context tracking. + * + * But we need to clear the flag on the previous task because it may later + * migrate to some CPU that doesn't do the context tracking. As such the TIF + * flag may not be desired there. + */ void context_tracking_task_switch(struct task_struct *prev