Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On August 8, 2018 6:47:16 PM EDT, "Paul E. McKenney" wrote:
>On Wed, Aug 08, 2018 at 03:15:31PM -0700, Joel Fernandes wrote:
>> On Wed, Aug 8, 2018 at 1:18 PM, Paul E. McKenney wrote:
>> [...]
>> >> >> >> It does start to seem like a show stopper :-(
>> >> >> >
>> >> >> > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
>> >> >> > be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not sure
>> >> >> > whether this would be fast enough to be useful, but easy to provide:
>> >> >> >
>> >> >> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
>> >> >> > {
>> >> >> >         int idx;
>> >> >> >
>> >> >> >         idx = READ_ONCE(sp->srcu_idx) & 0x1;
>> >> >> >         atomic_inc(&sp->sda->srcu_lock_count[idx]);
>> >> >> >         smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
>> >> >> >         return idx;
>> >> >> > }
>> >> >> >
>> >> >> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
>> >> >> > {
>> >> >> >         smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
>> >> >> >         atomic_inc(&sp->sda->srcu_unlock_count[idx]);
>> >> >> > }
>> >> >> >
>> >> >> > With appropriate adjustments to allow Tiny RCU to also work.
>> >> >> >
>> >> >> > Note that you have to use _nmi() everywhere, not just in NMI handlers.
>> >> >> > In fact, the NMI handlers are the one place you -don't- need to use
>> >> >> > _nmi(), strangely enough.
>> >> >> >
>> >> >> > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
>> >> >> > some architectures, for example.
>> >> >>
>> >> >> Continuing Steve's question on regular interrupts, do we need to use
>> >> >> this atomic_inc API for regular interrupts as well?  So I guess
>> >> >
>> >> > If NMIs use one srcu_struct and non-NMI uses another, the current
>> >> > srcu_read_lock() and srcu_read_unlock() will work just fine.  If any given
>> >> > srcu_struct needs both NMI and non-NMI readers, then we really do need
>> >> > __srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that srcu_struct.
>> >>
>> >> Yes, I believe as long as in_nmi() works reliably, we can use the
>> >> right srcu_struct (NMI vs non-NMI) and it would be fine.
>> >>
>> >> Going through this thread, it sounds though that this_cpu_inc may not
>> >> be reliable on all architectures even for non-NMI interrupts, and
>> >> local_inc may be the way to go.
>> >
>> > My understanding is that this_cpu_inc() is defined to handle interrupts,
>> > so any architecture on which it is unreliable needs to fix its bug.  ;-)
>>
>> Yes, that's my understanding as well.
>>
>> Then maybe I'm missing something about your/Steve's conversation in
>> the morning about why we need to bother with the local_inc then.  So the
>> current SRCU code with the separate NMI handling should work fine (for
>> future merge windows) as long as we're using a separate srcu_struct
>> for NMI. :-)
>
>I do believe that to be true.  But only as long as that separate
>srcu_struct is -only- used for NMI.
>
>If this is what you have been pushing for all along, please accept my
>apologies for my being slow!

That's ok, sorry I initially didn't describe it well, which may have
caused confusion, but yes, that's what I was pushing for.

>That said, your approach does require you to have a perfect way to
>distinguish between NMI and not-NMI.  If the distinguishing is even
>in the slightest imperfect, then some sort of NMI-safe SRCU reader
>approach is of course required.

Thanks Paul, agreed with everything, and we are on the same page.

- Joel

>							Thanx, Paul
>
>> >> For next merge window (not this one), let's do that then?  Paul, if you
>> >> could provide me an SRCU API that uses local_inc, then I believe that,
>> >> coupled with this patch, should be all that's needed:
>> >> https://lore.kernel.org/patchwork/patch/972657/
>> >>
>> >> Steve did express concern though about whether in_nmi() works reliably
>> >> (i.e. the tracepoint doesn't fire from "thunk" code before in_nmi() is
>> >> available).  Any thoughts on that, Steve?
>> >
>> > Agreed, not the upcoming merge window.  But we do need to work out
>> > exactly what is the right way to do this.
>>
>> Agreed, thanks!
>>
>> - Joel
>>

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 08, 2018 at 03:15:31PM -0700, Joel Fernandes wrote:
> On Wed, Aug 8, 2018 at 1:18 PM, Paul E. McKenney wrote:
> [...]
> >> >> >> It does start to seem like a show stopper :-(
> >> >> >
> >> >> > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
> >> >> > be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not sure
> >> >> > whether this would be fast enough to be useful, but easy to provide:
> >> >> >
> >> >> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> >> >> > {
> >> >> >         int idx;
> >> >> >
> >> >> >         idx = READ_ONCE(sp->srcu_idx) & 0x1;
> >> >> >         atomic_inc(&sp->sda->srcu_lock_count[idx]);
> >> >> >         smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
> >> >> >         return idx;
> >> >> > }
> >> >> >
> >> >> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> >> >> > {
> >> >> >         smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
> >> >> >         atomic_inc(&sp->sda->srcu_unlock_count[idx]);
> >> >> > }
> >> >> >
> >> >> > With appropriate adjustments to allow Tiny RCU to also work.
> >> >> >
> >> >> > Note that you have to use _nmi() everywhere, not just in NMI handlers.
> >> >> > In fact, the NMI handlers are the one place you -don't- need to use
> >> >> > _nmi(), strangely enough.
> >> >> >
> >> >> > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
> >> >> > some architectures, for example.
> >> >>
> >> >> Continuing Steve's question on regular interrupts, do we need to use
> >> >> this atomic_inc API for regular interrupts as well?  So I guess
> >> >
> >> > If NMIs use one srcu_struct and non-NMI uses another, the current
> >> > srcu_read_lock() and srcu_read_unlock() will work just fine.  If any given
> >> > srcu_struct needs both NMI and non-NMI readers, then we really do need
> >> > __srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that srcu_struct.
> >>
> >> Yes, I believe as long as in_nmi() works reliably, we can use the
> >> right srcu_struct (NMI vs non-NMI) and it would be fine.
> >>
> >> Going through this thread, it sounds though that this_cpu_inc may not
> >> be reliable on all architectures even for non-NMI interrupts, and
> >> local_inc may be the way to go.
> >
> > My understanding is that this_cpu_inc() is defined to handle interrupts,
> > so any architecture on which it is unreliable needs to fix its bug.  ;-)
>
> Yes, that's my understanding as well.
>
> Then maybe I'm missing something about your/Steve's conversation in
> the morning about why we need to bother with the local_inc then.  So the
> current SRCU code with the separate NMI handling should work fine (for
> future merge windows) as long as we're using a separate srcu_struct
> for NMI. :-)

I do believe that to be true.  But only as long as that separate
srcu_struct is -only- used for NMI.

If this is what you have been pushing for all along, please accept my
apologies for my being slow!

That said, your approach does require you to have a perfect way to
distinguish between NMI and not-NMI.  If the distinguishing is even
in the slightest imperfect, then some sort of NMI-safe SRCU reader
approach is of course required.

							Thanx, Paul

> >> For next merge window (not this one), let's do that then?  Paul, if you
> >> could provide me an SRCU API that uses local_inc, then I believe that,
> >> coupled with this patch, should be all that's needed:
> >> https://lore.kernel.org/patchwork/patch/972657/
> >>
> >> Steve did express concern though about whether in_nmi() works reliably
> >> (i.e. the tracepoint doesn't fire from "thunk" code before in_nmi() is
> >> available).  Any thoughts on that, Steve?
> >
> > Agreed, not the upcoming merge window.  But we do need to work out
> > exactly what is the right way to do this.
>
> Agreed, thanks!
>
> - Joel
>
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 8, 2018 at 1:18 PM, Paul E. McKenney wrote:
[...]
>> >> >> It does start to seem like a show stopper :-(
>> >> >
>> >> > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
>> >> > be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not sure
>> >> > whether this would be fast enough to be useful, but easy to provide:
>> >> >
>> >> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
>> >> > {
>> >> >         int idx;
>> >> >
>> >> >         idx = READ_ONCE(sp->srcu_idx) & 0x1;
>> >> >         atomic_inc(&sp->sda->srcu_lock_count[idx]);
>> >> >         smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
>> >> >         return idx;
>> >> > }
>> >> >
>> >> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
>> >> > {
>> >> >         smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
>> >> >         atomic_inc(&sp->sda->srcu_unlock_count[idx]);
>> >> > }
>> >> >
>> >> > With appropriate adjustments to allow Tiny RCU to also work.
>> >> >
>> >> > Note that you have to use _nmi() everywhere, not just in NMI handlers.
>> >> > In fact, the NMI handlers are the one place you -don't- need to use
>> >> > _nmi(), strangely enough.
>> >> >
>> >> > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
>> >> > some architectures, for example.
>> >>
>> >> Continuing Steve's question on regular interrupts, do we need to use
>> >> this atomic_inc API for regular interrupts as well?  So I guess
>> >
>> > If NMIs use one srcu_struct and non-NMI uses another, the current
>> > srcu_read_lock() and srcu_read_unlock() will work just fine.  If any given
>> > srcu_struct needs both NMI and non-NMI readers, then we really do need
>> > __srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that srcu_struct.
>>
>> Yes, I believe as long as in_nmi() works reliably, we can use the
>> right srcu_struct (NMI vs non-NMI) and it would be fine.
>>
>> Going through this thread, it sounds though that this_cpu_inc may not
>> be reliable on all architectures even for non-NMI interrupts, and
>> local_inc may be the way to go.
>
> My understanding is that this_cpu_inc() is defined to handle interrupts,
> so any architecture on which it is unreliable needs to fix its bug.  ;-)

Yes, that's my understanding as well.

Then maybe I'm missing something about your/Steve's conversation in
the morning about why we need to bother with the local_inc then.  So the
current SRCU code with the separate NMI handling should work fine (for
future merge windows) as long as we're using a separate srcu_struct
for NMI. :-)

>
>> For next merge window (not this one), let's do that then?  Paul, if you
>> could provide me an SRCU API that uses local_inc, then I believe that,
>> coupled with this patch, should be all that's needed:
>> https://lore.kernel.org/patchwork/patch/972657/
>>
>> Steve did express concern though about whether in_nmi() works reliably
>> (i.e. the tracepoint doesn't fire from "thunk" code before in_nmi() is
>> available).  Any thoughts on that, Steve?
>
> Agreed, not the upcoming merge window.  But we do need to work out
> exactly what is the right way to do this.

Agreed, thanks!

- Joel
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 08, 2018 at 12:24:20PM -0700, Joel Fernandes wrote:
> On Wed, Aug 8, 2018 at 7:49 AM, Paul E. McKenney wrote:
> [...]
> >
> >> In that case, based on what you're saying, the patch I sent using a
> >> different srcu_struct for NMI is still good, I guess...
> >
> > As long as you wait for both SRCU grace periods.  Hmmm...  Maybe that means
> > that there is still a use for synchronize_rcu_mult():
> >
> >         void call_srcu_nmi(struct rcu_head *rhp, rcu_callback_t func)
> >         {
> >                 call_srcu(&my_srcu_struct_nmi, rhp, func);
> >         }
> >
> >         void call_srcu_nonmi(struct rcu_head *rhp, rcu_callback_t func)
> >         {
> >                 call_srcu(&my_srcu_struct_nonmi, rhp, func);
> >         }
> >
> >         ...
> >
> >         /* Wait concurrently on the two grace periods. */
> >         synchronize_rcu_mult(call_srcu_nmi, call_srcu_nonmi);
> >
> > On the other hand, I bet that doing this is just fine in your use case:
> >
> >         synchronize_srcu(&my_srcu_struct_nmi);
> >         synchronize_srcu(&my_srcu_struct_nonmi);
> >
> > But please note that synchronize_rcu_mult() is no more in my -rcu tree,
> > so if you do want it please let me know (and please let me know why it
> > is important).
>
> I did the chaining thing (one callback calling another); that should
> work too, right?  I believe that is needed so that the tracepoint
> callbacks are freed at one point and only when both NMI and non-NMI
> read sections have completed.

Yes, that works also.  It is possible to make that happen concurrently
via atomic_dec_and_test() or similar, but if the latency is not a
problem, why bother?

> >> >> It does start to seem like a show stopper :-(
> >> >
> >> > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
> >> > be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not sure
> >> > whether this would be fast enough to be useful, but easy to provide:
> >> >
> >> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> >> > {
> >> >         int idx;
> >> >
> >> >         idx = READ_ONCE(sp->srcu_idx) & 0x1;
> >> >         atomic_inc(&sp->sda->srcu_lock_count[idx]);
> >> >         smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
> >> >         return idx;
> >> > }
> >> >
> >> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> >> > {
> >> >         smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
> >> >         atomic_inc(&sp->sda->srcu_unlock_count[idx]);
> >> > }
> >> >
> >> > With appropriate adjustments to allow Tiny RCU to also work.
> >> >
> >> > Note that you have to use _nmi() everywhere, not just in NMI handlers.
> >> > In fact, the NMI handlers are the one place you -don't- need to use
> >> > _nmi(), strangely enough.
> >> >
> >> > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
> >> > some architectures, for example.
> >>
> >> Continuing Steve's question on regular interrupts, do we need to use
> >> this atomic_inc API for regular interrupts as well?  So I guess
> >
> > If NMIs use one srcu_struct and non-NMI uses another, the current
> > srcu_read_lock() and srcu_read_unlock() will work just fine.  If any given
> > srcu_struct needs both NMI and non-NMI readers, then we really do need
> > __srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that srcu_struct.
>
> Yes, I believe as long as in_nmi() works reliably, we can use the
> right srcu_struct (NMI vs non-NMI) and it would be fine.
>
> Going through this thread, it sounds though that this_cpu_inc may not
> be reliable on all architectures even for non-NMI interrupts, and
> local_inc may be the way to go.

My understanding is that this_cpu_inc() is defined to handle interrupts,
so any architecture on which it is unreliable needs to fix its bug.  ;-)

> For next merge window (not this one), let's do that then?  Paul, if you
> could provide me an SRCU API that uses local_inc, then I believe that,
> coupled with this patch, should be all that's needed:
> https://lore.kernel.org/patchwork/patch/972657/
>
> Steve did express concern though about whether in_nmi() works reliably
> (i.e. the tracepoint doesn't fire from "thunk" code before in_nmi() is
> available).  Any thoughts on that, Steve?

Agreed, not the upcoming merge window.  But we do need to work out
exactly what is the right way to do this.

							Thanx, Paul
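The callback chaining Joel describes (free only after both grace periods, by having the first callback re-enqueue onto the second domain) can be sketched as a userspace model. Everything here is a hypothetical illustration, not kernel API: the two "domains" are plain callback queues, and `run_grace_period()` stands in for a grace period elapsing and its callbacks being invoked.

```c
#include <assert.h>
#include <stddef.h>

/* Model of chained RCU-style callbacks across two domains: the payload is
 * freed only after BOTH "grace periods" have run, because the first-stage
 * callback re-enqueues the same head on the second domain's queue. */

struct toy_head {
        void (*func)(struct toy_head *rhp);
        struct toy_head *next;
};

static struct toy_head *nmi_queue, *nonmi_queue;
static int payload_freed;

static void enqueue(struct toy_head **q, struct toy_head *rhp,
                    void (*func)(struct toy_head *))
{
        rhp->func = func;
        rhp->next = *q;
        *q = rhp;
}

/* A grace period elapsing: pop and invoke every queued callback. */
static void run_grace_period(struct toy_head **q)
{
        while (*q) {
                struct toy_head *rhp = *q;
                *q = rhp->next;        /* unlink before invoking */
                rhp->func(rhp);
        }
}

static void free_payload(struct toy_head *rhp)
{
        (void)rhp;
        payload_freed = 1;             /* both domains are now quiescent */
}

/* First-stage callback: chain onto the second (non-NMI) domain. */
static void chain_to_nonmi(struct toy_head *rhp)
{
        enqueue(&nonmi_queue, rhp, free_payload);
}

int demo(void)
{
        static struct toy_head rh;

        enqueue(&nmi_queue, &rh, chain_to_nonmi);
        run_grace_period(&nmi_queue);     /* NMI-domain GP: chains, no free */
        int freed_after_first = payload_freed;
        run_grace_period(&nonmi_queue);   /* non-NMI-domain GP: now freed */
        return !freed_after_first && payload_freed;
}
```

This is the sequential variant Paul notes is fine when latency does not matter; the concurrent variant he mentions would instead take a count of two and free when `atomic_dec_and_test()` fires in whichever domain finishes last.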
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 8, 2018 at 7:49 AM, Paul E. McKenney wrote:
[...]
>
>> In that case, based on what you're saying, the patch I sent using a
>> different srcu_struct for NMI is still good, I guess...
>
> As long as you wait for both SRCU grace periods.  Hmmm...  Maybe that means
> that there is still a use for synchronize_rcu_mult():
>
>         void call_srcu_nmi(struct rcu_head *rhp, rcu_callback_t func)
>         {
>                 call_srcu(&my_srcu_struct_nmi, rhp, func);
>         }
>
>         void call_srcu_nonmi(struct rcu_head *rhp, rcu_callback_t func)
>         {
>                 call_srcu(&my_srcu_struct_nonmi, rhp, func);
>         }
>
>         ...
>
>         /* Wait concurrently on the two grace periods. */
>         synchronize_rcu_mult(call_srcu_nmi, call_srcu_nonmi);
>
> On the other hand, I bet that doing this is just fine in your use case:
>
>         synchronize_srcu(&my_srcu_struct_nmi);
>         synchronize_srcu(&my_srcu_struct_nonmi);
>
> But please note that synchronize_rcu_mult() is no more in my -rcu tree,
> so if you do want it please let me know (and please let me know why it
> is important).

I did the chaining thing (one callback calling another); that should
work too, right?  I believe that is needed so that the tracepoint
callbacks are freed at one point and only when both NMI and non-NMI
read sections have completed.

>> >> It does start to seem like a show stopper :-(
>> >
>> > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
>> > be added, which would do atomic ops on sp->sda->srcu_lock_count.  Not sure
>> > whether this would be fast enough to be useful, but easy to provide:
>> >
>> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
>> > {
>> >         int idx;
>> >
>> >         idx = READ_ONCE(sp->srcu_idx) & 0x1;
>> >         atomic_inc(&sp->sda->srcu_lock_count[idx]);
>> >         smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
>> >         return idx;
>> > }
>> >
>> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
>> > {
>> >         smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
>> >         atomic_inc(&sp->sda->srcu_unlock_count[idx]);
>> > }
>> >
>> > With appropriate adjustments to allow Tiny RCU to also work.
>> >
>> > Note that you have to use _nmi() everywhere, not just in NMI handlers.
>> > In fact, the NMI handlers are the one place you -don't- need to use
>> > _nmi(), strangely enough.
>> >
>> > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
>> > some architectures, for example.
>>
>> Continuing Steve's question on regular interrupts, do we need to use
>> this atomic_inc API for regular interrupts as well?  So I guess
>
> If NMIs use one srcu_struct and non-NMI uses another, the current
> srcu_read_lock() and srcu_read_unlock() will work just fine.  If any given
> srcu_struct needs both NMI and non-NMI readers, then we really do need
> __srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that srcu_struct.

Yes, I believe as long as in_nmi() works reliably, we can use the
right srcu_struct (NMI vs non-NMI) and it would be fine.

Going through this thread, it sounds though that this_cpu_inc may not
be reliable on all architectures even for non-NMI interrupts, and
local_inc may be the way to go.

For next merge window (not this one), let's do that then?  Paul, if you
could provide me an SRCU API that uses local_inc, then I believe that,
coupled with this patch, should be all that's needed:
https://lore.kernel.org/patchwork/patch/972657/

Steve did express concern though about whether in_nmi() works reliably
(i.e. the tracepoint doesn't fire from "thunk" code before in_nmi() is
available).  Any thoughts on that, Steve?

thanks!

- Joel
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 08, 2018 at 12:24:04PM -0400, Steven Rostedt wrote:
> On Wed, 8 Aug 2018 09:02:43 -0700
> "Paul E. McKenney" wrote:
>
> > > Which leaves us with sparc, arm, mips, sh and powerpc.
> > >
> > > sh is almost dead, and powerpc can be fixed, which I guess leaves us
> > > with sparc, arm and mips.
> >
> > If we want to stick with the current srcu_read_lock() and srcu_read_unlock(),
> > you mean? I would like that sort of outcome, at least assuming we are not
> > hammering any of the architectures.
>
> I would go with the local_inc approach, and even add a
> srcu_read_un/lock_nmi() that does that if you want. Probably should add
> lockdep to detect if the _nmi calls are ever used along with non-_nmi
> calls and complain about that.

Would it be reasonable to also add a check for non-_nmi calls being used
in both NMI and non-NMI contexts?

> But this will be something for the next merge window, not the one
> coming up.

Completely agreed!

							Thanx, Paul
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, 8 Aug 2018 09:02:43 -0700
"Paul E. McKenney" wrote:

> > Which leaves us with sparc, arm, mips, sh and powerpc.
> >
> > sh is almost dead, and powerpc can be fixed, which I guess leaves us
> > with sparc, arm and mips.
>
> If we want to stick with the current srcu_read_lock() and srcu_read_unlock(),
> you mean? I would like that sort of outcome, at least assuming we are not
> hammering any of the architectures.

I would go with the local_inc approach, and even add a
srcu_read_un/lock_nmi() that does that if you want. Probably should add
lockdep to detect if the _nmi calls are ever used along with non-_nmi
calls and complain about that.

But this will be something for the next merge window, not the one
coming up.

-- Steve
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 08, 2018 at 11:27:05AM -0400, Steven Rostedt wrote:
> On Wed, 8 Aug 2018 07:42:00 -0700
> "Paul E. McKenney" wrote:
>
> > > There's also a local_inc() if you are using per cpu pointers, that is
> > > supposed to guarantee atomicity for single cpu operations. That's what
> > > the ftrace ring buffer uses.
> >
> > Good point, that becomes atomic_long_inc() or equivalent on most
> > architectures, but an incl instruction (not locked) on x86. So updating
> > my earlier still-untested thought:
> >
> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> > {
> > 	int idx;
> >
> > 	idx = READ_ONCE(sp->srcu_idx) & 0x1;
> > 	local_inc(&sp->sda->srcu_lock_count[idx]);
> > 	smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
> > 	return idx;
> > }
> >
> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> > {
> > 	smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
> > 	local_inc(&sp->sda->srcu_unlock_count[idx]);
> > }
> >
> > Would that work, or is there a better way to handle this?
>
> This would work much better than using the atomic ops, and I think it
> would be doable.

OK, here is hoping!

> I may need to revert the srcu for trace_*_rcuidle() for now, as I want
> most of the other changes in this merge window, and it's getting too
> late to do it with these updates.

Agreed, especially since I normally freeze for the next merge window
shortly after -rc5. ;-)

							Thanx, Paul
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 08, 2018 at 11:23:09AM -0400, Steven Rostedt wrote:
> On Wed, 8 Aug 2018 08:05:58 -0700
> "Paul E. McKenney" wrote:
>
> > On Wed, Aug 08, 2018 at 10:49:10AM -0400, Steven Rostedt wrote:
> > > On Wed, 8 Aug 2018 07:33:10 -0700
> > > "Paul E. McKenney" wrote:
> > >
> > > > On Wed, Aug 08, 2018 at 09:07:24AM -0400, Steven Rostedt wrote:
> > > > > On Wed, 8 Aug 2018 06:03:02 -0700
> > > > > "Paul E. McKenney" wrote:
> > > > >
> > > > > > > What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> > > > > > > it won't be atomic for the capture of the idx. But I also don't see
> > > > > > > interrupts being disabled, thus an NMI is no different than any
> > > > > > > interrupt doing the same thing, right?
> > > > > >
> > > > > > On architectures without increment-memory instructions, if you take an NMI
> > > > > > between the load from sp->sda->srcu_lock_count and the later store, you
> > > > > > lose a count. Note that both __srcu_read_lock() and __srcu_read_unlock()
> > > > > > do increments of different locations, so you cannot rely on the usual
> > > > > > "NMI fixes up before exit" semantics you get when incrementing and
> > > > > > decrementing the same location.
> > > > >
> > > > > And how is this handled in the interrupt case? Interrupts are not
> > > > > disabled here.
> > > >
> > > > Actually, on most architectures interrupts are in fact disabled:
> > > >
> > > > #define this_cpu_generic_to_op(pcp, val, op)			\
> > > > do {								\
> > > > 	unsigned long __flags;					\
> > > > 	raw_local_irq_save(__flags);				\
> > > > 	raw_cpu_generic_to_op(pcp, val, op);			\
> > > > 	raw_local_irq_restore(__flags);				\
> > > > } while (0)
> > > >
> > > > NMIs, not so much.
> > >
> > > And do these archs have NMIs?
> >
> > It would appear so:
>
> Well the next question is, which of these archs that use it are in this
> list.
>
> > $ find . -name 'Kconfig*' -exec grep -l 'select HAVE_NMI\>' {} \;
> > ./arch/sparc/Kconfig
> > ./arch/s390/Kconfig
> > ./arch/arm/Kconfig
> > ./arch/arm64/Kconfig
> > ./arch/mips/Kconfig
> > ./arch/sh/Kconfig
> > ./arch/powerpc/Kconfig
>
> Note, I know that powerpc "imitates" an NMI. It just sets the NMI as a
> priority higher than other interrupts.

Plus as you say below, its local_inc() is atomic, and thus NMI-safe, and
thus the _nmi() approach would work.

> > ./arch/x86/Kconfig
>
> And we get this:
>
> $ git grep this_cpu_add_4
> arch/arm64/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) _percpu_add(pcp, val)
> arch/s390/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) arch_this_cpu_to_op_simple(pcp, val, +)
> arch/s390/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) arch_this_cpu_add(pcp, val, "laa", "asi", int)
> arch/x86/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) percpu_add_op((pcp), val)
>
> Which leaves us with sparc, arm, mips, sh and powerpc.
>
> sh is almost dead, and powerpc can be fixed, which I guess leaves us
> with sparc, arm and mips.

If we want to stick with the current srcu_read_lock() and srcu_read_unlock(),
you mean? I would like that sort of outcome, at least assuming we are not
hammering any of the architectures.

							Thanx, Paul
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, 8 Aug 2018 07:42:00 -0700
"Paul E. McKenney" wrote:

> > There's also a local_inc() if you are using per cpu pointers, that is
> > supposed to guarantee atomicity for single cpu operations. That's what
> > the ftrace ring buffer uses.
>
> Good point, that becomes atomic_long_inc() or equivalent on most
> architectures, but an incl instruction (not locked) on x86. So updating
> my earlier still-untested thought:
>
> int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> {
> 	int idx;
>
> 	idx = READ_ONCE(sp->srcu_idx) & 0x1;
> 	local_inc(&sp->sda->srcu_lock_count[idx]);
> 	smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
> 	return idx;
> }
>
> void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> {
> 	smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
> 	local_inc(&sp->sda->srcu_unlock_count[idx]);
> }
>
> Would that work, or is there a better way to handle this?

This would work much better than using the atomic ops, and I think it
would be doable.

I may need to revert the srcu for trace_*_rcuidle() for now, as I want
most of the other changes in this merge window, and it's getting too
late to do it with these updates.

-- Steve
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, 8 Aug 2018 08:05:58 -0700
"Paul E. McKenney" wrote:

> On Wed, Aug 08, 2018 at 10:49:10AM -0400, Steven Rostedt wrote:
> > On Wed, 8 Aug 2018 07:33:10 -0700
> > "Paul E. McKenney" wrote:
> >
> > > On Wed, Aug 08, 2018 at 09:07:24AM -0400, Steven Rostedt wrote:
> > > > On Wed, 8 Aug 2018 06:03:02 -0700
> > > > "Paul E. McKenney" wrote:
> > > >
> > > > > > What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> > > > > > it won't be atomic for the capture of the idx. But I also don't see
> > > > > > interrupts being disabled, thus an NMI is no different than any
> > > > > > interrupt doing the same thing, right?
> > > > >
> > > > > On architectures without increment-memory instructions, if you take an NMI
> > > > > between the load from sp->sda->srcu_lock_count and the later store, you
> > > > > lose a count. Note that both __srcu_read_lock() and __srcu_read_unlock()
> > > > > do increments of different locations, so you cannot rely on the usual
> > > > > "NMI fixes up before exit" semantics you get when incrementing and
> > > > > decrementing the same location.
> > > >
> > > > And how is this handled in the interrupt case? Interrupts are not
> > > > disabled here.
> > >
> > > Actually, on most architectures interrupts are in fact disabled:
> > >
> > > #define this_cpu_generic_to_op(pcp, val, op)			\
> > > do {								\
> > > 	unsigned long __flags;					\
> > > 	raw_local_irq_save(__flags);				\
> > > 	raw_cpu_generic_to_op(pcp, val, op);			\
> > > 	raw_local_irq_restore(__flags);				\
> > > } while (0)
> > >
> > > NMIs, not so much.
> >
> > And do these archs have NMIs?
>
> It would appear so:

Well the next question is, which of these archs that use it are in this
list.

> $ find . -name 'Kconfig*' -exec grep -l 'select HAVE_NMI\>' {} \;
> ./arch/sparc/Kconfig
> ./arch/s390/Kconfig
> ./arch/arm/Kconfig
> ./arch/arm64/Kconfig
> ./arch/mips/Kconfig
> ./arch/sh/Kconfig
> ./arch/powerpc/Kconfig

Note, I know that powerpc "imitates" an NMI. It just sets the NMI as a
priority higher than other interrupts.

> ./arch/x86/Kconfig

And we get this:

$ git grep this_cpu_add_4
arch/arm64/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) _percpu_add(pcp, val)
arch/s390/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) arch_this_cpu_to_op_simple(pcp, val, +)
arch/s390/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) arch_this_cpu_add(pcp, val, "laa", "asi", int)
arch/x86/include/asm/percpu.h:#define this_cpu_add_4(pcp, val) percpu_add_op((pcp), val)

Which leaves us with sparc, arm, mips, sh and powerpc.

sh is almost dead, and powerpc can be fixed, which I guess leaves us
with sparc, arm and mips.

-- Steve

>							Thanx, Paul
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 08, 2018 at 10:49:10AM -0400, Steven Rostedt wrote:
> On Wed, 8 Aug 2018 07:33:10 -0700
> "Paul E. McKenney" wrote:
>
> > On Wed, Aug 08, 2018 at 09:07:24AM -0400, Steven Rostedt wrote:
> > > On Wed, 8 Aug 2018 06:03:02 -0700
> > > "Paul E. McKenney" wrote:
> > >
> > > > > What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> > > > > it won't be atomic for the capture of the idx. But I also don't see
> > > > > interrupts being disabled, thus an NMI is no different than any
> > > > > interrupt doing the same thing, right?
> > > >
> > > > On architectures without increment-memory instructions, if you take an NMI
> > > > between the load from sp->sda->srcu_lock_count and the later store, you
> > > > lose a count. Note that both __srcu_read_lock() and __srcu_read_unlock()
> > > > do increments of different locations, so you cannot rely on the usual
> > > > "NMI fixes up before exit" semantics you get when incrementing and
> > > > decrementing the same location.
> > >
> > > And how is this handled in the interrupt case? Interrupts are not
> > > disabled here.
> >
> > Actually, on most architectures interrupts are in fact disabled:
> >
> > #define this_cpu_generic_to_op(pcp, val, op)			\
> > do {								\
> > 	unsigned long __flags;					\
> > 	raw_local_irq_save(__flags);				\
> > 	raw_cpu_generic_to_op(pcp, val, op);			\
> > 	raw_local_irq_restore(__flags);				\
> > } while (0)
> >
> > NMIs, not so much.
>
> And do these archs have NMIs?

It would appear so:

$ find . -name 'Kconfig*' -exec grep -l 'select HAVE_NMI\>' {} \;
./arch/sparc/Kconfig
./arch/s390/Kconfig
./arch/arm/Kconfig
./arch/arm64/Kconfig
./arch/mips/Kconfig
./arch/sh/Kconfig
./arch/powerpc/Kconfig
./arch/x86/Kconfig

							Thanx, Paul
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 08, 2018 at 07:10:53AM -0700, Joel Fernandes wrote:
> On Wed, Aug 8, 2018 at 6:00 AM, Paul E. McKenney wrote:
> > On Tue, Aug 07, 2018 at 08:53:54PM -0700, Joel Fernandes wrote:
> >> On Tue, Aug 7, 2018 at 8:44 PM, Joel Fernandes wrote:
> >> > Hi Steve,
> >> >
> >> > On Tue, Aug 7, 2018 at 7:28 PM, Steven Rostedt wrote:
> >> [...]
> >> >>> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
> >> >>> 		} while ((++it_func_ptr)->func);	\
> >> >>> 	}						\
> >> >>> 							\
> >> >>> -	if (rcuidle)					\
> >> >>> -		srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
> >> >>> +	srcu_read_unlock_notrace(ss, idx);		\
> >> >>
> >> >> Hmm, why do we have the two different srcu handles?
> >> >
> >> > Because if the memory operations happening on the normal SRCU handle
> >> > (during srcu_read_lock) is interrupted by NMI, then the other handle
> >> > (devoted to NMI) could be used instead and not bother the interrupted
> >> > handle. Does that make sense?
> >> >
> >> > When I talked to Paul a few months ago about SRCU from NMI context, he
> >> > mentioned the per-cpu memory operations during srcu_read_lock can be
> >> > NMI interrupted, that's why we added that warning.
> >>
> >> So I looked more closely: __srcu_read_lock on 2 different handles may
> >> still be doing a this_cpu_inc on the same location...
> >> (sp->sda->srcu_lock_count). :-(
> >>
> >> Paul, any ideas on how to solve this?
> >
> > You lost me on this one. When you said "2 different handles", I assumed
> > that you meant two different values of "sp", which would have two
> > different addresses for &sp->sda->srcu_lock_count. What am I missing?
>
> Thanks a lot for the reply.
> I thought "sda" is the same for different srcu_struct(s). Maybe it
> was too late for me in the night, that's why I thought so? Which makes
> no sense now that I think of it.

I know that feeling! ;-)

> In that case based on what you're saying, the patch I sent to use a
> different srcu_struct for NMI is still good I guess...
As long as you wait for both SRCU grace periods. Hmmm... Maybe that means that there is still a use for synchronize_rcu_mult(): void call_srcu_nmi(struct rcu_head *rhp, rcu_callback_t func) { call_srcu(_srcu_struct_nmi, rhp, func); } void call_srcu_nonmi(struct rcu_head *rhp, rcu_callback_t func) { call_srcu(_srcu_struct_nonmi, rhp, func); } ... /* Wait concurrently on the two grace periods. */ synchronize_rcu_mult(call_srcu_nmi, call_srcu_nonmi); On the other hand, I bet that doing this is just fine in your use case: synchronize_srcu(_srcu_struct_nmi); synchronize_srcu(_srcu_struct_nonmi); But please note that synchronize_rcu_mult() is no more in my -rcu tree, so if you do want it please let me know (and please let me know why it is important). > >> It does start to seem like a show stopper :-( > > > > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could > > be added, which would do atomic ops on sp->sda->srcu_lock_count. Not sure > > whether this would be fast enough to be useful, but easy to provide: > > > > int __srcu_read_lock_nmi(struct srcu_struct *sp) /* UNTESTED. */ > > { > > int idx; > > > > idx = READ_ONCE(sp->srcu_idx) & 0x1; > > atomic_inc(>sda->srcu_lock_count[idx]); > > smp_mb__after_atomic(); /* B */ /* Avoid leaking critical section. > > */ > > return idx; > > } > > > > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx) > > { > > smp_mb__before_atomic(); /* C */ /* Avoid leaking critical > > section. */ > > atomic_inc(>sda->srcu_unlock_count[idx]); > > } > > > > With appropriate adjustments to also allow Tiny RCU to also work. > > > > Note that you have to use _nmi() everywhere, not just in NMI handlers. > > In fact, the NMI handlers are the one place you -don't- need to use > > _nmi(), strangely enough. > > > > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on > > some architectures, for example. 
> > Continuing Steve's question on regular interrupts, do we need to use > this atomic_inc API for regular interrupts as well? If NMIs use one srcu_struct and non-NMI uses another, the current srcu_read_lock() and srcu_read_unlock() will work just fine. If any given srcu_struct needs both NMI and non-NMI readers, then we really do need __srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that srcu_struct. Thanx, Paul
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 08, 2018 at 07:10:53AM -0700, Joel Fernandes wrote: > On Wed, Aug 8, 2018 at 6:00 AM, Paul E. McKenney > wrote: > > On Tue, Aug 07, 2018 at 08:53:54PM -0700, Joel Fernandes wrote: > >> On Tue, Aug 7, 2018 at 8:44 PM, Joel Fernandes wrote: > >> > Hi Steve, > >> > > >> > On Tue, Aug 7, 2018 at 7:28 PM, Steven Rostedt > >> > wrote: > >> [...] > >> >>> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void); > >> >>> } while ((++it_func_ptr)->func);\ > >> >>> } \ > >> >>> \ > >> >>> - if (rcuidle)\ > >> >>> - srcu_read_unlock_notrace(_srcu, idx);\ > >> >>> + srcu_read_unlock_notrace(ss, idx); \ > >> >> > >> >> Hmm, why do we have the two different srcu handles? > >> > > >> > Because if the memory operations happening on the normal SRCU handle > >> > (during srcu_read_lock) is interrupted by NMI, then the other handle > >> > (devoted to NMI) could be used instead and not bother the interrupted > >> > handle. Does that makes sense? > >> > > >> > When I talked to Paul few months ago about SRCU from NMI context, he > >> > mentioned the per-cpu memory operations during srcu_read_lock can be > >> > NMI interrupted, that's why we added that warning. > >> > >> So I looked more closely, __srcu_read_lock on 2 different handles may > >> still be doing a this_cpu_inc on the same location.. > >> (sp->sda->srcu_lock_count). :-( > >> > >> Paul any ideas on how to solve this? > > > > You lost me on this one. When you said "2 different handles", I assumed > > that you meant two different values of "sp", which would have two > > different addresses for >sda->srcu_lock_count. What am I missing? > > Thanks a lot for the reply. > I thought "sda" is the same for different srcu_struct(s). May be it > was too late for me in the night, that's why I thought so? Which makes > no sense now that I think of it. I know that feeling! ;-) > In that case based on what you're saying, the patch I sent to using > different srcu_struct for NMI is still good I guess... 
As long as you wait for both SRCU grace periods. Hmmm... Maybe that means that there is still a use for synchronize_rcu_mult(): void call_srcu_nmi(struct rcu_head *rhp, rcu_callback_t func) { call_srcu(_srcu_struct_nmi, rhp, func); } void call_srcu_nonmi(struct rcu_head *rhp, rcu_callback_t func) { call_srcu(_srcu_struct_nonmi, rhp, func); } ... /* Wait concurrently on the two grace periods. */ synchronize_rcu_mult(call_srcu_nmi, call_srcu_nonmi); On the other hand, I bet that doing this is just fine in your use case: synchronize_srcu(_srcu_struct_nmi); synchronize_srcu(_srcu_struct_nonmi); But please note that synchronize_rcu_mult() is no more in my -rcu tree, so if you do want it please let me know (and please let me know why it is important). > >> It does start to seem like a show stopper :-( > > > > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could > > be added, which would do atomic ops on sp->sda->srcu_lock_count. Not sure > > whether this would be fast enough to be useful, but easy to provide: > > > > int __srcu_read_lock_nmi(struct srcu_struct *sp) /* UNTESTED. */ > > { > > int idx; > > > > idx = READ_ONCE(sp->srcu_idx) & 0x1; > > atomic_inc(>sda->srcu_lock_count[idx]); > > smp_mb__after_atomic(); /* B */ /* Avoid leaking critical section. > > */ > > return idx; > > } > > > > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx) > > { > > smp_mb__before_atomic(); /* C */ /* Avoid leaking critical > > section. */ > > atomic_inc(>sda->srcu_unlock_count[idx]); > > } > > > > With appropriate adjustments to also allow Tiny RCU to also work. > > > > Note that you have to use _nmi() everywhere, not just in NMI handlers. > > In fact, the NMI handlers are the one place you -don't- need to use > > _nmi(), strangely enough. > > > > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on > > some architectures, for example. 
> > Continuing Steve's question on regular interrupts, do we need to use > this atomic_inc API for regular interrupts as well? If NMIs use one srcu_struct and non-NMI uses another, the current srcu_read_lock() and srcu_read_unlock() will work just fine. If any given srcu_struct needs both NMI and non-NMI readers, then we really do need __srcu_read_lock_nmi() and __srcu_read_unlock_nmi() for that srcu_struct. Thanx, Paul
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, 8 Aug 2018 07:33:10 -0700
"Paul E. McKenney" wrote:

> On Wed, Aug 08, 2018 at 09:07:24AM -0400, Steven Rostedt wrote:
> > On Wed, 8 Aug 2018 06:03:02 -0700
> > "Paul E. McKenney" wrote:
> >
> > > > What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> > > > it wont be atomic for the capture of the idx. But I also don't see
> > > > interrupts being disabled, thus an NMI is no different than any
> > > > interrupt doing the same thing, right?
> > >
> > > On architectures without increment-memory instructions, if you take an NMI
> > > between the load from sp->sda->srcu_lock_count and the later store, you
> > > lose a count. Note that both __srcu_read_lock() and __srcu_read_unlock()
> > > do increments of different locations, so you cannot rely on the usual
> > > "NMI fixes up before exit" semantics you get when incrementing and
> > > decrementing the same location.
> >
> > And how is this handled in the interrupt case? Interrupts are not
> > disabled here.
>
> Actually, on most architectures interrupts are in fact disabled:
>
> #define this_cpu_generic_to_op(pcp, val, op)			\
> do {								\
> 	unsigned long __flags;					\
> 	raw_local_irq_save(__flags);				\
> 	raw_cpu_generic_to_op(pcp, val, op);			\
> 	raw_local_irq_restore(__flags);				\
> } while (0)
>
> NMIs, not so much.

And do these archs have NMIs?

-- Steve

> > I would also argue that architectures without increment-memory
> > instructions shouldn't have NMIs ;-)
>
> I would also argue a lot of things, but objective reality does not take my
> opinions into account all that often. Which might be a good thing. ;-)
>
> 							Thanx, Paul
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 08, 2018 at 10:27:00AM -0400, Steven Rostedt wrote:
> On Wed, 8 Aug 2018 06:00:41 -0700
> "Paul E. McKenney" wrote:
>
> > I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
> > be added, which would do atomic ops on sp->sda->srcu_lock_count. Not sure
> > whether this would be fast enough to be useful, but easy to provide:
> >
> > int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> > {
> > 	int idx;
> >
> > 	idx = READ_ONCE(sp->srcu_idx) & 0x1;
> > 	atomic_inc(&sp->sda->srcu_lock_count[idx]);
> > 	smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
> > 	return idx;
> > }
> >
> > void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> > {
> > 	smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
> > 	atomic_inc(&sp->sda->srcu_unlock_count[idx]);
> > }
> >
> > With appropriate adjustments to also allow Tiny RCU to also work.
> >
> > Note that you have to use _nmi() everywhere, not just in NMI handlers.
> > In fact, the NMI handlers are the one place you -don't- need to use
> > _nmi(), strangely enough.
> >
> > Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
> > some architectures, for example.
>
> Note this would kill the performance that srcu gives that Joel wants.
> Switching from a this_cpu_inc() to a atomic_inc() would be a huge
> impact.

I don't know how huge it would be, but that concern is exactly why I am
proposing adding _nmi() interfaces rather than just directly changing the
stock __srcu_read_lock() and __srcu_read_unlock() functions.

> There's also a local_inc() if you are using per cpu pointers, that is
> suppose to guarantee atomicity for single cpu operations. That's what
> the ftrace ring buffer uses.

Good point, that becomes atomic_long_inc() or equivalent on most
architectures, but an incl instruction (not locked) on x86.  So updating
my earlier still-untested thought:

int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
{
	int idx;

	idx = READ_ONCE(sp->srcu_idx) & 0x1;
	local_inc(&sp->sda->srcu_lock_count[idx]);
	smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
	return idx;
}

void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
{
	smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
	local_inc(&sp->sda->srcu_unlock_count[idx]);
}

Would that work, or is there a better way to handle this?

							Thanx, Paul
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 08, 2018 at 09:07:24AM -0400, Steven Rostedt wrote:
> On Wed, 8 Aug 2018 06:03:02 -0700
> "Paul E. McKenney" wrote:
>
> > > What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> > > it wont be atomic for the capture of the idx. But I also don't see
> > > interrupts being disabled, thus an NMI is no different than any
> > > interrupt doing the same thing, right?
> >
> > On architectures without increment-memory instructions, if you take an NMI
> > between the load from sp->sda->srcu_lock_count and the later store, you
> > lose a count. Note that both __srcu_read_lock() and __srcu_read_unlock()
> > do increments of different locations, so you cannot rely on the usual
> > "NMI fixes up before exit" semantics you get when incrementing and
> > decrementing the same location.
>
> And how is this handled in the interrupt case? Interrupts are not
> disabled here.

Actually, on most architectures interrupts are in fact disabled:

#define this_cpu_generic_to_op(pcp, val, op)			\
do {								\
	unsigned long __flags;					\
	raw_local_irq_save(__flags);				\
	raw_cpu_generic_to_op(pcp, val, op);			\
	raw_local_irq_restore(__flags);				\
} while (0)

NMIs, not so much.

> I would also argue that architectures without increment-memory
> instructions shouldn't have NMIs ;-)

I would also argue a lot of things, but objective reality does not take my
opinions into account all that often. Which might be a good thing. ;-)

							Thanx, Paul
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, 8 Aug 2018 06:00:41 -0700
"Paul E. McKenney" wrote:

> I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
> be added, which would do atomic ops on sp->sda->srcu_lock_count. Not sure
> whether this would be fast enough to be useful, but easy to provide:
>
> int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> {
> 	int idx;
>
> 	idx = READ_ONCE(sp->srcu_idx) & 0x1;
> 	atomic_inc(&sp->sda->srcu_lock_count[idx]);
> 	smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
> 	return idx;
> }
>
> void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> {
> 	smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
> 	atomic_inc(&sp->sda->srcu_unlock_count[idx]);
> }
>
> With appropriate adjustments to also allow Tiny RCU to also work.
>
> Note that you have to use _nmi() everywhere, not just in NMI handlers.
> In fact, the NMI handlers are the one place you -don't- need to use
> _nmi(), strangely enough.
>
> Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
> some architectures, for example.

Note this would kill the performance that srcu gives that Joel wants.
Switching from a this_cpu_inc() to a atomic_inc() would be a huge
impact.

There's also a local_inc() if you are using per cpu pointers, that is
suppose to guarantee atomicity for single cpu operations. That's what
the ftrace ring buffer uses.

-- Steve
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 8, 2018 at 6:00 AM, Paul E. McKenney wrote:
> On Tue, Aug 07, 2018 at 08:53:54PM -0700, Joel Fernandes wrote:
>> On Tue, Aug 7, 2018 at 8:44 PM, Joel Fernandes wrote:
>> > Hi Steve,
>> >
>> > On Tue, Aug 7, 2018 at 7:28 PM, Steven Rostedt wrote:
>> [...]
>> >>> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
>> >>> 		} while ((++it_func_ptr)->func);	\
>> >>> 	}						\
>> >>> 							\
>> >>> -	if (rcuidle)					\
>> >>> -		srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
>> >>> +	srcu_read_unlock_notrace(ss, idx);		\
>> >>
>> >> Hmm, why do we have the two different srcu handles?
>> >
>> > Because if the memory operations happening on the normal SRCU handle
>> > (during srcu_read_lock) is interrupted by NMI, then the other handle
>> > (devoted to NMI) could be used instead and not bother the interrupted
>> > handle. Does that make sense?
>> >
>> > When I talked to Paul few months ago about SRCU from NMI context, he
>> > mentioned the per-cpu memory operations during srcu_read_lock can be
>> > NMI interrupted, that's why we added that warning.
>>
>> So I looked more closely, __srcu_read_lock on 2 different handles may
>> still be doing a this_cpu_inc on the same location..
>> (sp->sda->srcu_lock_count). :-(
>>
>> Paul any ideas on how to solve this?
>
> You lost me on this one. When you said "2 different handles", I assumed
> that you meant two different values of "sp", which would have two
> different addresses for &sp->sda->srcu_lock_count. What am I missing?

Thanks a lot for the reply. I thought "sda" is the same for different
srcu_struct(s). Maybe it was too late for me in the night, that's why I
thought so? Which makes no sense now that I think of it.

In that case based on what you're saying, the patch I sent to using
different srcu_struct for NMI is still good I guess...

>> It does start to seem like a show stopper :-(
>
> I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
> be added, which would do atomic ops on sp->sda->srcu_lock_count. Not sure
> whether this would be fast enough to be useful, but easy to provide:
>
> int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
> {
> 	int idx;
>
> 	idx = READ_ONCE(sp->srcu_idx) & 0x1;
> 	atomic_inc(&sp->sda->srcu_lock_count[idx]);
> 	smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
> 	return idx;
> }
>
> void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
> {
> 	smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
> 	atomic_inc(&sp->sda->srcu_unlock_count[idx]);
> }
>
> With appropriate adjustments to also allow Tiny RCU to also work.
>
> Note that you have to use _nmi() everywhere, not just in NMI handlers.
> In fact, the NMI handlers are the one place you -don't- need to use
> _nmi(), strangely enough.
>
> Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
> some architectures, for example.

Continuing Steve's question on regular interrupts, do we need to use
this atomic_inc API for regular interrupts as well?

Thanks!
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, 8 Aug 2018 06:03:02 -0700
"Paul E. McKenney" wrote:

> > What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> > it wont be atomic for the capture of the idx. But I also don't see
> > interrupts being disabled, thus an NMI is no different than any
> > interrupt doing the same thing, right?
>
> On architectures without increment-memory instructions, if you take an NMI
> between the load from sp->sda->srcu_lock_count and the later store, you
> lose a count. Note that both __srcu_read_lock() and __srcu_read_unlock()
> do increments of different locations, so you cannot rely on the usual
> "NMI fixes up before exit" semantics you get when incrementing and
> decrementing the same location.

And how is this handled in the interrupt case? Interrupts are not
disabled here.

I would also argue that architectures without increment-memory
instructions shouldn't have NMIs ;-)

-- Steve
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Wed, Aug 08, 2018 at 08:46:29AM -0400, Steven Rostedt wrote:
> On Tue, 7 Aug 2018 20:53:54 -0700
> Joel Fernandes wrote:
>
> > > When I talked to Paul few months ago about SRCU from NMI context, he
> > > mentioned the per-cpu memory operations during srcu_read_lock can be
> > > NMI interrupted, that's why we added that warning.
> >
> > So I looked more closely, __srcu_read_lock on 2 different handles may
> > still be doing a this_cpu_inc on the same location..
> > (sp->sda->srcu_lock_count). :-(
> >
> > Paul any ideas on how to solve this?
> >
> > It does start to seem like a show stopper :-(
>
> What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although
> it wont be atomic for the capture of the idx. But I also don't see
> interrupts being disabled, thus an NMI is no different than any
> interrupt doing the same thing, right?

On architectures without increment-memory instructions, if you take an NMI
between the load from sp->sda->srcu_lock_count and the later store, you
lose a count. Note that both __srcu_read_lock() and __srcu_read_unlock()
do increments of different locations, so you cannot rely on the usual
"NMI fixes up before exit" semantics you get when incrementing and
decrementing the same location.

							Thanx, Paul
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Tue, Aug 07, 2018 at 08:53:54PM -0700, Joel Fernandes wrote:
> On Tue, Aug 7, 2018 at 8:44 PM, Joel Fernandes wrote:
> > Hi Steve,
> >
> > On Tue, Aug 7, 2018 at 7:28 PM, Steven Rostedt wrote:
> [...]
> >>> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
> >>> 		} while ((++it_func_ptr)->func);	\
> >>> 	}						\
> >>> 							\
> >>> -	if (rcuidle)					\
> >>> -		srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
> >>> +	srcu_read_unlock_notrace(ss, idx);		\
> >>
> >> Hmm, why do we have the two different srcu handles?
> >
> > Because if the memory operations happening on the normal SRCU handle
> > (during srcu_read_lock) is interrupted by NMI, then the other handle
> > (devoted to NMI) could be used instead and not bother the interrupted
> > handle. Does that make sense?
> >
> > When I talked to Paul few months ago about SRCU from NMI context, he
> > mentioned the per-cpu memory operations during srcu_read_lock can be
> > NMI interrupted, that's why we added that warning.
>
> So I looked more closely, __srcu_read_lock on 2 different handles may
> still be doing a this_cpu_inc on the same location..
> (sp->sda->srcu_lock_count). :-(
>
> Paul any ideas on how to solve this?

You lost me on this one. When you said "2 different handles", I assumed
that you meant two different values of "sp", which would have two
different addresses for &sp->sda->srcu_lock_count. What am I missing?

> It does start to seem like a show stopper :-(

I suppose that an srcu_read_lock_nmi() and srcu_read_unlock_nmi() could
be added, which would do atomic ops on sp->sda->srcu_lock_count. Not sure
whether this would be fast enough to be useful, but easy to provide:

int __srcu_read_lock_nmi(struct srcu_struct *sp)  /* UNTESTED. */
{
	int idx;

	idx = READ_ONCE(sp->srcu_idx) & 0x1;
	atomic_inc(&sp->sda->srcu_lock_count[idx]);
	smp_mb__after_atomic(); /* B */  /* Avoid leaking critical section. */
	return idx;
}

void __srcu_read_unlock_nmi(struct srcu_struct *sp, int idx)
{
	smp_mb__before_atomic(); /* C */  /* Avoid leaking critical section. */
	atomic_inc(&sp->sda->srcu_unlock_count[idx]);
}

With appropriate adjustments to also allow Tiny RCU to also work.

Note that you have to use _nmi() everywhere, not just in NMI handlers.
In fact, the NMI handlers are the one place you -don't- need to use
_nmi(), strangely enough.

Might be worth a try -- smp_mb__{before,after}_atomic() is a no-op on
some architectures, for example.

							Thanx, Paul
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Tue, 7 Aug 2018 20:53:54 -0700 Joel Fernandes wrote: > > When I talked to Paul a few months ago about SRCU from NMI context, he > > mentioned the per-cpu memory operations during srcu_read_lock can be > > NMI interrupted, that's why we added that warning. > > So I looked more closely, __srcu_read_lock on 2 different handles may > still be doing a this_cpu_inc on the same location.. > (sp->sda->srcu_lock_count). :-( > > Paul, any ideas on how to solve this? > > It does start to seem like a show stopper :-( What's wrong with a this_cpu_inc()? It's atomic for the CPU. Although it won't be atomic for the capture of the idx. But I also don't see interrupts being disabled, thus an NMI is no different than any interrupt doing the same thing, right? -- Steve
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Tue, Aug 7, 2018 at 8:53 PM, Joel Fernandes wrote: > On Tue, Aug 7, 2018 at 8:44 PM, Joel Fernandes wrote: >> Hi Steve, >> >> On Tue, Aug 7, 2018 at 7:28 PM, Steven Rostedt wrote: > [...] @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void); } while ((++it_func_ptr)->func);\ } \ \ - if (rcuidle)\ - srcu_read_unlock_notrace(&tracepoint_srcu, idx);\ + srcu_read_unlock_notrace(ss, idx); \ >>> >>> Hmm, why do we have the two different srcu handles? >> >> Because if the memory operations happening on the normal SRCU handle >> (during srcu_read_lock) are interrupted by NMI, then the other handle >> (devoted to NMI) could be used instead and not bother the interrupted >> handle. Does that make sense? >> >> When I talked to Paul a few months ago about SRCU from NMI context, he >> mentioned the per-cpu memory operations during srcu_read_lock can be >> NMI interrupted, that's why we added that warning. > > So I looked more closely, __srcu_read_lock on 2 different handles may > still be doing a this_cpu_inc on the same location.. > (sp->sda->srcu_lock_count). :-( > > Paul, any ideas on how to solve this? > > It does start to seem like a show stopper :-( Steve, another thing we could do is drop "tracepoint: Make rcuidle tracepoint callers use SRCU" and just call into the irqsoff and preemptoff tracer hooks directly. The reason I added "tracepoint: Make rcuidle tracepoint callers use SRCU" is because lockdep's performance dropped with the existing tracepoint code and SRCU improved that. But now that you're calling directly into lockdep, that shouldn't be an issue. That should resolve your NMI issues and keep my macro and other clean-ups from the original patch. What do you think? I am out in the morning, but I could write this up in the evening when I get some time (unless you do it before me). thanks, - Joel
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Tue, Aug 7, 2018 at 8:44 PM, Joel Fernandes wrote: > Hi Steve, > > On Tue, Aug 7, 2018 at 7:28 PM, Steven Rostedt wrote: [...] >>> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void); >>> } while ((++it_func_ptr)->func);\ >>> } \ >>> \ >>> - if (rcuidle)\ >>> - srcu_read_unlock_notrace(&tracepoint_srcu, idx);\ >>> + srcu_read_unlock_notrace(ss, idx); \ >> >> Hmm, why do we have the two different srcu handles? > > Because if the memory operations happening on the normal SRCU handle > (during srcu_read_lock) are interrupted by NMI, then the other handle > (devoted to NMI) could be used instead and not bother the interrupted > handle. Does that make sense? > > When I talked to Paul a few months ago about SRCU from NMI context, he > mentioned the per-cpu memory operations during srcu_read_lock can be > NMI interrupted, that's why we added that warning. So I looked more closely, __srcu_read_lock on 2 different handles may still be doing a this_cpu_inc on the same location.. (sp->sda->srcu_lock_count). :-( Paul, any ideas on how to solve this? It does start to seem like a show stopper :-(
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
Hi Steve, On Tue, Aug 7, 2018 at 7:28 PM, Steven Rostedt wrote: > On Tue, 7 Aug 2018 19:13:32 -0700 > Joel Fernandes wrote: >> > >> >> From 6986af946ceb04fc9ddc6d5b45fc559b6807e465 Mon Sep 17 00:00:00 2001 >> >> From: "Joel Fernandes (Google)" >> >> Date: Sun, 5 Aug 2018 20:17:41 -0700 >> >> Subject: [PATCH] tracepoint: Run tracepoints even after CPU is offline >> >> >> >> Commit f37755490fe9 ("tracepoints: Do not trace when cpu is offline") >> >> causes a problem for lockdep using tracepoint code. Once a CPU is >> >> offline, tracepoints donot get called, however this causes a big problem >> >> for lockdep probes that need to run so that IRQ annotations are marked >> >> correctly. >> >> >> >> A race is possible where while the CPU is going offline, an interrupt >> >> can come in and then a lockdep assert causes an annotation warning: >> >> >> >> [ 106.551354] IRQs not enabled as expected >> >> [ 106.551785] WARNING: CPU: 1 PID: 0 at kernel/time/tick-sched.c:982 >> >> tick_nohz_idle_enter+0x99/0xb0 >> >> [ 106.552964] Modules linked in: >> >> [ 106.553299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GW >> >> >> >> We need tracepoints to run as late as possible. This commit fixes the >> >> issue by removing the cpu_online check in tracepoint code that was >> >> introduced in the mentioned commit, however we now switch to using SRCU >> >> for all tracepoints and special handle calling tracepoints from NMI so >> >> that we don't run into issues that result from using sched-RCU when the >> >> CPUs are marked to be offline. >> >> >> >> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and >> >> unify their usage") >> >> Reported-by: Masami Hiramatsu >> >> Signed-off-by: Joel Fernandes (Google) >> > >> > >> > The above change log doesn't look like it matches the NMI patch. >> > >> > Can you resend with just the NMI changes? I already handled the cpu >> > offline ones. 
>> >> Ok, sent with "cpu offline" changes dropped, here it is: >> https://lore.kernel.org/patchwork/patch/972657/ >> >> If you could add your Reported-by to it, that would be great as well. >> >> > >> > But I may still have concerns with this patch. >> >> Ok let me know what you think. >> > > Not sure you saw this part of my reply: I missed the part on the srcu handles, sorry. > >> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void); >> } while ((++it_func_ptr)->func);\ >> } \ >> \ >> - if (rcuidle)\ >> - srcu_read_unlock_notrace(&tracepoint_srcu, idx);\ >> + srcu_read_unlock_notrace(ss, idx); \ > > Hmm, why do we have the two different srcu handles? Because if the memory operations happening on the normal SRCU handle (during srcu_read_lock) are interrupted by NMI, then the other handle (devoted to NMI) could be used instead and not bother the interrupted handle. Does that make sense? When I talked to Paul a few months ago about SRCU from NMI context, he mentioned the per-cpu memory operations during srcu_read_lock can be NMI interrupted, that's why we added that warning. > Thinking about this, if this can be called by the "thunk" code, then > there might be an issue. I think the "thunk" code can be called before > in_nmi() is set, so we don't know if we are in an NMI or not. I need to > look at that code to make sure. If in_nmi() still works, then we should > use the _nmi srcu handle (if that's required). > > But I'm not sure using SRCU for all tracepoints is needed at this > moment. I'll have to look deeper into this tomorrow. But it's getting > close to the merge window, and this needs to be settled quick. Another > "partial revert" may be needed until this gets settled. I did read this part, yes I'm not sure either. You mentioned you would confirm that in the morning; I look forward to it. I hope the in_nmi() function is reliable to use from here.
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Tue, 7 Aug 2018 19:13:32 -0700 Joel Fernandes wrote: > On Tue, Aug 7, 2018 at 6:55 PM, Steven Rostedt wrote: > > On Tue, 7 Aug 2018 18:17:42 -0700 > > Joel Fernandes wrote: > > > >> From 6986af946ceb04fc9ddc6d5b45fc559b6807e465 Mon Sep 17 00:00:00 2001 > >> From: "Joel Fernandes (Google)" > >> Date: Sun, 5 Aug 2018 20:17:41 -0700 > >> Subject: [PATCH] tracepoint: Run tracepoints even after CPU is offline > >> > >> Commit f37755490fe9 ("tracepoints: Do not trace when cpu is offline") > >> causes a problem for lockdep using tracepoint code. Once a CPU is > >> offline, tracepoints donot get called, however this causes a big problem > >> for lockdep probes that need to run so that IRQ annotations are marked > >> correctly. > >> > >> A race is possible where while the CPU is going offline, an interrupt > >> can come in and then a lockdep assert causes an annotation warning: > >> > >> [ 106.551354] IRQs not enabled as expected > >> [ 106.551785] WARNING: CPU: 1 PID: 0 at kernel/time/tick-sched.c:982 > >> tick_nohz_idle_enter+0x99/0xb0 > >> [ 106.552964] Modules linked in: > >> [ 106.553299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GW > >> > >> We need tracepoints to run as late as possible. This commit fixes the > >> issue by removing the cpu_online check in tracepoint code that was > >> introduced in the mentioned commit, however we now switch to using SRCU > >> for all tracepoints and special handle calling tracepoints from NMI so > >> that we don't run into issues that result from using sched-RCU when the > >> CPUs are marked to be offline. > >> > >> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and > >> unify their usage") > >> Reported-by: Masami Hiramatsu > >> Signed-off-by: Joel Fernandes (Google) > > > > > > The above change log doesn't look like it matches the NMI patch. > > > > Can you resend with just the NMI changes? I already handled the cpu > > offline ones. 
> > Ok, sent with "cpu offline" changes dropped, here it is: > https://lore.kernel.org/patchwork/patch/972657/ > > If you could add your Reported-by to it, that would be great as well. > > > > > But I may still have concerns with this patch. > > Ok let me know what you think. > Not sure you saw this part of my reply: > @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void); > } while ((++it_func_ptr)->func);\ > } \ > \ > - if (rcuidle)\ > - srcu_read_unlock_notrace(&tracepoint_srcu, idx);\ > + srcu_read_unlock_notrace(ss, idx); \ Hmm, why do we have the two different srcu handles? Thinking about this, if this can be called by the "thunk" code, then there might be an issue. I think the "thunk" code can be called before in_nmi() is set, so we don't know if we are in an NMI or not. I need to look at that code to make sure. If in_nmi() still works, then we should use the _nmi srcu handle (if that's required). But I'm not sure using SRCU for all tracepoints is needed at this moment. I'll have to look deeper into this tomorrow. But it's getting close to the merge window, and this needs to be settled quick. Another "partial revert" may be needed until this gets settled. -- Steve
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Tue, Aug 7, 2018 at 6:55 PM, Steven Rostedt wrote: > On Tue, 7 Aug 2018 18:17:42 -0700 > Joel Fernandes wrote: > >> From 6986af946ceb04fc9ddc6d5b45fc559b6807e465 Mon Sep 17 00:00:00 2001 >> From: "Joel Fernandes (Google)" >> Date: Sun, 5 Aug 2018 20:17:41 -0700 >> Subject: [PATCH] tracepoint: Run tracepoints even after CPU is offline >> >> Commit f37755490fe9 ("tracepoints: Do not trace when cpu is offline") >> causes a problem for lockdep using tracepoint code. Once a CPU is >> offline, tracepoints donot get called, however this causes a big problem >> for lockdep probes that need to run so that IRQ annotations are marked >> correctly. >> >> A race is possible where while the CPU is going offline, an interrupt >> can come in and then a lockdep assert causes an annotation warning: >> >> [ 106.551354] IRQs not enabled as expected >> [ 106.551785] WARNING: CPU: 1 PID: 0 at kernel/time/tick-sched.c:982 >> tick_nohz_idle_enter+0x99/0xb0 >> [ 106.552964] Modules linked in: >> [ 106.553299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GW >> >> We need tracepoints to run as late as possible. This commit fixes the >> issue by removing the cpu_online check in tracepoint code that was >> introduced in the mentioned commit, however we now switch to using SRCU >> for all tracepoints and special handle calling tracepoints from NMI so >> that we don't run into issues that result from using sched-RCU when the >> CPUs are marked to be offline. >> >> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and >> unify their usage") >> Reported-by: Masami Hiramatsu >> Signed-off-by: Joel Fernandes (Google) > > > The above change log doesn't look like it matches the NMI patch. > > Can you resend with just the NMI changes? I already handled the cpu > offline ones. Ok, sent with "cpu offline" changes dropped, here it is: https://lore.kernel.org/patchwork/patch/972657/ If you could add your Reported-by to it, that would be great as well. 
> > But I may still have concerns with this patch. Ok let me know what you think. Thanks! - Joel
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Tue, 7 Aug 2018 18:17:42 -0700 Joel Fernandes wrote: > From 6986af946ceb04fc9ddc6d5b45fc559b6807e465 Mon Sep 17 00:00:00 2001 > From: "Joel Fernandes (Google)" > Date: Sun, 5 Aug 2018 20:17:41 -0700 > Subject: [PATCH] tracepoint: Run tracepoints even after CPU is offline > > Commit f37755490fe9 ("tracepoints: Do not trace when cpu is offline") > causes a problem for lockdep using tracepoint code. Once a CPU is > offline, tracepoints donot get called, however this causes a big problem > for lockdep probes that need to run so that IRQ annotations are marked > correctly. > > A race is possible where while the CPU is going offline, an interrupt > can come in and then a lockdep assert causes an annotation warning: > > [ 106.551354] IRQs not enabled as expected > [ 106.551785] WARNING: CPU: 1 PID: 0 at kernel/time/tick-sched.c:982 > tick_nohz_idle_enter+0x99/0xb0 > [ 106.552964] Modules linked in: > [ 106.553299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GW > > We need tracepoints to run as late as possible. This commit fixes the > issue by removing the cpu_online check in tracepoint code that was > introduced in the mentioned commit, however we now switch to using SRCU > for all tracepoints and special handle calling tracepoints from NMI so > that we don't run into issues that result from using sched-RCU when the > CPUs are marked to be offline. > > Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and > unify their usage") > Reported-by: Masami Hiramatsu > Signed-off-by: Joel Fernandes (Google) The above change log doesn't look like it matches the NMI patch. Can you resend with just the NMI changes? I already handled the cpu offline ones. But I may still have concerns with this patch. 
> ---
>  include/linux/tracepoint.h | 27 +++
>  kernel/tracepoint.c        | 10 ++
>  2 files changed, 17 insertions(+), 20 deletions(-)
>
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index d9a084c72541..5733502bb3ce 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -35,6 +35,7 @@ struct trace_eval_map {
>  #define TRACEPOINT_DEFAULT_PRIO 10
>
>  extern struct srcu_struct tracepoint_srcu;
> +extern struct srcu_struct tracepoint_srcu_nmi;
>
>  extern int
>  tracepoint_probe_register(struct tracepoint *tp, void *probe, void *data);
> @@ -144,13 +145,11 @@ extern void syscall_unregfunc(void);
> 		void *it_func;					\
> 		void *__data;					\
> 		int __maybe_unused idx = 0;			\
> +		struct srcu_struct *ss;				\
> 								\
> 		if (!(cond))					\
> 			return;					\
> 								\
> -		/* srcu can't be used from NMI */		\
> -		WARN_ON_ONCE(rcuidle && in_nmi());		\
> -								\
> 		/* keep srcu and sched-rcu usage consistent */	\
> 		preempt_disable_notrace();			\
> 								\
> @@ -159,7 +158,11 @@ extern void syscall_unregfunc(void);
> 		 * doesn't work from the idle path.		\
> 		 */						\
> 		if (rcuidle)					\
> -			idx = srcu_read_lock_notrace(&tracepoint_srcu); \
> +			ss = &tracepoint_srcu_nmi;		\
> +		else						\
> +			ss = &tracepoint_srcu;			\
> +								\
> +		idx = srcu_read_lock_notrace(ss);		\
> 								\
> 		it_func_ptr = rcu_dereference_raw((tp)->funcs);	\
> 								\
> @@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
> 			} while ((++it_func_ptr)->func);	\
> 		}						\
> 								\
> -		if (rcuidle)					\
> -			srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
> +		srcu_read_unlock_notrace(ss, idx);		\

Hmm, why do we have the two different srcu handles?
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Tue, Aug 7, 2018 at 5:48 PM, Steven Rostedt wrote:

> On Tue, 07 Aug 2018 19:54:13 -0400 Joel Fernandes wrote:
>
>> >OK, I hit this bug, but it's not because of the partial revert. This
>> >bug seems to need another partial revert. I like your movement of
>> >the code, but I'm starting to doubt that we can use a trace event as
>> >a hook for critical areas yet. Well, not until we can use srcu in NMI.
>> >
>>
>> I sent a patch to use srcu for all tracepoints including nmi. That
>> patch also removes this warning and fixes the one other issue Masami
>> reported (hot plug causing a warning).
>
> Is it one I can take, or does Paul have it? I'll need it to continue
> with this as is, because I can't pass my tests without triggering that
> warning (and that fails the test).

I think you could take it if you're OK with it; it's purely in the tracing code and I tested it yesterday morning. It's attached to this email. I tested that it fixes the mmio trace (hotplug related) issue.

>> If Paul and Mathieu can confirm SRCU works on offline CPUs that would
>> be great.
>
> Yes please.

BTW I found this in Paul's RCU Requirements docs: "SRCU is insensitive to whether or not a CPU is online", so I think it will work. The paragraph was:

    Also unlike other RCU flavors, SRCU's callbacks-wait function
    srcu_barrier() may be invoked from CPU-hotplug notifiers, though
    this is not necessarily a good idea. The reason that this is
    possible is that SRCU is insensitive to whether or not a CPU is
    online, which means that srcu_barrier() need not exclude CPU-hotplug
    operations.

>> It's just this one warning or anything else that makes you feel
>> tracepoints for critical paths isn't feasible?
>
> Currently I'm down to this, but I can't get past my first test in my
> ktest suite, because it fails here.

Ok, hopefully this will get you past that. Sorry for the frustration.
From 6986af946ceb04fc9ddc6d5b45fc559b6807e465 Mon Sep 17 00:00:00 2001
From: "Joel Fernandes (Google)"
Date: Sun, 5 Aug 2018 20:17:41 -0700
Subject: [PATCH] tracepoint: Run tracepoints even after CPU is offline

Commit f37755490fe9 ("tracepoints: Do not trace when cpu is offline") causes a problem for lockdep using tracepoint code. Once a CPU is offline, tracepoints do not get called; however, this causes a big problem for lockdep probes that need to run so that IRQ annotations are marked correctly.

A race is possible where, while the CPU is going offline, an interrupt can come in and then a lockdep assert causes an annotation warning:

[  106.551354] IRQs not enabled as expected
[  106.551785] WARNING: CPU: 1 PID: 0 at kernel/time/tick-sched.c:982 tick_nohz_idle_enter+0x99/0xb0
[  106.552964] Modules linked in:
[  106.553299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W

We need tracepoints to run as late as possible. This commit fixes the issue by removing the cpu_online check in tracepoint code that was introduced in the mentioned commit; however, we now switch to using SRCU for all tracepoints and specially handle calling tracepoints from NMI so that we don't run into issues that result from using sched-RCU when the CPUs are marked to be offline.
Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage")
Reported-by: Masami Hiramatsu
Signed-off-by: Joel Fernandes (Google)
---
 include/linux/tracepoint.h | 27 +++
 kernel/tracepoint.c        | 10 ++
 2 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index d9a084c72541..5733502bb3ce 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -35,6 +35,7 @@ struct trace_eval_map {
 #define TRACEPOINT_DEFAULT_PRIO 10

 extern struct srcu_struct tracepoint_srcu;
+extern struct srcu_struct tracepoint_srcu_nmi;

 extern int
 tracepoint_probe_register(struct tracepoint *tp, void *probe, void *data);
@@ -144,13 +145,11 @@ extern void syscall_unregfunc(void);
 		void *it_func;						\
 		void *__data;						\
 		int __maybe_unused idx = 0;				\
+		struct srcu_struct *ss;					\
 									\
 		if (!(cond))						\
 			return;						\
 									\
-		/* srcu can't be used from NMI */			\
-		WARN_ON_ONCE(rcuidle && in_nmi());			\
-									\
 		/* keep srcu and sched-rcu usage consistent */		\
 		preempt_disable_notrace();				\
 									\
@@ -159,7 +158,11 @@ extern void syscall_unregfunc(void);
 		 * doesn't work from the idle path.			\
 		 */							\
 		if (rcuidle)						\
-			idx = srcu_read_lock_notrace(&tracepoint_srcu);	\
+			ss = &tracepoint_srcu_nmi;			\
+		else							\
+			ss = &tracepoint_srcu;				\
+									\
+		idx = srcu_read_lock_notrace(ss);			\
 									\
 		it_func_ptr = rcu_dereference_raw((tp)->funcs);		\
 									\
@@ -171,8 +174,7 @@ extern void syscall_unregfunc(void);
 			} while ((++it_func_ptr)->func);		\
 		}							\
 									\
-		if (rcuidle)						\
-			srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
+		srcu_read_unlock_notrace(ss, idx);			\
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Tue, 07 Aug 2018 19:54:13 -0400 Joel Fernandes wrote:

> >OK, I hit this bug, but it's not because of the partial revert. This
> >bug seems to need another partial revert. I like your movement of the
> >code, but I'm starting to doubt that we can use a trace event as a
> >hook for critical areas yet. Well, not until we can use srcu in NMI.
> >
>
> I sent a patch to use srcu for all tracepoints including nmi. That
> patch also removes this warning and fixes the one other issue Masami
> reported (hot plug causing a warning).

Is it one I can take, or does Paul have it? I'll need it to continue with this as is, because I can't pass my tests without triggering that warning (and that fails the test).

> If Paul and Mathieu can confirm SRCU works on offline CPUs that would
> be great.

Yes please.

> It's just this one warning or anything else that makes you feel
> tracepoints for critical paths isn't feasible?

Currently I'm down to this, but I can't get past my first test in my ktest suite, because it fails here.

-- Steve
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On August 7, 2018 7:45:15 PM EDT, Steven Rostedt wrote:

>On Tue, 07 Aug 2018 11:24:13 -0400 Joel Fernandes wrote:
>
>> On August 7, 2018 11:09:06 AM EDT, Steven Rostedt wrote:
>> >On Tue, 07 Aug 2018 10:48:05 -0400 Joel Fernandes wrote:
>> >
>> >> >You mean if someone adds a tracepoint callback to the irq disable
>> >> >tracepoint, and did a lockdep assert to make sure interrupts are
>> >> >disabled?
>> >>
>> >> Yes that's what I meant.
>> >
>> >That sounds like a "Doctor, it hurts me when I do this" problem ;-)
>>
>> Haha, yes true. But just to clarify, I didn't do this to see the
>> problem but noticed it with turning on existing things. :-) but I see
>> your point...
>
>OK, I hit this bug, but it's not because of the partial revert. This
>bug seems to need another partial revert. I like your movement of the
>code, but I'm starting to doubt that we can use a trace event as a hook
>for critical areas yet. Well, not until we can use srcu in NMI.

I sent a patch to use srcu for all tracepoints including nmi. That patch also removes this warning and fixes the one other issue Masami reported (hot plug causing a warning).

If Paul and Mathieu can confirm SRCU works on offline CPUs that would be great.

It's just this one warning or anything else that makes you feel tracepoints for critical paths isn't feasible?

Thanks.

>#define __DO_TRACE(tp, proto, args, cond, rcuidle)			\
>	do {								\
>		struct tracepoint_func *it_func_ptr;			\
>		void *it_func;						\
>		void *__data;						\
>		int __maybe_unused idx = 0;				\
>									\
>		if (!(cond))						\
>			return;						\
>									\
>		/* srcu can't be used from NMI */			\
>		WARN_ON_ONCE(rcuidle && in_nmi());   <== WARN_ON_ONCE hit!
>
>WARNING: CPU: 3 PID: 3727 at
>/work/build/trace/nobackup/linux-test.git/include/trace/events/preemptirq.h:38
>trace_irq_disable_rcuidle+0x2a/0x6c
>Modules linked in: ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
>nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 crc_ccitt r8169 ppdev
>parport_pc parport
>CPU: 3 PID: 3727 Comm: trace-cmd Not tainted 4.18.0-rc6-test+ #14
>Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
>EIP: trace_irq_disable_rcuidle+0x2a/0x6c
>Code: e9 01 00 00 00 c3 64 8b 0d 24 e1 50 c1 0f a3 0d e0 f1 3f c1 73 55
>55 89 e5 57 56 53 51 64 8b 0d cc 37 51 c1 0f ba e1 14 73 02 <0f> 0b 89
>d7 89 c6 b8 e0 d8 2e c1 e8 8e 5b fa ff 8b 1d 9c 27 3d c1
>EAX: c0401509 EBX: efa43680 ECX: 8011 EDX: c0dc81f3
>ESI: ed823d44 EDI: efa43680 EBP: ed823cd0 ESP: ed823cc0
>DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010047
>CR0: 80050033 CR2: b7f06000 CR3: 30513b00 CR4: 001406f0
>Call Trace:
> trace_hardirqs_off_caller+0x23/0x2d
> trace_hardirqs_off_thunk+0xc/0x10
>EIP: default_do_nmi+0x1/0x157
>
>-- Steve

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Tue, 07 Aug 2018 11:24:13 -0400 Joel Fernandes wrote:

> On August 7, 2018 11:09:06 AM EDT, Steven Rostedt wrote:
> >On Tue, 07 Aug 2018 10:48:05 -0400 Joel Fernandes wrote:
> >
> >> >You mean if someone adds a tracepoint callback to the irq disable
> >> >tracepoint, and did a lockdep assert to make sure interrupts are
> >> >disabled?
> >>
> >> Yes that's what I meant.
> >
> >That sounds like a "Doctor, it hurts me when I do this" problem ;-)
>
> Haha, yes true. But just to clarify, I didn't do this to see the
> problem but noticed it with turning on existing things. :-) but I see
> your point...

OK, I hit this bug, but it's not because of the partial revert. This bug seems to need another partial revert. I like your movement of the code, but I'm starting to doubt that we can use a trace event as a hook for critical areas yet. Well, not until we can use srcu in NMI.

#define __DO_TRACE(tp, proto, args, cond, rcuidle)			\
	do {								\
		struct tracepoint_func *it_func_ptr;			\
		void *it_func;						\
		void *__data;						\
		int __maybe_unused idx = 0;				\
									\
		if (!(cond))						\
			return;						\
									\
		/* srcu can't be used from NMI */			\
		WARN_ON_ONCE(rcuidle && in_nmi());   <== WARN_ON_ONCE hit!
WARNING: CPU: 3 PID: 3727 at /work/build/trace/nobackup/linux-test.git/include/trace/events/preemptirq.h:38 trace_irq_disable_rcuidle+0x2a/0x6c
Modules linked in: ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipv6 crc_ccitt r8169 ppdev parport_pc parport
CPU: 3 PID: 3727 Comm: trace-cmd Not tainted 4.18.0-rc6-test+ #14
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
EIP: trace_irq_disable_rcuidle+0x2a/0x6c
Code: e9 01 00 00 00 c3 64 8b 0d 24 e1 50 c1 0f a3 0d e0 f1 3f c1 73 55 55 89 e5 57 56 53 51 64 8b 0d cc 37 51 c1 0f ba e1 14 73 02 <0f> 0b 89 d7 89 c6 b8 e0 d8 2e c1 e8 8e 5b fa ff 8b 1d 9c 27 3d c1
EAX: c0401509 EBX: efa43680 ECX: 8011 EDX: c0dc81f3
ESI: ed823d44 EDI: efa43680 EBP: ed823cd0 ESP: ed823cc0
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010047
CR0: 80050033 CR2: b7f06000 CR3: 30513b00 CR4: 001406f0
Call Trace:
 trace_hardirqs_off_caller+0x23/0x2d
 trace_hardirqs_off_thunk+0xc/0x10
EIP: default_do_nmi+0x1/0x157

-- Steve
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On August 7, 2018 11:09:06 AM EDT, Steven Rostedt wrote: >On Tue, 07 Aug 2018 10:48:05 -0400 >Joel Fernandes wrote: > >> >You mean if someone add a tracepoint callback to the irq disable >> >tracepoint, and did a lockdep assert to make sure interrupts are >> >disabled? >> >> Yes that's what I meant. > >That sounds like a "Doctor, it hurts me when I do this" problem ;-) Haha, yes true. But just to clarify, I didn't do this to see the problem but noticed it with turning on existing things. :-) but I see your point... Thanks, - Joel -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Tue, 07 Aug 2018 10:48:05 -0400 Joel Fernandes wrote: > >You mean if someone add a tracepoint callback to the irq disable > >tracepoint, and did a lockdep assert to make sure interrupts are > >disabled? > > Yes that's what I meant. That sounds like a "Doctor, it hurts me when I do this" problem ;-) -- Steve
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On August 7, 2018 10:34:10 AM EDT, Steven Rostedt wrote:

>On Tue, 07 Aug 2018 10:10:59 -0400 Joel Fernandes wrote:
>
>> On August 7, 2018 9:49:54 AM EDT, Steven Rostedt wrote:
>> >On Tue, 7 Aug 2018 06:33:35 -0700 Joel Fernandes wrote:
>> >
>> >> Thanks, also one more thing I noticed in your patch:
>> >> lockdep_hardirqs_off needs to be called before all other probes,
>> >> but you're calling it after. This is why I registered it with
>> >> INT_MAX:
>> >>
>> >> register_trace_prio_irq_disable(lockdep_hardirqs_off, NULL, INT_MAX);
>> >>
>> >> Without it you may get annotation warnings. Thanks,
>> >
>> >Interesting. I was following the old way where we called the tracing
>> >code before calling the lockdep code (all hard coded and not from
>> >trace events). Does this have something to do with calling the code
>> >from a tracepoint?
>> >
>> >Do you have an example that could trigger the warning?
>> >
>>
>> I remember the warnings but can't remember now how I triggered them.
>> I think I saw them with the irqsoff tracer or irq trace events
>> running, with lockdep turned on.
>
>I'll see if I can trigger it. I'll run this all through my tests.

Ok.

>> Also an irq disable probe that does a lockdep assert that irqs are
>> disabled could trigger it?
>>
>
>You mean if someone adds a tracepoint callback to the irq disable
>tracepoint, and did a lockdep assert to make sure interrupts are
>disabled?

Yes, that's what I meant.

Thanks,

- Joel

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Tue, 07 Aug 2018 10:10:59 -0400 Joel Fernandes wrote:

> On August 7, 2018 9:49:54 AM EDT, Steven Rostedt wrote:
> >On Tue, 7 Aug 2018 06:33:35 -0700 Joel Fernandes wrote:
> >
> >> Thanks, also one more thing I noticed in your patch:
> >> lockdep_hardirqs_off needs to be called before all other probes,
> >> but you're calling it after. This is why I registered it with
> >> INT_MAX:
> >>
> >> register_trace_prio_irq_disable(lockdep_hardirqs_off, NULL, INT_MAX);
> >>
> >> Without it you may get annotation warnings. Thanks,
> >
> >Interesting. I was following the old way where we called the tracing
> >code before calling the lockdep code (all hard coded and not from
> >trace events). Does this have something to do with calling the code
> >from a tracepoint?
> >
> >Do you have an example that could trigger the warning?
> >
>
> I remember the warnings but can't remember now how I triggered them.
> I think I saw them with the irqsoff tracer or irq trace events
> running, with lockdep turned on.

I'll see if I can trigger it. I'll run this all through my tests.

> Also an irq disable probe that does a lockdep assert that irqs are
> disabled could trigger it?

You mean if someone adds a tracepoint callback to the irq disable tracepoint, and did a lockdep assert to make sure interrupts are disabled?

-- Steve
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On August 7, 2018 9:49:54 AM EDT, Steven Rostedt wrote:
>On Tue, 7 Aug 2018 06:33:35 -0700
>Joel Fernandes wrote:
>
>> Thanks, also one more thing I noticed in your patch,
>> lockdep_hardirqs_off needs to be called before all other probes but
>> you're calling it after. This is why I registered it with INT_MAX:
>>
>> register_trace_prio_irq_disable(lockdep_hardirqs_off, NULL, INT_MAX);
>>
>> Without it you may get annotation warnings. Thanks,
>
>Interesting. I was following the old way where we called the tracing
>code before calling the lockdep code (all hard coded and not from
>trace events). Is this have something to do with calling the code from
>a tracepoint?
>
>Do you have an example that could trigger the warning?
>

I remember the warnings but can't remember now how I triggered them.
I think I saw them with the irqsoff tracer or irq trace events
running, with lockdep turned on.

Also an irq disable probe that does a lockdep assert that irqs are
disabled could trigger it?

thanks,

- Joel

>-- Steve

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Tue, 7 Aug 2018 06:33:35 -0700
Joel Fernandes wrote:

> Thanks, also one more thing I noticed in your patch,
> lockdep_hardirqs_off needs to be called before all other probes but
> you're calling it after. This is why I registered it with INT_MAX:
>
> register_trace_prio_irq_disable(lockdep_hardirqs_off, NULL, INT_MAX);
>
> Without it you may get annotation warnings. Thanks,

Interesting. I was following the old way where we called the tracing
code before calling the lockdep code (all hard coded and not from
trace events). Does this have something to do with calling the code
from a tracepoint?

Do you have an example that could trigger the warning?

-- Steve
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Mon, Aug 6, 2018 at 6:43 PM, Steven Rostedt wrote:
> On Mon, 6 Aug 2018 17:43:19 -0700
> Joel Fernandes wrote:
>
>> On Mon, Aug 6, 2018 at 12:50 PM, Steven Rostedt wrote:
>> >
>> > With this patch applied, I'm constantly getting lockdep errors. Instead
>> > of doing a full revert of the patch, I did this, which makes all those
>> > errors go away. I may apply this for now, and we can revisit having
>> > lockdep use the tracepoint code. But since it's currently always
>> > enabled, I'm thinking of just leaving this as is. The macros are still
>> > clean from Joel's patch.
>> >
>> > Thoughts?
>>
>> I like your patch. Thanks a lot for doing this.. It keeps most of the
>> benefits of my patch while avoiding the issues with lockdep. I agree
>> we can look at the remaining lockdep issue after. There were several
>> lockdep issues with this patch that I fixed over the the months, but
>> there's still the one that Masami reported that I believe you're also
>> seeing. Once I'm back I'll work on figuring that one out.
>>
>> Could you pull in the fixes to the other issues I posted though? With
>> that we should be good.
>> https://lore.kernel.org/patchwork/patch/971104/
>> https://lore.kernel.org/patchwork/patch/971829/
>>
>
> I already had these applied when I created this patch ;-)
>
> Thanks, I'll add it.

Thanks, also one more thing I noticed in your patch,
lockdep_hardirqs_off needs to be called before all other probes but
you're calling it after. This is why I registered it with INT_MAX:

register_trace_prio_irq_disable(lockdep_hardirqs_off, NULL, INT_MAX);

Without it you may get annotation warnings. Thanks,

- Joel
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Mon, 6 Aug 2018 17:43:19 -0700
Joel Fernandes wrote:

> On Mon, Aug 6, 2018 at 12:50 PM, Steven Rostedt wrote:
> >
> > With this patch applied, I'm constantly getting lockdep errors. Instead
> > of doing a full revert of the patch, I did this, which makes all those
> > errors go away. I may apply this for now, and we can revisit having
> > lockdep use the tracepoint code. But since it's currently always
> > enabled, I'm thinking of just leaving this as is. The macros are still
> > clean from Joel's patch.
> >
> > Thoughts?
>
> I like your patch. Thanks a lot for doing this.. It keeps most of the
> benefits of my patch while avoiding the issues with lockdep. I agree
> we can look at the remaining lockdep issue after. There were several
> lockdep issues with this patch that I fixed over the the months, but
> there's still the one that Masami reported that I believe you're also
> seeing. Once I'm back I'll work on figuring that one out.
>
> Could you pull in the fixes to the other issues I posted though? With
> that we should be good.
> https://lore.kernel.org/patchwork/patch/971104/
> https://lore.kernel.org/patchwork/patch/971829/
>

I already had these applied when I created this patch ;-)

Thanks, I'll add it.

-- Steve
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
On Mon, Aug 6, 2018 at 12:50 PM, Steven Rostedt wrote:
>
> With this patch applied, I'm constantly getting lockdep errors. Instead
> of doing a full revert of the patch, I did this, which makes all those
> errors go away. I may apply this for now, and we can revisit having
> lockdep use the tracepoint code. But since it's currently always
> enabled, I'm thinking of just leaving this as is. The macros are still
> clean from Joel's patch.
>
> Thoughts?

I like your patch. Thanks a lot for doing this. It keeps most of the
benefits of my patch while avoiding the issues with lockdep. I agree
we can look at the remaining lockdep issue after. There were several
lockdep issues with this patch that I fixed over the months, but
there's still the one that Masami reported that I believe you're also
seeing. Once I'm back I'll work on figuring that one out.

Could you pull in the fixes to the other issues I posted though? With
that we should be good.
https://lore.kernel.org/patchwork/patch/971104/
https://lore.kernel.org/patchwork/patch/971829/

thanks,

- Joel
Re: [PATCH v12 3/3] tracing: Centralize preemptirq tracepoints and unify their usage
With this patch applied, I'm constantly getting lockdep errors. Instead
of doing a full revert of the patch, I did this, which makes all those
errors go away. I may apply this for now, and we can revisit having
lockdep use the tracepoint code. But since it's currently always
enabled, I'm thinking of just leaving this as is. The macros are still
clean from Joel's patch.

Thoughts?

-- Steve

diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index 50edb9cbbd26..a93476c6c954 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -19,6 +19,8 @@
 #ifdef CONFIG_PROVE_LOCKING
   extern void trace_softirqs_on(unsigned long ip);
   extern void trace_softirqs_off(unsigned long ip);
+  extern void lockdep_hardirqs_on(unsigned long ip);
+  extern void lockdep_hardirqs_off(unsigned long ip);
 #else
 # define trace_softirqs_on(ip)	do { } while (0)
 # define trace_softirqs_off(ip)	do { } while (0)
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index a8113357ceeb..30d0eb857f2e 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -267,7 +267,6 @@ struct held_lock {
  * Initialization, self-test and debugging-output methods:
  */
 extern void lockdep_init(void);
-extern void lockdep_init_early(void);
 extern void lockdep_reset(void);
 extern void lockdep_reset_lock(struct lockdep_map *lock);
 extern void lockdep_free_key_range(void *start, unsigned long size);
@@ -408,7 +407,6 @@ static inline void lockdep_on(void)
 # define lock_set_class(l, n, k, s, i)	do { } while (0)
 # define lock_set_subclass(l, s, i)	do { } while (0)
 # define lockdep_init()			do { } while (0)
-# define lockdep_init_early()		do { } while (0)
 # define lockdep_init_map(lock, name, key, sub) \
 		do { (void)(name); (void)(key); } while (0)
 # define lockdep_set_class(lock, key)	do { (void)(key); } while (0)
diff --git a/init/main.c b/init/main.c
index 44fe43be84c1..5d42e577643a 100644
--- a/init/main.c
+++ b/init/main.c
@@ -649,8 +649,6 @@ asmlinkage __visible void __init start_kernel(void)
 	call_function_init();
 	WARN(!irqs_disabled(), "Interrupts were enabled early\n");
-	lockdep_init_early();
-
 	early_boot_irqs_disabled = false;
 	local_irq_enable();
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 03bfaeb9f4e6..e406c5fdb41e 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2840,8 +2840,7 @@ static void __trace_hardirqs_on_caller(unsigned long ip)
 	debug_atomic_inc(hardirqs_on_events);
 }
-static void lockdep_hardirqs_on(void *none, unsigned long ignore,
-				unsigned long ip)
+void lockdep_hardirqs_on(unsigned long ip)
 {
 	if (unlikely(!debug_locks || current->lockdep_recursion))
 		return;
@@ -2885,8 +2884,7 @@ static void lockdep_hardirqs_on(void *none, unsigned long ignore,
 /*
  * Hardirqs were disabled:
  */
-static void lockdep_hardirqs_off(void *none, unsigned long ignore,
-				 unsigned long ip)
+void lockdep_hardirqs_off(unsigned long ip)
 {
 	struct task_struct *curr = current;
@@ -4315,14 +4313,6 @@ void lockdep_reset_lock(struct lockdep_map *lock)
 	raw_local_irq_restore(flags);
 }
-void __init lockdep_init_early(void)
-{
-#ifdef CONFIG_PROVE_LOCKING
-	register_trace_prio_irq_disable(lockdep_hardirqs_off, NULL, INT_MAX);
-	register_trace_prio_irq_enable(lockdep_hardirqs_on, NULL, INT_MIN);
-#endif
-}
-
 void __init lockdep_init(void)
 {
 	printk("Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar\n");
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index e76b78bf258e..8326adcab0dc 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -20,40 +20,52 @@ static DEFINE_PER_CPU(int, tracing_irq_cpu);
 void trace_hardirqs_on(void)
 {
 	if (!this_cpu_read(tracing_irq_cpu))
-		return;
+		goto out;
 	trace_irq_enable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
 	this_cpu_write(tracing_irq_cpu, 0);
+
+ out:
+	lockdep_hardirqs_on(CALLER_ADDR0);
 }
 EXPORT_SYMBOL(trace_hardirqs_on);
 void trace_hardirqs_off(void)
 {
 	if (this_cpu_read(tracing_irq_cpu))
-		return;
+		goto out;
 	this_cpu_write(tracing_irq_cpu, 1);
 	trace_irq_disable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
+
+ out:
+	lockdep_hardirqs_off(CALLER_ADDR0);
 }
 EXPORT_SYMBOL(trace_hardirqs_off);
 __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
 {
 	if (!this_cpu_read(tracing_irq_cpu))
-		return;
+		goto out;
 	trace_irq_enable_rcuidle(CALLER_ADDR0, caller_addr);
 	this_cpu_write(tracing_irq_cpu, 0);
+
+ out:
+	lockdep_hardirqs_on(CALLER_ADDR0);
 }